Tutorial

How to Stop Burning Claude Code Tokens on HTML Parsing

An r/ClaudeAI post launched PullMD to fix HTML token bloat in Claude Code. The same pattern works with Scavio's hosted /extract endpoint and a single MCP attach, with no infrastructure to run. This tutorial walks through the swap.

Prerequisites

  • Claude Code installed
  • Scavio API key

Walkthrough

Step 1: Identify token-burning HTML in your current agent

Look for tool calls that fetch raw HTML and pass it straight into the LLM context.

Python
# Before:
# fetch(url) -> raw HTML -> LLM context
# 60KB HTML = ~30K tokens
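One way to spot these calls is to measure how much of a tool result is markup rather than content. The sketch below is a rough heuristic and not part of PullMD or Scavio: the function names and the 0.3 threshold are assumptions chosen for illustration, and the regex only approximates HTML tags.

```python
import re

# Matches anything that looks like an HTML tag, e.g. <div class="x"> or </p>.
TAG_RE = re.compile(r"<[^>]+>")


def html_overhead(text: str) -> float:
    """Return the fraction of characters spent on tag markup (0.0 to 1.0)."""
    if not text:
        return 0.0
    tag_chars = sum(len(match.group(0)) for match in TAG_RE.finditer(text))
    return tag_chars / len(text)


def looks_like_raw_html(text: str, threshold: float = 0.3) -> bool:
    """Flag a tool result as raw HTML when tags make up a large share of it.

    The 0.3 default is an arbitrary starting point; tune it on your own logs.
    """
    return html_overhead(text) >= threshold
```

Running `looks_like_raw_html` over logged tool results should surface the calls worth replacing in Step 3. Inline scripts and styles sit between tags rather than inside them, so pages heavy on those may need a lower threshold.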

Step 2: Attach Scavio MCP

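One command in Claude Code. The --transport http flag tells Claude Code that the server is a remote HTTP endpoint rather than a local command.

Bash
claude mcp add --transport http scavio https://mcp.scavio.dev/mcp --header "x-api-key: $SCAVIO_API_KEY"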

Step 3: Replace fetch tool with extract MCP tool

The agent now calls the extract tool with the URL instead of fetching raw HTML.

Python
# Agent prompt now uses extract tool:
# 'Use extract to read the markdown of $URL'
# Returns ~3K tokens of clean markdown.

Step 4: Run before/after token count

Compare token usage on the same task.

Text
# Before: 30K input tokens / call
# After:  3K input tokens / call
# 10x reduction at the input layer.
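The comparison can be sketched in a few lines. This is a minimal estimate, not an exact tokenizer: the characters-per-token ratio is an assumption (the 60KB to ~30K figure in Step 1 implies about 2 characters per token for HTML), and the function names are hypothetical. For real numbers, read the input token counts your model provider reports for each call.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character count.

    chars_per_token is an assumed ratio; markup-heavy HTML tends to
    tokenize less efficiently than prose, so pass a lower value for it.
    """
    if chars_per_token <= 0:
        raise ValueError("chars_per_token must be positive")
    return round(len(text) / chars_per_token)


def reduction_factor(before_tokens: int, after_tokens: int) -> float:
    """How many times smaller the 'after' input is than the 'before' input."""
    if after_tokens <= 0:
        raise ValueError("after_tokens must be positive")
    return before_tokens / after_tokens


# Figures quoted in this tutorial: ~30K tokens before, ~3K after.
factor = reduction_factor(30_000, 3_000)
```

With the tutorial's figures, `factor` comes out to 10.0, matching the 10x claim above.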

Step 5: Decide on per-call cost

Scavio extract costs 1 credit per call. Self-hosted PullMD has no per-call fee, but you run and maintain the infrastructure yourself.

Text
# Scavio: $0.0043/call hosted
# PullMD: $0/call + server you maintain
# Pick based on infra preference.
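If cost matters more than infra preference, a break-even estimate helps. The sketch below uses the $0.0043 per-call figure quoted above; the server cost is a hypothetical input you supply, and the calculation ignores the free tier and the time spent maintaining a self-hosted server.

```python
import math

SCAVIO_COST_PER_CALL = 0.0043  # USD per call, as quoted in this tutorial


def hosted_monthly_cost(calls: int, per_call: float = SCAVIO_COST_PER_CALL) -> float:
    """Monthly cost of the hosted endpoint for a given call volume."""
    return calls * per_call


def break_even_calls(server_monthly_cost: float,
                     per_call: float = SCAVIO_COST_PER_CALL) -> int:
    """Calls per month at which hosted spend reaches the server's monthly cost."""
    if per_call <= 0:
        raise ValueError("per_call must be positive")
    return math.ceil(server_monthly_cost / per_call)


# Hypothetical example: a 5 USD/month server for self-hosting.
calls_needed = break_even_calls(5.0)
```

Below that call volume the hosted endpoint is cheaper on paper; above it, self-hosting starts to pay for the server, before counting maintenance effort.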

Python Example

Python
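# Direct API alternative if you are not using MCP:
import os
import requests
url = 'https://example.com/article'  # placeholder: the page to convert
resp = requests.post('https://api.scavio.dev/api/v1/extract',
    headers={'x-api-key': os.environ['SCAVIO_API_KEY']},
    json={'url': url, 'format': 'markdown'}, timeout=30)
resp.raise_for_status()  # surface HTTP errors instead of parsing an error body
markdown = resp.json().get('markdown', '')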

JavaScript Example

JavaScript
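// Same request shape in JavaScript (Node 18+ with built-in fetch, run as an ES module).
const url = 'https://example.com/article'; // placeholder: the page to convert
const res = await fetch('https://api.scavio.dev/api/v1/extract', {
  method: 'POST',
  headers: { 'x-api-key': process.env.SCAVIO_API_KEY, 'Content-Type': 'application/json' },
  body: JSON.stringify({ url, format: 'markdown' })
});
if (!res.ok) throw new Error(`Extract failed: HTTP ${res.status}`);
const { markdown = '' } = await res.json();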

Expected Output

Text
Claude Code agent's HTML-related tool calls drop from ~30K input tokens to ~3K. Per-task LLM cost drops accordingly.

Frequently Asked Questions

How long does this tutorial take?

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (the free tier works) and, for the direct API examples, a working Python or JavaScript environment.

What do I need before starting?

Claude Code installed and a Scavio API key. A Scavio API key gives you 500 free credits per month.

Can I follow along on the free tier?

Yes. The free tier includes 500 credits per month, which is more than enough to complete this tutorial and prototype a working solution.

Does Scavio work with my framework?

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the MCP server and shows the raw REST API as an alternative, but you can adapt it to your framework of choice.
