An r/ClaudeAI post introduced PullMD as a fix for HTML token bloat in Claude Code. The same fix is available through Scavio's /extract endpoint with no infrastructure to run. This tutorial walks through the swap.
Prerequisites
- Claude Code installed
- Scavio API key
Walkthrough
Step 1: Identify token-burning HTML in your current agent
Look for tool calls that fetch raw HTML and pass it straight into the LLM context.
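To rank the worst offenders, it can help to put a rough number on each fetched page. The sketch below is only an estimate, not a tokenizer: it assumes about 2 bytes per token, the ratio implied by this tutorial's "60KB HTML = ~30K tokens" figure. The names estimate_tokens and BYTES_PER_TOKEN are hypothetical, and real counts depend on the model's tokenizer and the page's markup.

```python
# Rough token estimate for a fetched HTML payload.
# ASSUMPTION: ~2 bytes per token, taken from this tutorial's
# "60KB HTML = ~30K tokens" figure. Not a real tokenizer.
BYTES_PER_TOKEN = 2.0


def estimate_tokens(html: str, bytes_per_token: float = BYTES_PER_TOKEN) -> int:
    """Return an approximate token count from the UTF-8 byte length."""
    return round(len(html.encode("utf-8")) / bytes_per_token)


if __name__ == "__main__":
    # Stand-in for a page your agent fetched; replace with real HTML.
    sample = "<div class='nav'><a href='/'>Home</a></div>" * 1000
    print(estimate_tokens(sample))
```

Running this over the HTML your agent actually fetched gives a quick ordering of which tool calls are worth replacing first.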
# Before:
# fetch(url) -> raw HTML -> LLM context
# 60KB HTML = ~30K tokens
Step 2: Attach Scavio MCP
One command registers the server in Claude Code. Remote HTTP servers need the --transport http flag.
claude mcp add --transport http scavio https://mcp.scavio.dev/mcp --header "x-api-key: $SCAVIO_API_KEY"
Step 3: Replace fetch tool with extract MCP tool
The agent now calls extract(url) instead of fetch.
# Agent prompt now uses extract tool:
# 'Use extract to read the markdown of $URL'
# Returns ~3K tokens of clean markdown.
Step 4: Run before/after token count
Compare token usage on the same task.
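A small helper makes the comparison explicit once you have both counts. This is a minimal sketch: reduction_factor and tokens_saved are hypothetical names, and the inputs are assumed to come from whatever usage reporting your setup provides. The values in the example are this tutorial's own figures, not measurements.

```python
def reduction_factor(before_tokens: int, after_tokens: int) -> float:
    """How many times smaller the input is after the swap."""
    if after_tokens <= 0:
        raise ValueError("after_tokens must be positive")
    return before_tokens / after_tokens


def tokens_saved(before_tokens: int, after_tokens: int) -> int:
    """Absolute number of input tokens saved per call."""
    return before_tokens - after_tokens


if __name__ == "__main__":
    # Figures quoted in this tutorial; substitute your own measurements.
    before, after = 30_000, 3_000
    print(f"{reduction_factor(before, after):.1f}x smaller, "
          f"{tokens_saved(before, after)} tokens saved per call")
```

Measure both runs on the same URL and the same prompt, otherwise the ratio reflects the task rather than the tool.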
# Before: 30K input tokens / call
# After: 3K input tokens / call
# 10x reduction at the input layer.
Step 5: Decide on per-call cost
Scavio extract costs 1 credit per call. Self-hosted PullMD has no per-call fee, but you run and maintain the server yourself.
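The trade-off can be turned into a break-even estimate. The sketch below uses the $0.0043 per-call price quoted in this step. The server cost passed to break_even_calls is a hypothetical input you supply, and the function names are illustrative. It ignores maintenance time, which is often the larger cost of self-hosting.

```python
import math

# Per-call price quoted in this tutorial (USD).
SCAVIO_COST_PER_CALL = 0.0043


def scavio_monthly_cost(calls_per_month: int) -> float:
    """Hosted cost in USD for a given monthly call volume."""
    return round(calls_per_month * SCAVIO_COST_PER_CALL, 2)


def break_even_calls(server_cost_per_month: float) -> int:
    """Monthly calls at which hosted spend reaches the self-hosted server cost.

    ASSUMPTION: server_cost_per_month is your own estimate; it is not
    a figure from the tutorial.
    """
    return math.ceil(server_cost_per_month / SCAVIO_COST_PER_CALL)


if __name__ == "__main__":
    print(scavio_monthly_cost(10_000))  # hosted spend at 10,000 calls/month
    print(break_even_calls(20.0))       # calls/month to match a $20 server
```

Below the break-even volume the hosted endpoint is cheaper on paper. Above it, self-hosting wins only if you are willing to maintain the server.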
# Scavio: $0.0043/call hosted
# PullMD: $0/call + server you maintain
# Pick based on infra preference.
Python Example
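# Direct API alternative if not using MCP:
import os
import requests

url = 'https://example.com/article'  # placeholder: the page to extract

resp = requests.post(
    'https://api.scavio.dev/api/v1/extract',
    headers={'x-api-key': os.environ['SCAVIO_API_KEY']},
    json={'url': url, 'format': 'markdown'},
    timeout=30,  # avoid hanging on a slow page
)
resp.raise_for_status()  # surface HTTP errors instead of parsing an error body
markdown = resp.json().get('markdown', '')
JavaScript Example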
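// Same shape in JavaScript (run as an ES module for top-level await).
const url = 'https://example.com/article'; // placeholder: the page to extract
const resp = await fetch('https://api.scavio.dev/api/v1/extract', {
  method: 'POST',
  headers: { 'x-api-key': process.env.SCAVIO_API_KEY, 'Content-Type': 'application/json' },
  body: JSON.stringify({ url, format: 'markdown' })
});
if (!resp.ok) throw new Error(`extract failed: HTTP ${resp.status}`);
const data = await resp.json();
const markdown = data.markdown ?? ''; // same field as the Python example
Expected Output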
The Claude Code agent's HTML-related tool calls drop from ~30K input tokens to ~3K each, and the per-task LLM input cost drops accordingly.