MCP HTML Extractor Stops the Token Bloat in 2026

PullMD on r/ClaudeAI and Scavio's hosted /extract solve the same root pain: feeding raw HTML to LLMs costs ~10x more than markdown.

An r/ClaudeAI post launched PullMD this week: an MCP server that converts HTML to markdown so Claude Code does not burn tokens parsing raw HTML. The thread hit 275 upvotes and 27 comments because the pain is widespread. A 60KB HTML page averages roughly 30K input tokens. The same page as markdown averages roughly 3K. That is a 10x cut at the input layer.

The math behind the 10x

HTML is mostly boilerplate: script tags, inline CSS, navigation markup, footers, ad placements, and tracking pixels. The tokenizer counts every one of those bytes, not just the visible text. A typical news article page carries 5-10 bytes of boilerplate for every byte of actual content. Strip the page down to semantic markdown and the token count tracks the words, not the wrapper.
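
To make the ratio concrete, here is a minimal sketch using only the Python standard library. It is not how PullMD or Scavio extract content; the `SKIP_TAGS` set, the `VisibleText` class, and the sample page are all assumptions made for illustration. It also emits plain text rather than markdown. It only shows how much of a page's byte count sits in elements a reader never sees.

```python
from html.parser import HTMLParser

# Elements treated as boilerplate in this sketch (an assumed list,
# not the rule set of any real extractor).
SKIP_TAGS = {"script", "style", "nav", "footer", "noscript"}


class VisibleText(HTMLParser):
    """Collects text that sits outside the SKIP_TAGS elements."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # >0 while inside a boilerplate element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside boilerplate elements.
        if self.skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(html: str) -> str:
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


# Hypothetical miniature page: mostly wrapper, two lines of content.
SAMPLE = """<html><head><style>body{margin:0;font-family:sans-serif}</style>
<script>window.dataLayer=window.dataLayer||[];function track(){dataLayer.push(arguments)}</script></head>
<body><nav><a href="/">Home</a><a href="/news">News</a></nav>
<article><h1>Headline</h1><p>The actual story text.</p></article>
<footer>Copyright and links</footer></body></html>"""

text = visible_text(SAMPLE)
print(text)
print(len(SAMPLE.encode()), "bytes of HTML ->", len(text.encode()), "bytes of text")
```

A real extractor also has to preserve headings, links, and lists as markdown and cope with malformed markup, which is why a dedicated tool is worth using over a sketch like this.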

Two ways to fix it

PullMD is one. It is OSS, single-purpose, and runs as your own MCP server. If you already host services, it is the cheapest option: $0/mo plus your own infra.

Scavio's /extract endpoint is the other. It ships under the same hosted MCP server at mcp.scavio.dev/mcp that already provides search, Reddit search, YouTube search, Amazon search, and Walmart search. One config command, six tools, no infrastructure.

Bash
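# Register the hosted server as a remote HTTP MCP server (expects SCAVIO_API_KEY in the environment)
claude mcp add --transport http scavio https://mcp.scavio.dev/mcp \
  --header "x-api-key: $SCAVIO_API_KEY"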

Per-extract cost

Scavio's extract costs 1 credit per call: $0.0043 on the $30/mo tier, or free within the 500-credit/mo free tier. PullMD is $0 plus whatever the server costs. For most teams the choice is about operational preference more than budget.

What changes downstream

The agent's LLM cost drops in proportion to the input-token cut. At $3/MTok input, a Claude Sonnet 4.6 call fed 30K tokens of raw HTML pays $0.09 per page. The same call on 3K tokens of extracted markdown pays $0.009. For an agent processing 100 pages a day, that is the difference between $9 and $0.90 in daily input-token spend.
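
The arithmetic is easy to rerun with your own numbers. The sketch below uses the figures from this post (30K vs 3K tokens per page, $3/MTok input, 100 pages a day, $0.0043 per paid extract call). The function and constant names are made up for illustration, and the per-page token counts are the rough averages quoted above, not measurements.

```python
def input_cost_usd(tokens: int, usd_per_mtok: float) -> float:
    """Input-token cost in USD for a given token count and per-million-token price."""
    return tokens * usd_per_mtok / 1_000_000


USD_PER_MTOK = 3.0        # input price quoted in this post
HTML_TOKENS = 30_000      # rough average for a 60KB raw HTML page
MARKDOWN_TOKENS = 3_000   # rough average for the same page as markdown
PAGES_PER_DAY = 100
EXTRACT_USD = 0.0043      # one paid extract credit; $0 if self-hosted or on the free tier

html_page = input_cost_usd(HTML_TOKENS, USD_PER_MTOK)
md_page = input_cost_usd(MARKDOWN_TOKENS, USD_PER_MTOK)
html_day = input_cost_usd(HTML_TOKENS * PAGES_PER_DAY, USD_PER_MTOK)
md_day = input_cost_usd(MARKDOWN_TOKENS * PAGES_PER_DAY, USD_PER_MTOK)

print(f"per page: ${html_page:.3f} raw HTML vs ${md_page:.3f} markdown")
print(f"per day:  ${html_day:.2f} raw HTML vs ${md_day:.2f} markdown")
print(f"markdown plus one paid extract call: ${md_page + EXTRACT_USD:.4f} per page")
```

Even with a paid extract call added, the markdown path stays well under the raw-HTML cost per page under these assumptions.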

What does not change

Extraction works on indexed and public targets. Auth-gated dashboards, JS-only SPAs, and pages that require interaction still need a real browser via Browserbase or Stagehand. The right pattern: hosted extract MCP for the 80-90% of targets that do not need a browser, real browser MCP for the rest.
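
That split can be written as a small routing rule. The sketch below is one hypothetical way to encode it. The `Target` fields and the `"extract"` / `"browser"` labels are invented for illustration and do not correspond to any real MCP tool names. How you detect a login wall or a JS-only page is left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """A page the agent wants to read (all fields are illustrative assumptions)."""
    url: str
    requires_login: bool = False
    js_only: bool = False
    needs_interaction: bool = False


def pick_tool(target: Target) -> str:
    """Route to a real browser only when plain extraction cannot work."""
    if target.requires_login or target.js_only or target.needs_interaction:
        return "browser"   # e.g. a browser MCP such as Browserbase or Stagehand
    return "extract"       # e.g. PullMD or a hosted extract tool


targets = [
    Target("https://example.com/news/article"),
    Target("https://example.com/dashboard", requires_login=True),
    Target("https://example.com/app", js_only=True),
]
for t in targets:
    print(t.url, "->", pick_tool(t))
```

Defaulting to extraction and escalating only on known blockers keeps the expensive browser path off the common case.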

The bigger pattern

Token cost is the second-most-discussed pain in agent-builder threads in 2026, behind only tool-routing. Every cut at the input layer compounds because providers bill input tokens on every call and agents loop. Patterns that drop input tokens 10x are not micro-optimizations; they are the difference between an agent that ships and an agent that gets shut down for cost overruns.

Decision tree

  • Already running infra and want $0 in extraction cost: use PullMD.
  • Want hosted MCP plus search + Reddit + YouTube + Amazon + Walmart from one server: use Scavio MCP.
  • Need extraction on auth-gated or JS-only pages: use Browserbase Fetch or Stagehand, not either of these.

The PullMD post was not really about PullMD. It was about the pattern: stop feeding raw HTML to LLMs. Every agent built since the post launched will need to make that decision. Most will pick whichever option is closest to their existing stack.