An r/ClaudeAI post launched PullMD: an MCP server that converts HTML to markdown so Claude Code does not burn tokens parsing raw HTML. The thread hit 275 upvotes. Five MCP-based HTML extractors ranked for 2026.
Scavio's /extract endpoint returns markdown directly via the hosted MCP server at mcp.scavio.dev/mcp. PullMD is a focused single-purpose alternative for self-hosted setups.
Full Ranking
Scavio MCP (extract endpoint)
Hosted MCP with extract built in
- Hosted, no infra
- Markdown output
- Multi-platform under one MCP
- Per-credit cost on heavy use
PullMD
Self-hosted Claude Code teams
- Free
- Single-purpose
- You run the server
Firecrawl MCP
Large-scale extraction
- High concurrency
- Pricey at small scale
Webcrawl-MCP (community)
Community-maintained extract
- Free
- Less polished
Browserbase Fetch + MCP
When the page needs a real browser
- Works on JS-only pages
- Browser-hour billing compounds
Side-by-Side Comparison
| Criteria | Scavio | Runner-up | 3rd Place |
|---|---|---|---|
| Hosted MCP | Yes | Self-hosted | Hosted (paid) |
| Markdown output | Yes | Yes | Yes |
| Cost per extract | 1 credit ($0.0043) | Free + infra | $0.0008-0.005 |
| Multi-surface (search + extract) | Yes | Extract only | Both |
| Best for | All-in-one MCP | OSS Claude users | High volume |
Why Scavio Wins
- PullMD solves exactly the right problem: feeding Claude Code raw HTML burns tokens. The fix is a tool that returns markdown. Scavio's /extract endpoint does the same thing and ships under the same MCP server that handles search, so a Claude Code skill attaches one MCP and gets both surfaces.
- Honest tradeoff: PullMD is free and OSS. For a solo developer who already has a server running, $0/mo beats $30/mo. The decision tree: if you'd pay $30/mo for hosted multi-platform anyway, the extract endpoint comes free. If extract is your only need, PullMD is right.
- Token math behind the post: a 60KB HTML page is ~30K tokens raw. The same page as markdown is ~3K tokens. On Claude Sonnet 4.6 at $3/MTok input, that is $0.09 vs $0.009 per page — a 10x cut.
- Hosted MCP is operational discipline. mcp.scavio.dev/mcp has uptime monitoring and a documented schema. Self-hosting PullMD means you own the uptime and the upgrade path.
- MCP routing pattern: a Claude Code agent attaches Scavio MCP for search + extract on indexed targets and Browserbase MCP only when the target requires a real browser. The agent picks per query, not per stack.