mcppullmdcomparison

Scavio vs PullMD: Hosted vs Self-Hosted MCP

Both solve the same HTML-token problem. The decision is operational preference: hosted multi-surface or self-hosted single-purpose.

4 min read

An r/ClaudeAI post launched PullMD: a self-hosted MCP server that converts HTML to markdown. Scavio's hosted MCP atmcp.scavio.dev/mcp ships an extract tool that does the same job plus five other surfaces. The decision between them is operational preference more than capability.

The two paths

Self-hosted (PullMD). $0/mo. You run the server. The tool is single-purpose: HTML in, markdown out. OSS code, community support.

Hosted (Scavio MCP). $30/mo for 7,000 credits across all six tools (search, reddit_search, youtube_search, amazon_search, walmart_search, extract). Free 500 credits/mo. Scavio runs the server.

When self-hosted wins

  • You already operate servers and want $0 in MCP cost.
  • You only need extract (no search, no Reddit, no YouTube).
  • You prefer OSS code over vendor dependency.
  • You want full control over the extraction logic and can modify the source.

When hosted wins

  • You want extract plus other surfaces under one MCP attach.
  • You prefer not to operate servers.
  • You need uptime SLA and documented schema rather than community-maintained code.
  • Per-credit cost is fine for your scale.

The setup

Bash
# Hosted Scavio MCP
claude mcp add scavio https://mcp.scavio.dev/mcp \
  --header "x-api-key: $SCAVIO_API_KEY"

# Self-hosted PullMD
git clone https://github.com/.../pullmd
cd pullmd && npm install
claude mcp add pullmd node /path/to/pullmd/server.js

What the agent sees

With Scavio MCP attached: six named tools (search, reddit, youtube, amazon, walmart, extract). With PullMD attached: one named tool (typically pullmd_extract). The agent uses each based on the routing prompt.

Both at once

Some teams attach both. PullMD as the primary extractor (free, fast on hot infra) and Scavio MCP for search and multi-surface tools. The routing prompt picks per query: extract goes to PullMD, everything else goes to Scavio. Cost: ~$30/mo of Scavio for the search side, $0 for extract.

The token-cost framing

Both solve the same root problem: feeding raw HTML to LLMs costs ~10x more in tokens than markdown. The PullMD post crystallized this in the r/ClaudeAI community thread (275 upvotes, 27 comments). Either solution gets the savings; the comparison is about which delivery model fits your team.

Honest tradeoff

Self-hosted means you own uptime, scaling, and upgrade decisions. Hosted means you delegate those to the vendor but pay per use. For teams of 1-2, the operational cost of self-hosting (when it fails Friday at 11pm) usually outweighs the $30/mo. For teams of 10+ with a platform engineer, self-hosting is rounding error.

What changes if PullMD adds search

The decision matrix shifts. If PullMD ships a SERP tool in a future release, the "multi-surface from one server" argument for Scavio weakens. Watch the OSS roadmap. For now, Scavio is the only hosted MCP that ships six surfaces under one attach.

What ships in a day

Either path. Hosted Scavio is one config line. Self-hosted PullMD is a clone, an npm install, and a Claude Code mcp add. Both produce the same agent-side experience: a tool that takes a URL and returns clean markdown.