An r/ClaudeAI post launched PullMD, an MCP server for HTML-to-Markdown extraction. This tutorial walks through two paths: hosted (Scavio MCP) and self-hosted (FastMCP wrapping Scavio extract).
Prerequisites
- Python 3.10+ (self-hosted path only)
- Claude Code or any other MCP client
- A Scavio API key exported as SCAVIO_API_KEY
Walkthrough
Step 1: Path A: Use Scavio's hosted MCP
Zero infra.
claude mcp add --transport http scavio https://mcp.scavio.dev/mcp --header "x-api-key: $SCAVIO_API_KEY"
Step 2: Path B: Self-host with FastMCP
Install fastmcp.
pip install fastmcp requests
Step 3: Wrap Scavio extract
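A FastMCP server that exposes a single extract tool. Save it as server.py:
import os

import requests
from fastmcp import FastMCP

mcp = FastMCP('html-extractor')

@mcp.tool()
def extract(url: str) -> dict:
    """Fetch a URL through Scavio extract and return the page as Markdown."""
    response = requests.post(
        'https://api.scavio.dev/api/v1/extract',
        headers={'x-api-key': os.environ['SCAVIO_API_KEY']},
        json={'url': url, 'format': 'markdown'},
        timeout=30,  # avoid hanging the tool call on a stalled request
    )
    return response.json()

if __name__ == '__main__':
    mcp.run()  # defaults to the stdio transport
Step 4: Run locally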
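With no arguments, mcp.run() serves over stdio, which is what Claude Code expects from a local server. FastMCP can also serve over SSE if a network transport is needed.
python server.py
Step 5: Attach to Claude Code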
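Register the local script with Claude Code as a stdio MCP server. The server reads SCAVIO_API_KEY from its environment, so the variable must be set wherever Claude Code launches it.
claude mcp add html-extractor -- python /path/to/server.py
Python Example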
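One way the Step 3 tool body might be hardened is sketched below. It reuses the endpoint, header, and payload from Step 3. The helper name `extract_markdown`, the injectable `post` parameter, and the 30-second timeout are illustrative assumptions, not part of the Scavio or FastMCP APIs. The response is returned as parsed JSON without assuming anything about its schema.

```python
import os

EXTRACT_URL = "https://api.scavio.dev/api/v1/extract"  # endpoint from Step 3


def extract_markdown(url, api_key=None, post=None, timeout=30):
    """Request a Markdown extraction for `url` and return the parsed JSON body.

    Hypothetical helper: `post` is injectable so the function can be
    exercised without network access; it defaults to requests.post.
    """
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"expected an http(s) URL, got {url!r}")
    if api_key is None:
        api_key = os.environ["SCAVIO_API_KEY"]  # raises KeyError if unset
    if post is None:
        import requests  # imported lazily so a stub can stand in for it
        post = requests.post
    response = post(
        EXTRACT_URL,
        headers={"x-api-key": api_key},
        json={"url": url, "format": "markdown"},
        timeout=timeout,
    )
    response.raise_for_status()  # surface 4xx/5xx instead of returning an error body
    return response.json()
```

Inside server.py, the `extract` tool could then simply `return extract_markdown(url)`.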
# Path A is hosted and the simplest option: $0.0043/extract.
# Path B is self-hosted: $0/extract for the MCP layer itself, but each call still goes through Scavio extract underneath.
JavaScript Example
// Same in TS using @modelcontextprotocol/sdk.
Expected Output
The Claude Code agent gets a clean extract tool that returns Markdown for any URL. Token usage drops by about 10x compared with passing raw HTML.
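To get a feel for why extracted text is cheaper than raw HTML, the rough sketch below strips markup from a small page and compares character-based token estimates. Several things here are assumptions: the 4-characters-per-token ratio is a rule of thumb rather than a real tokenizer, the sample page is made up, and the standard-library tag stripper is only a stand-in for the actual extraction. The ratio it prints will vary from page to page and is not a measurement of Scavio's output.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def strip_html(html):
    """Return the visible text of `html`, one text node per line."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


def estimate_tokens(text):
    # Assumption: roughly 4 characters per token, not a real tokenizer.
    return max(1, len(text) // 4)


# Made-up sample page with typical non-content markup.
SAMPLE_HTML = """<html><head><style>body{margin:0;font-family:sans-serif}</style>
<script>window.analytics = {id: "abc", events: []};</script></head>
<body><div class="nav"><a href="/">Home</a></div>
<div class="content"><h1>Title</h1><p>Hello world.</p></div></body></html>"""

text = strip_html(SAMPLE_HTML)
print(estimate_tokens(SAMPLE_HTML), "tokens (raw HTML, estimated)")
print(estimate_tokens(text), "tokens (text only, estimated)")
```

Real pages usually carry far more scripts, styles, and attributes than this sample, which is where most of the savings come from.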