Extract Endpoint: The Missing Piece of Search APIs

By 2026, extract endpoints are table stakes. Vendor-stitching SERP plus a separate extract service is operational debt.

April 29, 2026

Most search APIs in 2024 returned snippets only. Agents that needed full-page content fetched the URL themselves and converted the HTML to markdown. By 2026, an extract endpoint had become table stakes, and that missing piece of search APIs is now filled in.

What an extract endpoint is

An extract endpoint takes a URL and returns the page's content as clean markdown (or structured JSON). No HTML boilerplate. No script tags. No navigation markup. The agent feeds the markdown directly to its LLM.

Python

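import os
import requests

def extract(url):
    """Fetch a page as markdown via the extract endpoint."""
    resp = requests.post(
        'https://api.scavio.dev/api/v1/extract',
        headers={'x-api-key': os.environ['SCAVIO_API_KEY']},
        json={'url': url, 'format': 'markdown'},
        timeout=30,  # do not let a slow page hang the agent
    )
    resp.raise_for_status()  # surface HTTP errors instead of parsing an error body
    return resp.json()

# Returns: { 'url', 'markdown', 'title' }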

The token math

Raw HTML for a 60KB page averages ~30K tokens. The same page as markdown averages ~3K. At Claude Sonnet 4.6 input rates, that is $0.09 vs $0.009 per page. For an agent processing 100 pages a day, the difference is $9/day vs $0.90/day in input-token spend.
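
The arithmetic above can be reproduced in a few lines. This is a minimal sketch: the token counts are the article's averages, and the rate of $3 per million input tokens is not a quoted price but the one implied by "$0.09 for ~30K tokens". The function name `input_cost` is made up for illustration.

```python
# Rate implied by the figures above ($0.09 / 30K tokens); treat it as an assumption.
USD_PER_MILLION_INPUT_TOKENS = 3.00

def input_cost(tokens_per_page, pages=1, rate=USD_PER_MILLION_INPUT_TOKENS):
    """Input-token cost in USD for `pages` pages of `tokens_per_page` tokens each."""
    return tokens_per_page * pages * rate / 1_000_000

html_per_page = input_cost(30_000)             # raw HTML
md_per_page = input_cost(3_000)                # same page as markdown
html_per_day = input_cost(30_000, pages=100)
md_per_day = input_cost(3_000, pages=100)

print(f"per page: ${html_per_page:.3f} vs ${md_per_page:.3f}")
print(f"per day:  ${html_per_day:.2f} vs ${md_per_day:.2f}")
```

Swap in your own model's input rate and page volume; the 10x ratio holds as long as markdown stays at roughly a tenth of the HTML token count.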

What different vendors offer

Scavio /extract: $0.0043/credit, hosted MCP server includes extract as one of six tools.
Tavily Extract: $0.008/credit, pre-summarizes rather than returning full markdown.
Firecrawl Scrape: per-credit, optimized for full-site crawls more than per-URL extracts.
Jina Reader: free + paid, single-purpose markdown via URL prefix.
PullMD: OSS, self-hosted MCP server for the Claude Code use case specifically.

Common usage patterns

RAG retrieval. Search returns 5 candidates. Extract pulls full markdown for the 1-2 most promising. The LLM produces a grounded answer with citations.

Article-to-social. Extract pulls the article markdown. The LLM transforms it into a 280-character tweet or a LinkedIn post.

Compliance monitoring. Search returns regulatory news URLs. Extract pulls the full text. The LLM tags the risk level.
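
The RAG retrieval pattern is the one most builders start with, so here is a rough sketch of it. The shape of the search results (dicts with `url` and `score` keys) and the name `gather_context` are assumptions for illustration, not part of any vendor's API. The extract call is passed in as a function so the sketch works with the `extract()` helper shown earlier or with any other provider.

```python
def gather_context(candidates, extract_fn, top_n=2):
    """Extract full markdown for the top_n highest-scored search candidates.

    candidates: list of dicts with 'url' and 'score' keys (hypothetical shape).
    extract_fn: callable taking a URL and returning a dict with
                'url', 'title' and 'markdown' keys.
    """
    ranked = sorted(candidates, key=lambda c: c['score'], reverse=True)
    pages = [extract_fn(c['url']) for c in ranked[:top_n]]
    # Number each source so the LLM can cite it as [1], [2], ...
    blocks = [
        f"[{i}] {p['title']} ({p['url']})\n{p['markdown']}"
        for i, p in enumerate(pages, start=1)
    ]
    return "\n\n".join(blocks)
```

The returned string goes into the prompt ahead of the user's question. Only `top_n` pages are extracted, so the credit spend per query is bounded no matter how many candidates search returns.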

What extract does not do

Auth-gated pages. JS-only SPAs. Pages that require user interaction to render content. For those, real browser tooling (Browserbase, Stagehand, Playwright) is the answer. The honest decision tree: if the page renders for an unauthenticated curl request, extract works; if it requires JavaScript or login, it does not.
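
That decision tree can be approximated in code. The sketch below is a heuristic, not a guarantee: it takes the status code and HTML you got from a plain unauthenticated fetch, counts visible text outside script and style tags, and routes thin pages to a browser. The `pick_tool` name and the 200-character threshold are arbitrary assumptions for illustration.

```python
from html.parser import HTMLParser

class _TextCounter(HTMLParser):
    """Counts characters of visible text, ignoring <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.chars = 0

    def handle_starttag(self, tag, attrs):
        if tag in ('script', 'style'):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ('script', 'style') and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.chars += len(data.strip())

def pick_tool(status_code, html, min_chars=200):
    """Return 'extract' or 'browser' for a page fetched without JS or login."""
    if status_code in (401, 403):
        return 'browser'  # auth-gated: extract will not get past the login
    counter = _TextCounter()
    counter.feed(html)
    counter.close()
    # A near-empty body usually means the content is rendered client-side.
    return 'extract' if counter.chars >= min_chars else 'browser'
```

Login walls that answer with a 200 and a sign-in form will slip through a check like this, so treat the result as a first guess and fall back to browser tooling when the extracted markdown comes back empty.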

Caching extracts

Extract responses are stable for a given URL over hours-to-days timescales. With a 24-hour TTL on the cache layer, a URL requested repeatedly within a day is billed once. RAG pipelines that re-process the same documentation pages benefit the most.
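
A minimal in-memory version of that cache layer might look like the following. The class name `ExtractCache` is made up for this sketch, and the extract call is injected so it works with any provider; a production setup would more likely sit on Redis or a similar shared store.

```python
import time

class ExtractCache:
    """In-memory cache for extract responses with a fixed time-to-live."""

    def __init__(self, extract_fn, ttl_seconds=24 * 60 * 60, clock=time.monotonic):
        self.extract_fn = extract_fn
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so tests can control time
        self._store = {}            # url -> (expires_at, response)

    def get(self, url):
        now = self.clock()
        hit = self._store.get(url)
        if hit is not None and hit[0] > now:
            return hit[1]           # fresh entry: no API call, no credit spent
        response = self.extract_fn(url)
        self._store[url] = (now + self.ttl, response)
        return response
```

Wrap the extract helper once, for example `cache = ExtractCache(extract)`, and call `cache.get(url)` everywhere the agent previously called the endpoint directly.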

The MCP packaging

Hosted MCP servers ship extract as one of several tools. Scavio's MCP at mcp.scavio.dev/mcp includes search, reddit_search, youtube_search, amazon_search, walmart_search, and extract. One config command attaches all six. PullMD ships extract alone for teams that want only that.

The honest tradeoff

Extract on hosted MCP at $0.0043/call costs slightly more than self-hosted PullMD at $0/call plus your infra. The decision is operational preference. For teams that already run servers, self-hosted wins on cost. For teams that prefer hosted services, the marginal cost is small and the operational simplicity matters.
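
To put a number on "slightly more", a break-even estimate helps. The sketch below uses the $0.0043 per-call price from this post; the monthly infra figure is whatever your own servers cost, and the function name `breakeven_calls` is hypothetical. It ignores engineering time, which is often the larger cost.

```python
import math

HOSTED_USD_PER_CALL = 0.0043  # hosted extract price quoted in this post

def breakeven_calls(infra_usd_per_month, per_call=HOSTED_USD_PER_CALL):
    """Smallest monthly call count at which hosted spend reaches the infra bill."""
    return math.ceil(infra_usd_per_month / per_call)

# Example with an assumed $20/month server:
print(breakeven_calls(20))
```

Below that call volume the hosted endpoint is the cheaper option even on raw dollars; above it, self-hosting starts to pay for itself if the server is already being run for other reasons.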

What this means for new builders

Pick a search API that has extract built in. Vendor-stitching a SERP API plus a separate extract service was the 2024 pattern; it is operational debt by 2026. One vendor for both keeps the agent code clean and the credentials simple.