Reduce Agent Search Tokens with Structured JSON

The Problem

AI agents injecting raw HTML from web searches consume 20,000-25,000 tokens per query. At LLM pricing ($3/M input tokens for Claude Sonnet), a search-heavy agent costs $7-8/day just in input tokens. The HTML includes navigation, ads, scripts, and footers that provide zero value to the agent.
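As a rough illustration of that overhead, the common ~4-characters-per-token heuristic makes the gap easy to see. The payload sizes below are made-up stand-ins for a typical HTML results page versus the same results as JSON, not measurements:

```python
# Rough token estimate: ~4 characters per token (common heuristic for English text).
def estimate_tokens(text: str) -> int:
    return len(text) // 4

# Hypothetical payloads: a full HTML results page vs. the same results as JSON.
raw_html = '<html>' + '<div class="ad">...</div>' * 3000 + '</html>'        # ~75 KB page
structured_json = '[{"title": "...", "snippet": "...", "url": "..."}]' * 20  # ~1 KB

print(estimate_tokens(raw_html))         # tens of thousands of tokens
print(estimate_tokens(structured_json))  # a few hundred tokens
```

A real tokenizer will give different exact counts, but the order-of-magnitude gap between HTML boilerplate and structured fields is the point.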

The Scavio Solution

Replace raw HTML scraping with a structured search API that returns only titles, snippets, and URLs as JSON. The same search results tokenize to 800-1,200 tokens instead of 20,000-25,000, cutting LLM costs by 90%+.

Before

Before optimization, a research agent made 100 daily searches, each injecting 25K tokens of raw HTML. Total: 2.5M input tokens/day = $7.50/day in LLM costs ($225/month), plus scraping infrastructure costs.

After

After switching to structured JSON, the same 100 searches inject 1,200 tokens each. Total: 120K tokens/day = $0.36/day + $0.50 API cost = $0.86/day ($26/month). Monthly savings: $199. The agent also processes results faster because there is less noise in the context.
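The arithmetic above can be checked in a few lines. The $3/M input-token rate and the $0.50/day API cost are the figures used on this page:

```python
RATE_PER_M_TOKENS = 3.00   # $ per 1M input tokens (Claude Sonnet rate cited above)
SEARCHES_PER_DAY = 100

def daily_cost(tokens_per_search: int, api_cost_per_day: float = 0.0) -> float:
    """Daily spend: LLM input tokens plus any fixed API fee."""
    tokens = SEARCHES_PER_DAY * tokens_per_search
    return tokens / 1_000_000 * RATE_PER_M_TOKENS + api_cost_per_day

before = daily_cost(25_000)                        # raw HTML: 2.5M tokens/day
after = daily_cost(1_200, api_cost_per_day=0.50)   # structured JSON + API fee
print(f'before ${before:.2f}/day, after ${after:.2f}/day')  # before $7.50/day, after $0.86/day
print(f'monthly savings ~${(before - after) * 30:.0f}')     # ~$199
```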

Who It Is For

AI agent developers, LLM application builders, and teams optimizing agent operating costs by reducing token consumption on web search.

Key Benefits

  • Reduce from 25K to 1,200 tokens per search (95% reduction)
  • Monthly LLM cost drops from $225 to $26 for 100 daily searches
  • Structured JSON eliminates HTML parsing and cleaning logic
  • Consistent schema means agents can reliably extract fields
  • Multi-platform search without separate scraping infrastructure
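Because the schema is consistent, turning results into compact prompt context is a short helper rather than an HTML-cleaning pipeline. A minimal sketch; the field names match the API examples below, and the sample data is made up:

```python
def to_context(results: list[dict]) -> str:
    """Render structured search results as compact, numbered context for an LLM prompt."""
    return '\n'.join(
        f"{i}. {r['title']}: {r['snippet']} ({r['url']})"
        for i, r in enumerate(results, 1)
    )

# Made-up sample data in the API's result shape.
sample = [
    {'title': 'Vector DB Roundup', 'snippet': 'Comparing popular options.', 'url': 'https://example.com/a'},
    {'title': 'Choosing a Vector Store', 'snippet': 'Tradeoffs explained.', 'url': 'https://example.com/b'},
]
print(to_context(sample))
```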

Python Example

import os

import requests

HEADERS = {'x-api-key': os.environ['SCAVIO_API_KEY']}

def efficient_search(query, max_results=5):
    resp = requests.post(
        'https://api.scavio.dev/api/v1/search',
        headers=HEADERS,
        json={'platform': 'google', 'query': query},
        timeout=10,
    )
    resp.raise_for_status()  # surface HTTP errors instead of parsing an error body
    results = resp.json().get('organic_results', [])[:max_results]
    return [{'title': item.get('title', ''),
             'snippet': item.get('snippet', ''),
             'url': item.get('link', '')}
            for item in results]

# ~200 tokens vs ~25,000 from raw HTML
context = efficient_search('best vector database 2026')
print(f'{len(context)} results, ~{sum(len(str(c)) for c in context)} chars')

JavaScript Example

const HEADERS = {
  'x-api-key': process.env.SCAVIO_API_KEY,
  'Content-Type': 'application/json',
};

async function efficientSearch(query, maxResults = 5) {
  const resp = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST',
    headers: HEADERS,
    body: JSON.stringify({platform: 'google', query}),
  });
  if (!resp.ok) throw new Error(`Search failed: ${resp.status}`);
  const data = await resp.json();
  return (data.organic_results || []).slice(0, maxResults).map(item => ({
    title: item.title, snippet: item.snippet || '', url: item.link || '',
  }));
}

efficientSearch('best vector database 2026').then(r => console.log(`${r.length} results`));

Platforms Used

Google

Web search with knowledge graph, People Also Ask (PAA), and AI Overviews

Reddit

Community posts and threaded comments from any subreddit

Frequently Asked Questions

Can I try this on the free tier?

Yes. Scavio's free tier includes 250 credits per month with no credit card required. That is enough to validate this solution in your workflow.
