How long does this tutorial on reducing agent search token count take?

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

What do I need before starting?

You need Python 3.8+ with the requests library installed, a Scavio API key from scavio.dev (signup includes 50 free credits), and an LLM agent that uses search tools.

Can I run this tutorial with the free tier?

Yes. The free tier includes 50 credits on signup, which is more than enough to complete this tutorial and prototype a working solution.

What frameworks does this work with?

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt it to your framework of choice.

Reduce Agent Search Token Count (2026)

LLM agents that call web search tools often consume excessive tokens because raw search responses carry metadata and SERP features alongside the titles, snippets, and URLs the agent actually needs. Passing full search responses into an agent's context window wastes tokens and money. This tutorial shows how to compress search results by extracting only the fields the agent needs, truncating snippets, deduplicating content, and formatting results as compact text. You will build a search compression layer that reduces token count by 60-80% while keeping information density high.

Prerequisites

Python 3.8+ installed
requests library installed
A Scavio API key from scavio.dev
An LLM agent that uses search tools

Walkthrough

Step 1: Fetch raw search results

Query the Scavio API and measure the size of the full response. The code counts characters in the serialized JSON, which serves as a rough proxy for token count.

Python

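import json
import os

import requests

API_KEY = os.environ["SCAVIO_API_KEY"]

# Send the query to the Scavio search endpoint.
resp = requests.post("https://api.scavio.dev/api/v1/search",
    headers={"x-api-key": API_KEY},
    json={"platform": "google", "query": "best CRM for startups 2026"},
    timeout=30)
resp.raise_for_status()  # stop early on an HTTP error instead of parsing an error body
raw = resp.json()
# Character count of the serialized response, used as a rough proxy for tokens.
raw_size = len(json.dumps(raw))
print(f"Raw response: {raw_size} chars")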
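
The character count above is only a proxy for tokens. A minimal sketch of turning it into a rough token estimate follows. The helper name estimate_tokens and the figure of about 4 characters per token are assumptions for illustration (a common rule of thumb for English text, not something the Scavio API reports); use your model's own tokenizer when you need exact counts.

```python
import json
import math

# Assumption: ~4 characters per token is only a rule of thumb for English text.
CHARS_PER_TOKEN = 4

def estimate_tokens(text, chars_per_token=CHARS_PER_TOKEN):
    """Return a rough token estimate for a string (hypothetical helper)."""
    return math.ceil(len(text) / chars_per_token)

# Made-up payload standing in for a raw search response.
sample = {
    "organic_results": [
        {"title": "Example CRM", "snippet": "A sample snippet.", "link": "https://example.com"}
    ]
}
sample_text = json.dumps(sample)
print(f"{len(sample_text)} chars, roughly {estimate_tokens(sample_text)} tokens")
```

Because the same estimate is applied before and after compression, the percentage reduction is the same whether you compare characters or estimated tokens.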

Step 2: Extract essential fields only

Strip the response down to only the fields an agent needs: title, snippet, and URL.

Python

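def compress_results(data, max_results=5):
    """Keep only title, snippet, and URL for the first max_results organic results."""
    results = []
    for r in data.get("organic_results", [])[:max_results]:
        results.append({
            # "or ''" also covers fields that are present but null
            "title": (r.get("title") or "")[:80],       # cap titles at 80 chars
            "snippet": (r.get("snippet") or "")[:200],  # cap snippets at 200 chars
            "url": r.get("link") or "",
        })
    return results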

compressed = compress_results(raw)
comp_size = len(json.dumps(compressed))
print(f"Compressed: {comp_size} chars ({100 - round(comp_size/raw_size*100)}% reduction)")
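
To try the extraction step without spending credits, it can be run against a mock response. The sketch below repeats compress_results so it runs on its own. The mock payload is invented: every field other than title, snippet, and link is a hypothetical stand-in for the metadata a real response might carry.

```python
import json

def compress_results(data, max_results=5):
    # Same logic as the step above, repeated so this sketch is self-contained.
    results = []
    for r in data.get("organic_results", [])[:max_results]:
        results.append({
            "title": r.get("title", "")[:80],
            "snippet": r.get("snippet", "")[:200],
            "url": r.get("link", ""),
        })
    return results

# Hypothetical response shape, for illustration only.
mock_raw = {
    "search_metadata": {"id": "abc123", "status": "ok"},
    "organic_results": [
        {
            "position": i,
            "title": f"CRM review {i}",
            "snippet": "Long marketing copy. " * 30,
            "link": f"https://example.com/crm-{i}",
            "displayed_link": "example.com",
            "thumbnail": "https://example.com/thumb.png",
        }
        for i in range(1, 11)
    ],
}

mock_compressed = compress_results(mock_raw)
raw_chars = len(json.dumps(mock_raw))
comp_chars = len(json.dumps(mock_compressed))
print(f"{raw_chars} -> {comp_chars} chars")
```

The savings here come from three places: dropping results beyond max_results, dropping unused fields, and truncating long snippets.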

Step 3: Format as compact text for agent context

Convert structured results to a minimal text format that uses fewer tokens than JSON.

Python

def format_for_agent(results):
    lines = []
    for i, r in enumerate(results, 1):
        lines.append(f"[{i}] {r['title']}")
        lines.append(f"    {r['snippet']}")
        lines.append(f"    {r['url']}")
    return "\n".join(lines)

agent_text = format_for_agent(compressed)
print(f"Agent text: {len(agent_text)} chars")
print(agent_text[:500])
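
To see what the text format saves over JSON, the two serializations can be compared on the same data. This is a sketch with made-up results; it compares character counts only, and actual token savings depend on the model's tokenizer.

```python
import json

def format_for_agent(results):
    # Same formatter as above, repeated so this sketch is self-contained.
    lines = []
    for i, r in enumerate(results, 1):
        lines.append(f"[{i}] {r['title']}")
        lines.append(f"    {r['snippet']}")
        lines.append(f"    {r['url']}")
    return "\n".join(lines)

# Invented sample data, not real search results.
sample_results = [
    {"title": "Example CRM A", "snippet": "Pipeline tracking for small teams.",
     "url": "https://example.com/a"},
    {"title": "Example CRM B", "snippet": "Email sync and simple reporting.",
     "url": "https://example.org/b"},
]

as_json = json.dumps(sample_results)
as_text = format_for_agent(sample_results)
print(f"JSON: {len(as_json)} chars, text: {len(as_text)} chars")
```

The text form is shorter because it drops the repeated key names, quotes, and braces that JSON spends on every result.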

Step 4: Deduplicate overlapping results

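Remove redundant results that waste agent context. This step treats results from the same domain as duplicates and keeps only the first result per domain, so distinct pages from one site are also dropped.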

Python

from urllib.parse import urlparse

def deduplicate(results):
    """Keep only the first result seen for each domain."""
    seen_domains = set()
    unique = []
    for r in results:
        domain = urlparse(r["url"]).netloc
        if domain not in seen_domains:
            seen_domains.add(domain)
            unique.append(r)
    return unique

deduped = deduplicate(compressed)
print(f"After dedup: {len(deduped)} results (was {len(compressed)})")
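
The domain check above does not catch the same text syndicated across different sites. One possible complement, sketched here as an assumption and not as part of the tutorial's pipeline, is to compare snippets with the standard library's difflib. The function name dedupe_similar and the 0.9 threshold are illustrative choices.

```python
from difflib import SequenceMatcher

def dedupe_similar(results, threshold=0.9):
    """Drop results whose snippet closely matches one already kept (hypothetical helper)."""
    unique = []
    for r in results:
        snippet = r["snippet"].lower()
        is_dup = any(
            SequenceMatcher(None, snippet, kept["snippet"].lower()).ratio() >= threshold
            for kept in unique
        )
        if not is_dup:
            unique.append(r)
    return unique

# Invented sample: A and B carry nearly identical text on different domains,
# while A and C share a domain but say different things.
sample = [
    {"title": "A", "snippet": "Top CRM tools for startups compared.", "url": "https://example.com/a"},
    {"title": "B", "snippet": "Top CRM tools for startups compared!", "url": "https://example.org/b"},
    {"title": "C", "snippet": "How to migrate contacts between systems.", "url": "https://example.com/c"},
]
print([r["title"] for r in dedupe_similar(sample)])
```

Each new result is compared against every result already kept, which is fine for a handful of search results but would need a different approach for large lists.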

Step 5: Build the compression wrapper

Combine all compression steps into a single function that replaces the raw search call in your agent.

Python

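def agent_search(query, max_results=5):
    resp = requests.post("https://api.scavio.dev/api/v1/search",
        headers={"x-api-key": API_KEY},
        json={"platform": "google", "query": query},
        timeout=30)
    resp.raise_for_status()  # surface HTTP errors instead of formatting an error body
    # Trim fields, drop repeated domains, then render as compact text.
    compressed = compress_results(resp.json(), max_results)
    deduped = deduplicate(compressed)
    return format_for_agent(deduped)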

result = agent_search("best CRM for startups 2026")
print(f"Final token-efficient output: {len(result)} chars")
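
When the wrapper runs inside an agent loop, it can help to test it without the network and to hand the agent a short message when the search fails. The sketch below is one way to do that, not the tutorial's implementation: compact_search, fake_fetch, and broken_fetch are hypothetical names, and the fetch parameter stands in for the HTTP call. Unlike the wrapper above, this version deduplicates by domain before applying max_results, so dropped duplicates do not shrink the result count.

```python
from urllib.parse import urlparse

def compact_search(query, fetch, max_results=5):
    """Fetch, dedupe by domain, trim, and format; returns a message on failure."""
    try:
        data = fetch(query)
    except Exception as exc:  # broad on purpose: the agent gets text, not a traceback
        return f"Search failed: {exc}"

    kept, seen = [], set()
    for r in data.get("organic_results", []):
        domain = urlparse(r.get("link", "")).netloc
        if domain in seen:
            continue
        seen.add(domain)
        kept.append(r)
        if len(kept) == max_results:
            break

    lines = []
    for i, r in enumerate(kept, 1):
        lines.append(f"[{i}] {r.get('title', '')[:80]}")
        lines.append(f"    {r.get('snippet', '')[:200]}")
        lines.append(f"    {r.get('link', '')}")
    return "\n".join(lines) or "No results."

def fake_fetch(query):
    # Stub standing in for the real API call; the data is invented.
    return {"organic_results": [
        {"title": "One", "snippet": "First.", "link": "https://example.com/1"},
        {"title": "Two", "snippet": "Same site.", "link": "https://example.com/2"},
        {"title": "Three", "snippet": "Other site.", "link": "https://example.org/3"},
    ]}

def broken_fetch(query):
    raise TimeoutError("upstream timed out")

print(compact_search("crm", fake_fetch, max_results=2))
print(compact_search("crm", broken_fetch))
```

In real use, fetch would wrap the requests.post call from Step 1 and return its parsed JSON.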

Python Example

Python

import os
import requests

API_KEY = os.environ["SCAVIO_API_KEY"]
resp = requests.post("https://api.scavio.dev/api/v1/search",
    headers={"x-api-key": API_KEY},
    json={"platform": "google", "query": "best CRM for startups 2026"},
    timeout=30)
resp.raise_for_status()
results = resp.json().get("organic_results", [])[:5]
for r in results:
    # .get() avoids a KeyError when a result has no title or snippet
    print(f"{r.get('title', '')[:80]}\n  {r.get('snippet', '')[:150]}")

JavaScript Example

JavaScript

// Uses top-level await, so run this as an ES module (for example, a .mjs file).
const resp = await fetch("https://api.scavio.dev/api/v1/search", {
  method: "POST",
  headers: {"x-api-key": process.env.SCAVIO_API_KEY, "Content-Type": "application/json"},
  body: JSON.stringify({platform: "google", query: "best CRM for startups 2026"})
});
const data = await resp.json();
// Fall back to empty strings so a missing title or snippet does not throw.
(data.organic_results || []).slice(0, 5).forEach(r =>
  console.log((r.title || "").slice(0, 80), "\n ", (r.snippet || "").slice(0, 150))
);

Expected Output

A compressed text representation of search results that uses 60-80% fewer tokens than the raw JSON response while preserving the title, snippet, and URL of each result.

Related Resources

Best Agent Token Optimization Tools in 2026

Agent Token Optimization

Best Web Search API for Local LLMs in 2026

Reduce Agent Search Tokens with Structured JSON

Token-Efficient Search Context for LLM Pipelines

Token-Efficient Web Search for AI Agents
