Overview
Wraps search API calls in a token budget manager that truncates, summarizes, and filters results before they are passed into the LLM context window. Reduces token consumption by 40-60% compared with passing full raw results, without losing critical information.
Trigger
On every search-augmented LLM call
Schedule
On every search-augmented LLM call
Workflow Steps
Set token budget
Define the maximum tokens allocated for search context (e.g., 2000 tokens out of a 4000-token context window).
Execute search query
Query Scavio's search endpoint for the user's question. Receive full structured results.
Extract essential fields
From each result, keep only title, snippet, and URL. Discard metadata, thumbnails, and other non-essential fields.
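A minimal sketch of this step, assuming each raw result is a dict that uses the `title`, `snippet`, and `link` keys seen in the Python implementation further down. The helper name `extract_essential` and the sample result are hypothetical and only serve as illustration.

```python
def extract_essential(result):
    """Keep only title, snippet, and URL from one raw search result."""
    # Assumption: results are dicts keyed by "title", "snippet", and "link".
    return {
        "title": result.get("title", ""),
        "snippet": result.get("snippet", ""),
        "url": result.get("link", ""),
    }

# Hypothetical raw result with extra fields that should be dropped.
raw = {
    "title": "Example",
    "snippet": "A short summary.",
    "link": "https://example.com",
    "thumbnail": "https://example.com/t.png",  # discarded
    "position": 1,                              # discarded
}
slim = extract_essential(raw)
```

Using `.get` with an empty-string default keeps the step from failing when a result is missing one of the three fields.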
Truncate to budget
Estimate token count per result. Include results until the budget is reached. Truncate the last result if needed.
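One way this step could look, including truncation of the last result. This is a sketch under stated assumptions: the 4-characters-per-token ratio is a rough heuristic rather than a real tokenizer, and `estimate_tokens` and `fit_to_budget` are hypothetical helper names.

```python
def estimate_tokens(text):
    """Rough heuristic: about 4 characters per token (not a real tokenizer)."""
    return len(text) // 4

def fit_to_budget(entries, max_tokens):
    """Return (kept_entries, tokens_used), truncating the last entry if needed."""
    kept, used = [], 0
    for entry in entries:
        cost = estimate_tokens(entry)
        if used + cost <= max_tokens:
            kept.append(entry)
            used += cost
            continue
        remaining = max_tokens - used
        if remaining > 0:
            # Cut the entry down to the characters the remaining budget allows.
            cut = entry[: remaining * 4]
            kept.append(cut)
            used += estimate_tokens(cut)
        break  # the budget is exhausted either way
    return kept, used

entries = ["a" * 40, "b" * 40, "c" * 40]  # 10 estimated tokens each
kept, used = fit_to_budget(entries, max_tokens=25)
```

Cutting from the end of the entry drops the URL first when the entry ends with it, so a variant that shortens only the snippet may be preferable when source attribution matters.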
Format for LLM context
Format the budgeted results as a numbered list with clear source attribution for the LLM to reference.
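A small sketch of the formatting step, assuming each result has already been reduced to a dict with `title`, `snippet`, and `url` keys. The `[n]` numbering and the `(source: ...)` suffix are one possible layout, not a required one, and `format_context` is a hypothetical name.

```python
def format_context(results):
    """Render results as a numbered list with the source URL on each item."""
    lines = []
    for i, r in enumerate(results, start=1):
        lines.append(f"[{i}] {r['title']}: {r['snippet']} (source: {r['url']})")
    return "\n".join(lines)

# Hypothetical, already-trimmed results.
results = [
    {"title": "First", "snippet": "Alpha.", "url": "https://example.com/a"},
    {"title": "Second", "snippet": "Beta.", "url": "https://example.com/b"},
]
formatted = format_context(results)
```

Stable numbers give the model something short to cite, such as "[2]", in place of repeating a full URL in its answer.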
Pass to LLM with remaining budget
Include the formatted search context in the LLM prompt, leaving the remaining token budget for the model's response.
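The budget arithmetic for this last step might be sketched as follows. The prompt wording, the `reserved_overhead` margin, and the function name `build_prompt` are assumptions made for illustration, and the token count again uses the rough 4-characters-per-token heuristic.

```python
def build_prompt(question, context, context_window=4000, reserved_overhead=100):
    """Assemble the prompt and estimate tokens left for the model's response."""
    prompt = (
        "Answer the question using the numbered sources below. "
        "Cite sources by number.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}"
    )
    prompt_tokens = len(prompt) // 4  # rough 4-chars-per-token estimate
    # Whatever the prompt and a small safety margin do not use is left for the reply.
    response_budget = max(0, context_window - prompt_tokens - reserved_overhead)
    return prompt, response_budget

prompt, response_budget = build_prompt(
    "What is a token budget?",
    "[1] Example: A cap on tokens. (source: https://example.com)",
)
```

The returned `response_budget` can then be supplied as the maximum output length in whichever LLM client is in use.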
Python Implementation
import os

import requests

# The API key is read from the environment and sent with every request.
H = {"x-api-key": os.environ["SCAVIO_API_KEY"]}

def budgeted_search(query, platform="google", max_tokens=2000):
    """Return (context, estimated_tokens) for the results that fit within max_tokens."""
    resp = requests.post("https://api.scavio.dev/api/v1/search", headers=H,
                         json={"platform": platform, "query": query}, timeout=10)
    resp.raise_for_status()  # surface HTTP errors instead of parsing an error body
    r = resp.json()
    context = []
    token_count = 0
    for result in r.get("organic", []):
        title = result.get("title", "")
        snippet = result.get("snippet", "")
        url = result.get("link", "")
        entry = f"- {title}: {snippet} ({url})"
        est_tokens = len(entry) // 4  # rough estimate: about 4 characters per token
        if token_count + est_tokens > max_tokens:
            break  # stop at the first result that would exceed the budget
        context.append(entry)
        token_count += est_tokens
    return "\n".join(context), token_count

context, tokens_used = budgeted_search("best search api for agents 2026")
print(f"Used ~{tokens_used} tokens for search context")
print(context)

JavaScript Implementation
const H = {"x-api-key": process.env.SCAVIO_API_KEY, "Content-Type": "application/json"};

async function budgetedSearch(query, platform = "google", maxTokens = 2000) {
  const res = await fetch("https://api.scavio.dev/api/v1/search", {
    method: "POST", headers: H,
    body: JSON.stringify({platform, query})
  });
  if (!res.ok) throw new Error(`Search request failed: ${res.status}`);
  const r = await res.json();
  const context = [];
  let tokenCount = 0;
  for (const result of (r.organic || [])) {
    // Default missing fields to "" so the string "undefined" never reaches the context
    const entry = `- ${result.title || ""}: ${result.snippet || ""} (${result.link || ""})`;
    const est = Math.ceil(entry.length / 4); // rough estimate: about 4 characters per token
    if (tokenCount + est > maxTokens) break; // stop at the first result over budget
    context.push(entry);
    tokenCount += est;
  }
  return {context: context.join("\n"), tokensUsed: tokenCount};
}

Platforms Used
Google
Web search with knowledge graph, PAA, and AI overviews
Reddit
Community, posts & threaded comments from any subreddit
YouTube
Video search with transcripts and metadata
Amazon
Product search with prices, ratings, and reviews