Workflow

Token-Budgeted Search for Cost-Efficient Agents

Reduce AI agent costs by implementing token budgets for search results. Truncate and summarize search data before feeding it to the LLM.

Overview

This workflow wraps search API calls in a token budget manager that truncates, summarizes, and filters results before passing them to the LLM context window, reducing token consumption by 40-60% without losing critical information.

Trigger

On every search-augmented LLM call

Workflow Steps

1. Set token budget

Define the maximum tokens allocated for search context (e.g., 2000 tokens out of a 4000-token context window).
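
A minimal sketch of this budget split, assuming a 4000-token window; the constant names and the 500-token prompt allowance are illustrative, not part of the API:

Python
CONTEXT_WINDOW = 4000    # total tokens available to the model
SEARCH_BUDGET = 2000     # tokens reserved for search context (step 1)
PROMPT_OVERHEAD = 500    # assumed allowance for instructions and the question
RESPONSE_BUDGET = CONTEXT_WINDOW - SEARCH_BUDGET - PROMPT_OVERHEAD  # left for the reply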

2. Execute search query

Query Scavio's search endpoint with the user's question and receive the full structured results.
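
A sketch of the raw call and the response shape assumed by the implementations below; fields beyond title, snippet, and link (such as the thumbnail shown) vary by platform and are illustrative:

Python
import requests, os

H = {"x-api-key": os.environ["SCAVIO_API_KEY"]}
raw = requests.post("https://api.scavio.dev/api/v1/search", headers=H,
    json={"platform": "google", "query": "best search api for agents 2026"},
    timeout=10).json()

# The implementations below read an "organic" list of results, e.g.:
# raw["organic"][0] -> {"title": "...", "snippet": "...", "link": "...",
#                       "thumbnail": "...", ...}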

3. Extract essential fields

From each result, keep only the title, snippet, and URL. Discard metadata, thumbnails, and other non-essential fields.
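
Continuing the sketch above, a one-step trim that keeps only the three fields the budget loop needs:

Python
ESSENTIAL = ("title", "snippet", "link")

# Drop metadata, thumbnails, and other non-essential fields from each result.
trimmed = [{k: r.get(k, "") for k in ESSENTIAL} for r in raw.get("organic", [])]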

4. Truncate to budget

Estimate token count per result. Include results until the budget is reached. Truncate the last result if needed.
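
The implementations below estimate tokens with a cheap chars/4 heuristic. For exact counts you can swap in a real tokenizer such as tiktoken; the optional alternative is shown commented out, and the cl100k_base encoding is an assumption about your model:

Python
def estimate_tokens(text):
    # Cheap heuristic: English text averages roughly 4 characters per token.
    return len(text) // 4

# More precise alternative with a real tokenizer (optional dependency):
# import tiktoken
# enc = tiktoken.get_encoding("cl100k_base")
# def estimate_tokens(text):
#     return len(enc.encode(text))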

5. Format for LLM context

Format the budgeted results as a numbered list with clear source attribution for the LLM to reference.
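
The implementations below use a simple bulleted format; a numbered variant with explicit source attribution, as this step describes, might look like the following sketch (the exact layout is illustrative):

Python
def format_context(trimmed_results):
    lines = []
    for i, r in enumerate(trimmed_results, start=1):
        # "[1] Title: snippet (Source: url)" gives the model a citable index.
        lines.append(f"[{i}] {r['title']}: {r['snippet']} (Source: {r['link']})")
    return "\n".join(lines)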

6. Pass to LLM with remaining budget

Include the formatted search context in the LLM prompt, leaving the remaining token budget for the model's response.
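
A sketch of the final assembly, assuming an OpenAI-style chat client; the import, model name, prompt wording, and 500-token prompt allowance are all placeholders for whatever LLM API you use:

Python
from openai import OpenAI  # any LLM client works; OpenAI shown as an assumption

def ask_with_search(question, context, tokens_used, context_window=4000):
    # Whatever the search context did not consume stays available for the reply.
    response_budget = context_window - tokens_used - 500  # assumed prompt allowance
    prompt = f"Answer using these search results:\n\n{context}\n\nQuestion: {question}"
    client = OpenAI()
    return client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        max_tokens=response_budget,
    )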

Python Implementation

Python
import requests, os

H = {"x-api-key": os.environ["SCAVIO_API_KEY"]}

def budgeted_search(query, platform="google", max_tokens=2000):
    # Step 2: fetch full structured results from the search endpoint.
    r = requests.post("https://api.scavio.dev/api/v1/search", headers=H,
        json={"platform": platform, "query": query}, timeout=10).json()
    context = []
    token_count = 0
    for result in r.get("organic", []):
        # Step 3: keep only the essential fields.
        title = result.get("title", "")
        snippet = result.get("snippet", "")
        url = result.get("link", "")
        entry = f"- {title}: {snippet} ({url})"
        # Rough heuristic: ~4 characters per token for English text.
        est_tokens = len(entry) // 4
        if token_count + est_tokens > max_tokens:
            # Step 4: truncate the last result to fit the remaining budget.
            remaining_chars = (max_tokens - token_count) * 4
            if remaining_chars > 0:
                context.append(entry[:remaining_chars])
                token_count = max_tokens
            break
        context.append(entry)
        token_count += est_tokens
    return "\n".join(context), token_count

context, tokens_used = budgeted_search("best search api for agents 2026")
print(f"Used ~{tokens_used} tokens for search context")
print(context)

JavaScript Implementation

JavaScript
const H = {"x-api-key": process.env.SCAVIO_API_KEY, "Content-Type": "application/json"};

async function budgetedSearch(query, platform = "google", maxTokens = 2000) {
  // Step 2: fetch full structured results from the search endpoint.
  const r = await fetch("https://api.scavio.dev/api/v1/search", {
    method: "POST", headers: H,
    body: JSON.stringify({platform, query})
  }).then(res => res.json());
  const context = [];
  let tokenCount = 0;
  for (const result of (r.organic || [])) {
    // Step 3: keep only the essential fields.
    const entry = `- ${result.title ?? ""}: ${result.snippet ?? ""} (${result.link ?? ""})`;
    // Rough heuristic: ~4 characters per token for English text.
    const est = Math.ceil(entry.length / 4);
    if (tokenCount + est > maxTokens) {
      // Step 4: truncate the last result to fit the remaining budget.
      const remainingChars = (maxTokens - tokenCount) * 4;
      if (remainingChars > 0) {
        context.push(entry.slice(0, remainingChars));
        tokenCount = maxTokens;
      }
      break;
    }
    context.push(entry);
    tokenCount += est;
  }
  return {context: context.join("\n"), tokensUsed: tokenCount};
}
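
For parity with the Python example, a usage call (top-level await requires an ES module or a modern Node REPL):

JavaScript
const {context, tokensUsed} = await budgetedSearch("best search api for agents 2026");
console.log(`Used ~${tokensUsed} tokens for search context`);
console.log(context);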

Platforms Used

Google

Web search with knowledge graph, People Also Ask (PAA), and AI Overviews

Reddit

Community posts & threaded comments from any subreddit

YouTube

Video search with transcripts and metadata

Amazon

Product search with prices, ratings, and reviews
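
Because every platform goes through the same unified endpoint, switching sources is just the platform argument. A sketch reusing budgeted_search from the Python implementation above:

Python
# Compare how much search context each platform produces for the same query.
for platform in ("google", "reddit", "youtube", "amazon"):
    context, used = budgeted_search("best search api for agents 2026", platform=platform)
    print(f"{platform}: ~{used} tokens of search context")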

Frequently Asked Questions

What does this workflow do?

It wraps search API calls in a token budget manager that truncates, summarizes, and filters results before passing them to the LLM context window, reducing token consumption by 40-60% without losing critical information.

When does this workflow run?

It runs on every search-augmented LLM call.

Which platforms does it use?

This workflow uses the following Scavio platforms: Google, Reddit, YouTube, and Amazon. Each platform is called via the same unified API endpoint.

Can I try it for free?

Yes. Scavio's free tier includes 500 credits per month with no credit card required. That is enough to test and validate this workflow before scaling it.
