How to Set Up Token Budgets for Search API Calls

Control how many tokens your agent spends on search results. Build a budget system that limits context window usage per tool call.

An agent that calls a search API can spend thousands of tokens on a single query if the results are passed through untrimmed. A token budget caps each search result at a defined limit, which keeps the agent's context window available for reasoning. This matters most for agents that make several tool calls per turn.
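
To make the trade-off concrete, here is a minimal sketch of how a per-call budget could be derived from a context window. The window size, the reasoning reserve, and the call count are illustrative assumptions rather than values from Scavio or any model vendor, and `per_call_budget` is a hypothetical helper.

```python
def per_call_budget(context_window: int, reserved_for_reasoning: int,
                    calls_per_turn: int) -> int:
    """Split the tokens left after the reasoning reserve evenly across tool calls."""
    if calls_per_turn <= 0:
        raise ValueError("calls_per_turn must be positive")
    available = max(0, context_window - reserved_for_reasoning)
    return available // calls_per_turn

# Illustrative numbers only: an 8,000-token window, 6,500 tokens kept for
# prompts and reasoning, and 5 tool calls in one turn.
print(per_call_budget(8000, 6500, 5))  # 300
```

With these assumed numbers the result happens to match the 300-token default used in the steps below.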

Prerequisites

  • Python 3.8+
  • tiktoken installed (pip install tiktoken)
  • A Scavio API key

Walkthrough

Step 1: Install tiktoken for token counting

Set up token counting to measure search result sizes.

Bash
pip install tiktoken
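
Once the package is installed, a quick check confirms that counting works. The sketch below wraps the counter in a hypothetical helper, `make_token_counter`; the fallback to a rough 4-characters-per-token estimate when tiktoken cannot be loaded is an assumption added here, not part of tiktoken.

```python
def make_token_counter(model: str = "gpt-4"):
    """Return a function that maps text to a token count."""
    try:
        import tiktoken
        enc = tiktoken.encoding_for_model(model)
        return lambda text: len(enc.encode(text))
    except Exception:
        # Assumption: if tiktoken is missing or its encoding files cannot be
        # loaded, fall back to a rough 4-characters-per-token estimate.
        return lambda text: (len(text) + 3) // 4

count_tokens = make_token_counter()
sample = "Token budgets keep search results small."
print(count_tokens(sample))
```

The exact number printed depends on which counter is active, so treat the heuristic as an approximation only.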

Step 2: Build a budgeted search function

Create a search function that respects a token budget.

Python
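import os

import requests
import tiktoken

# Shared request headers; the API key is read from the environment.
H = {'x-api-key': os.environ['SCAVIO_API_KEY'], 'Content-Type': 'application/json'}
# Tokenizer used to measure each result line.
enc = tiktoken.encoding_for_model('gpt-4')

def budgeted_search(query: str, platform: str = 'google', max_tokens: int = 300) -> str:
    resp = requests.post('https://api.scavio.dev/api/v1/search', headers=H,
        json={'platform': platform, 'query': query}, timeout=10)
    resp.raise_for_status()  # Fail on HTTP errors instead of parsing an error body
    results = resp.json().get('organic', [])

    output_lines = []
    token_count = 0

    for r in results:
        line = f"{r.get('title','')}: {r.get('snippet','')}"
        line_tokens = len(enc.encode(line))

        # Stop at the first result that would push the total over budget.
        if token_count + line_tokens > max_tokens:
            break

        output_lines.append(line)
        token_count += line_tokens

    # Lines are counted one by one, so the joined string can differ by a few tokens.
    return '\n'.join(output_lines)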
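
The trimming logic can be tried without an API key. The sketch below isolates the same greedy loop in a hypothetical helper, `trim_to_budget`, and runs it on made-up results with a rough 4-characters-per-token counter; neither the data nor the counter comes from Scavio or tiktoken.

```python
from typing import Callable, Dict, List

def trim_to_budget(results: List[Dict[str, str]], max_tokens: int,
                   count_tokens: Callable[[str], int]) -> str:
    """Keep results in order until the next one would exceed max_tokens."""
    lines, used = [], 0
    for r in results:
        line = f"{r.get('title', '')}: {r.get('snippet', '')}"
        cost = count_tokens(line)
        if used + cost > max_tokens:
            break  # same early stop as budgeted_search
        lines.append(line)
        used += cost
    return "\n".join(lines)

# Made-up results, for illustration only.
fake_results = [
    {"title": "A", "snippet": "x" * 37},  # 40 characters -> 10 tokens
    {"title": "B", "snippet": "y" * 77},  # 80 characters -> 20 tokens
    {"title": "C", "snippet": "z" * 5},   # 8 characters  -> 2 tokens
]
approx = lambda text: (len(text) + 3) // 4  # rough 4-chars-per-token estimate

print(trim_to_budget(fake_results, 15, approx))  # only the "A" line is kept
```

Because the loop stops at the first result that does not fit, the short third result is dropped even though it would fit on its own. Replacing `break` with `continue` would keep it, at the cost of skipping a higher-ranked result.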

Step 3: Create a daily budget tracker

Track total token and credit usage across all search calls.

Python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class SearchBudget:
    daily_credit_limit: int = 100
    daily_token_limit: int = 50000
    credits_used: int = 0
    tokens_used: int = 0
    date: str = field(default_factory=lambda: date.today().isoformat())
    
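    def can_search(self) -> bool:
        # Start a fresh budget when the calendar day changes.
        if self.date != date.today().isoformat():
            self.reset()
        # Both the credit limit and the token limit must have headroom.
        return (self.credits_used < self.daily_credit_limit
                and self.tokens_used < self.daily_token_limit)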
    
    def record(self, tokens: int):
        self.credits_used += 1
        self.tokens_used += tokens
    
    def reset(self):
        self.credits_used = 0
        self.tokens_used = 0
        self.date = date.today().isoformat()
    
    def remaining(self) -> dict:
        return {'credits': self.daily_credit_limit - self.credits_used,
                'tokens': self.daily_token_limit - self.tokens_used}

budget = SearchBudget(daily_credit_limit=50)

def search_with_budget(query: str, platform: str = 'google', max_tokens: int = 300) -> str:
    if not budget.can_search():
        return '[Budget exceeded - no more searches today]'
    result = budgeted_search(query, platform, max_tokens)
    budget.record(len(enc.encode(result)))
    return result
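
The day rollover is hard to observe when the tracker reads the real clock. The sketch below is a hypothetical variant, `DailyBudget`, that takes the clock as a parameter so the reset can be exercised directly; the class name, field names, and injected `today` callable are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Callable

@dataclass
class DailyBudget:
    """Hypothetical variant of SearchBudget with an injectable clock."""
    credit_limit: int = 100
    token_limit: int = 50000
    today: Callable[[], date] = date.today
    credits_used: int = 0
    tokens_used: int = 0
    day: date = field(init=False)

    def __post_init__(self):
        self.day = self.today()

    def can_search(self) -> bool:
        # Reset the counters when the injected clock reports a new day.
        if self.day != self.today():
            self.credits_used = 0
            self.tokens_used = 0
            self.day = self.today()
        return (self.credits_used < self.credit_limit
                and self.tokens_used < self.token_limit)

    def record(self, tokens: int) -> None:
        self.credits_used += 1
        self.tokens_used += tokens

# A one-element list stands in for the clock so the date can be advanced.
clock = [date(2025, 1, 1)]
b = DailyBudget(credit_limit=2, token_limit=500, today=lambda: clock[0])
b.record(120)
b.record(90)
print(b.can_search())  # False: both credits are spent
clock[0] = date(2025, 1, 2)
print(b.can_search())  # True: counters reset on the new day
```

Note that the counters live in memory, so they also reset whenever the process restarts.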

Step 4: Integrate with agent framework

Wire the budgeted search into your agent's tool system.

Python
# Example with a simple agent loop.
# Note: search_with_budget() records usage on the module-level `budget`,
# so pass that same object here to keep the reported numbers consistent.
def agent_research(question: str, budget: SearchBudget) -> str:
    # Adaptive token budget: at most 300 tokens per search, and never more
    # than a third of the remaining daily allowance (clamped at zero).
    remaining = budget.remaining()
    per_search_budget = max(0, min(300, remaining['tokens'] // 3))
    
    # Search with budget
    google_ctx = search_with_budget(question, 'google', per_search_budget)
    reddit_ctx = search_with_budget(question, 'reddit', per_search_budget)
    
    context = f"Google:\n{google_ctx}\n\nReddit:\n{reddit_ctx}"
    total_tokens = len(enc.encode(context))
    
    # credits_used is the running daily total; total_tokens covers this call only.
    print(f"Credits used today: {budget.credits_used}; this context: {total_tokens} tokens")
    print(f"Remaining: {budget.remaining()}")
    
    return context
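
Many agent frameworks describe a tool as a name, a description, and a JSON Schema for its arguments. The sketch below assumes the OpenAI-style function-calling layout; your framework's format may differ. `SEARCH_TOOL`, `handle_tool_call`, and the 300-token ceiling are hypothetical, and the search function is passed in so the dispatcher can run without network access.

```python
import json
from typing import Callable

# Assumed tool description in the OpenAI-style function-calling layout.
SEARCH_TOOL = {
    "type": "function",
    "function": {
        "name": "search",
        "description": "Search the web and return results trimmed to a token budget.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string"},
                "platform": {"type": "string", "enum": ["google", "reddit"]},
                "max_tokens": {"type": "integer", "minimum": 1},
            },
            "required": ["query"],
        },
    },
}

MAX_TOKENS_CEILING = 300  # hypothetical hard cap, whatever the model requests

def handle_tool_call(name: str, arguments_json: str,
                     search_fn: Callable[[str, str, int], str]) -> str:
    """Dispatch a model-issued tool call, clamping the requested token budget."""
    if name != "search":
        return f"[Unknown tool: {name}]"
    args = json.loads(arguments_json)
    requested = int(args.get("max_tokens", MAX_TOKENS_CEILING))
    max_tokens = max(1, min(requested, MAX_TOKENS_CEILING))
    return search_fn(args["query"], args.get("platform", "google"), max_tokens)

# Stub standing in for search_with_budget so the example runs offline.
def fake_search(query: str, platform: str, max_tokens: int) -> str:
    return f"{platform}:{query}:{max_tokens}"

print(handle_tool_call("search", '{"query": "token budgets", "max_tokens": 5000}', fake_search))
# google:token budgets:300
```

Clamping in the dispatcher means a model that asks for a very large `max_tokens` still cannot exceed the ceiling. In a real agent, `search_with_budget` would take the place of `fake_search`.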

Python Example

Python
import os

import requests
import tiktoken

H = {'x-api-key': os.environ['SCAVIO_API_KEY'], 'Content-Type': 'application/json'}
enc = tiktoken.encoding_for_model('gpt-4')

def budgeted_search(query, max_tokens=300):
    resp = requests.post('https://api.scavio.dev/api/v1/search', headers=H,
        json={'platform': 'google', 'query': query}, timeout=10)
    resp.raise_for_status()  # Fail on HTTP errors
    lines, tokens = [], 0
    for x in resp.json().get('organic', []):
        # .get() avoids a KeyError when a result lacks a title or snippet
        line = f"{x.get('title','')}: {x.get('snippet','')}"
        t = len(enc.encode(line))
        if tokens + t > max_tokens:
            break
        lines.append(line)
        tokens += t
    return '\n'.join(lines)

JavaScript Example

JavaScript
// Token counting in JS (approximate, using a 4-characters-per-token heuristic):
async function budgetedSearch(query, maxTokens = 300) {
  const r = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST',
    headers: {'x-api-key': process.env.SCAVIO_API_KEY, 'Content-Type': 'application/json'},
    body: JSON.stringify({platform: 'google', query})
  });
  // Fail on HTTP errors instead of parsing an error body
  if (!r.ok) throw new Error(`Search request failed: ${r.status}`);
  const results = (await r.json()).organic || [];
  const lines = [];
  let tokens = 0;
  for (const x of results) {
    const line = `${x.title || ''}: ${x.snippet || ''}`;
    const t = Math.ceil(line.length / 4);
    if (tokens + t > maxTokens) break;
    lines.push(line);
    tokens += t;
  }
  return lines.join('\n');
}

Expected Output

After completing the steps, you have a token budget system that caps how many tokens search results consume in your agent's context window, along with daily credit tracking.

Frequently Asked Questions

How long does this tutorial take?

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

What do I need before starting?

Python 3.8+, tiktoken (pip install tiktoken), and a Scavio API key. A Scavio API key gives you 500 free credits per month.

Can I follow along on the free tier?

Yes. The free tier includes 500 credits per month, which is more than enough to complete this tutorial and prototype a working solution.

Does this work with my agent framework?

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt it to your framework of choice.
