How to Audit Agent Token Usage per Tool

Build a token usage tracker that measures how many tokens each tool call consumes in your AI agent. Identify expensive search calls and optimize.

To audit agent token usage per tool, wrap each tool call with input and output size measurement and log the results to a structured store. Search tools are often the largest token consumers in agent workflows because they return verbose HTML snippets and metadata. Knowing how many tokens each Scavio search call contributes helps you set budgets, prune unnecessary fields, and reduce costs. This tutorial builds a lightweight audit layer that sits between your agent and its tools.

Prerequisites

  • Python 3.8+ installed
  • requests library installed
  • A Scavio API key from scavio.dev
  • tiktoken library installed (pip install tiktoken)

Walkthrough

Step 1: Set up the token counter

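Use tiktoken to count tokens in the tool input (query) and output (results) for each API call. The cl100k_base encoding used here is one of OpenAI's tokenizers; if your agent runs on a model with a different tokenizer, treat the counts as estimates.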

Python
import tiktoken, os, requests, json, time

API_KEY = os.environ['SCAVIO_API_KEY']
enc = tiktoken.get_encoding('cl100k_base')

def count_tokens(text: str) -> int:
    return len(enc.encode(text))

audit_log = []

Step 2: Wrap the search call with auditing

Create an audited search function that measures tokens before and after each call and appends the result to the audit log.

Python
def audited_search(query: str, platform: str = 'google') -> dict:
    input_tokens = count_tokens(json.dumps({'platform': platform, 'query': query}))
    start = time.monotonic()
    resp = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': API_KEY},
        json={'platform': platform, 'query': query}, timeout=15)
    latency_ms = round((time.monotonic() - start) * 1000)
    resp.raise_for_status()  # surface HTTP errors instead of auditing an error body
    data = resp.json()
    output_text = json.dumps(data)
    output_tokens = count_tokens(output_text)
    audit_log.append({
        'tool': 'search', 'platform': platform, 'query': query,
        'input_tokens': input_tokens, 'output_tokens': output_tokens,
        'total_tokens': input_tokens + output_tokens, 'latency_ms': latency_ms,
        'timestamp': time.time(),
    })
    return data
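
The same wrapping idea can be generalized beyond search. The sketch below is one possible way to audit any tool function with a decorator; it is not part of the Scavio API. The names audited, approx_tokens, and the add tool are hypothetical, and the counter is a rough 4-characters-per-token heuristic so the block runs without tiktoken. Pass the count_tokens function from Step 1 as counter for tokenizer-based counts.

```python
import functools
import json
import math
import time

audit_log = []  # same record shape as the tutorial's in-memory log


def approx_tokens(text: str) -> int:
    # Assumption: roughly 4 characters per token. This is only an estimate.
    return math.ceil(len(text) / 4)


def audited(tool_name: str, counter=approx_tokens):
    """Decorator factory that records token counts and latency for a tool."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            # Serialize the call arguments to measure the input size.
            input_text = json.dumps({'args': args, 'kwargs': kwargs}, default=str)
            start = time.monotonic()
            result = fn(*args, **kwargs)
            latency_ms = round((time.monotonic() - start) * 1000)
            # Serialize the result to measure the output size.
            output_text = json.dumps(result, default=str)
            input_tokens = counter(input_text)
            output_tokens = counter(output_text)
            audit_log.append({
                'tool': tool_name,
                'input_tokens': input_tokens,
                'output_tokens': output_tokens,
                'total_tokens': input_tokens + output_tokens,
                'latency_ms': latency_ms,
                'timestamp': time.time(),
            })
            return result
        return wrapper
    return decorator


@audited('calculator')
def add(a: int, b: int) -> dict:
    # Hypothetical stand-in for a real agent tool.
    return {'sum': a + b}


add(2, 3)
print(audit_log[-1]['tool'], audit_log[-1]['total_tokens'])
```

Because the decorator only depends on the tool's arguments and return value being serializable, the same wrapper can sit in front of search, retrieval, or code-execution tools, and the records can be compared side by side.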

Step 3: Run several queries and collect data

Execute a batch of representative queries to build up the audit log with real usage data.

Python
test_queries = [
    ('best crm for startups 2026', 'google'),
    ('wireless earbuds under 50', 'amazon'),
    ('python async tutorial', 'youtube'),
    ('is scavio api good', 'reddit'),
]

for query, platform in test_queries:
    audited_search(query, platform)
    time.sleep(0.5)

print(f'Collected {len(audit_log)} audit records')
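
The audit log in this walkthrough lives in memory and is lost when the script exits. As a minimal sketch of a structured store, the records could be appended to a JSON Lines file, one record per line. The helper names save_audit_log and load_audit_log and the file name audit_log.jsonl are assumptions for illustration, and the sample records are made up.

```python
import json
import os
import tempfile


def save_audit_log(log: list, path: str) -> int:
    """Append each record as one JSON line and return the number written."""
    with open(path, 'a', encoding='utf-8') as f:
        for entry in log:
            f.write(json.dumps(entry) + '\n')
    return len(log)


def load_audit_log(path: str) -> list:
    """Read all records back, skipping blank lines."""
    with open(path, encoding='utf-8') as f:
        return [json.loads(line) for line in f if line.strip()]


# Made-up records with the same keys the audited search produces.
sample_log = [
    {'tool': 'search', 'platform': 'google', 'total_tokens': 1200},
    {'tool': 'search', 'platform': 'amazon', 'total_tokens': 2400},
]

# Demo in a temporary directory so nothing is left on disk.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'audit_log.jsonl')
    save_audit_log(sample_log, path)
    restored = load_audit_log(path)

print(f'Restored {len(restored)} records')
```

Appending rather than overwriting means repeated runs accumulate history, which makes it possible to compare token usage across days or prompt versions.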

Step 4: Generate the usage report

Aggregate the audit log to show total tokens, average tokens per call, and a per-platform breakdown, then identify the most expensive query.

Python
def generate_report(log: list) -> None:
    # Guard against an empty log: max() below would raise ValueError.
    if not log:
        print('No audit records to report')
        return
    total = sum(e['total_tokens'] for e in log)
    print(f'Total token usage: {total:,}')
    print(f'Average per call: {total // len(log):,}')
    print('\nPer-platform breakdown:')
    # Accumulate token totals and call counts per platform in one pass.
    platforms, calls = {}, {}
    for e in log:
        p = e['platform']
        platforms[p] = platforms.get(p, 0) + e['total_tokens']
        calls[p] = calls.get(p, 0) + 1
    for p, tokens in sorted(platforms.items(), key=lambda x: -x[1]):
        print(f'  {p}: {tokens:,} tokens ({calls[p]} calls)')
    print('\nMost expensive query:')
    top = max(log, key=lambda x: x['total_tokens'])
    print(f'  "{top["query"]}" on {top["platform"]}: {top["total_tokens"]:,} tokens')

generate_report(audit_log)
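
Once the report points at an expensive call, one common optimization is to drop response fields the agent never reads before the result enters the context window. The sketch below measures the effect on a made-up payload. The response shape (a results list with title, url, snippet, raw_html, and position keys) is an assumption for illustration and may not match the real Scavio schema, and approx_tokens is a rough 4-characters-per-token estimate rather than a tokenizer count.

```python
import json
import math


def approx_tokens(text: str) -> int:
    # Assumption: roughly 4 characters per token.
    return math.ceil(len(text) / 4)


def prune_results(data: dict, keep: set, list_key: str = 'results') -> dict:
    """Return a copy that keeps only the whitelisted keys of each result item."""
    items = data.get(list_key, [])
    return {list_key: [{k: v for k, v in item.items() if k in keep}
                       for item in items]}


# Hypothetical response payload; real field names may differ.
sample = {
    'results': [
        {
            'title': 'CRM A',
            'url': 'https://example.com/a',
            'snippet': 'Short summary of the page.',
            'raw_html': '<div>' + 'x' * 400 + '</div>',
            'position': 1,
        }
    ]
}

pruned = prune_results(sample, keep={'title', 'url', 'snippet'})
before = approx_tokens(json.dumps(sample))
after = approx_tokens(json.dumps(pruned))
print(f'{before} -> {after} tokens after pruning')
```

A whitelist is used instead of a blacklist so that new fields added to the response later do not silently increase token usage.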

Python Example

Python
import tiktoken, requests, os, json, time
H = {'x-api-key': os.environ['SCAVIO_API_KEY']}
enc = tiktoken.get_encoding('cl100k_base')

def audited_search(query, platform='google'):
    data = requests.post('https://api.scavio.dev/api/v1/search', headers=H,
        json={'platform': platform, 'query': query}, timeout=15).json()
    tokens = len(enc.encode(json.dumps(data)))
    print(f'{platform}:{query} -> {tokens} tokens')
    return data

audited_search('best crm 2026')
audited_search('wireless earbuds', 'amazon')
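
Printing token counts is useful for inspection, but acting on them usually means enforcing a budget. The following is a minimal sketch of a budget tracker that stops issuing further calls once a limit is reached. The TokenBudget class is hypothetical, and the per-call token counts are made-up numbers standing in for measured values.

```python
class TokenBudget:
    """Track cumulative token spend against a fixed limit."""

    def __init__(self, limit: int):
        self.limit = limit
        self.used = 0

    def record(self, tokens: int) -> None:
        # Tokens are already spent once a tool returns, so always record them.
        self.used += tokens

    @property
    def exhausted(self) -> bool:
        return self.used >= self.limit


budget = TokenBudget(limit=1000)
calls_made = 0

# Made-up per-call token counts in place of real measurements.
for tokens in [400, 450, 300, 200]:
    if budget.exhausted:
        break  # skip remaining calls once the budget is spent
    budget.record(tokens)
    calls_made += 1

print(f'Made {calls_made} calls, used {budget.used} of {budget.limit} tokens')
```

Checking the budget before each call, rather than after, means the last call may overshoot the limit; a stricter variant could estimate the next call's cost from the audit log's per-platform average and skip it in advance.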

JavaScript Example

JavaScript
const H = {'x-api-key': process.env.SCAVIO_API_KEY, 'Content-Type': 'application/json'};
async function auditedSearch(query, platform = 'google') {
  const r = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST', headers: H, body: JSON.stringify({platform, query})
  });
  const data = await r.json();
  // Rough estimate: about 4 characters per token; use a tokenizer library for exact counts.
  const chars = JSON.stringify(data).length;
  console.log(`${platform}:${query} -> ~${Math.ceil(chars / 4)} tokens`);
  return data;
}
await auditedSearch('best crm 2026');

Expected Output

A token usage report showing total tokens consumed per platform, average tokens per API call, and the most expensive individual query.

Frequently Asked Questions

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

The free tier includes 500 credits per month, which is more than enough to complete this tutorial and prototype a working solution.

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt it to your framework of choice.
