Workflow

Agent Token Usage Audit Pipeline

Track and reduce token spend per agent tool call. Audit search context size and optimize for cost.

Overview

Agent teams know their total LLM spend but rarely know how much each tool call contributes. This workflow audits the token cost of search-augmented agent calls by measuring context size before and after search injection. It identifies which queries produce oversized contexts, which tool calls return low-value results, and where structured search could replace raw page fetches to cut costs.
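As a rough illustration of "measuring context size before and after search injection", the sketch below compares an approximate token count of a prompt before and after search results are appended. The function name `injection_overhead` and the 4-characters-per-token heuristic are assumptions for illustration; a real audit would likely use the tokenizer of the model in use.

```python
def injection_overhead(prompt_before: str, prompt_after: str,
                       chars_per_token: int = 4) -> dict:
    """Estimate how many tokens search injection added to a prompt.

    Token counts are approximated as characters // chars_per_token,
    which is a heuristic, not a real tokenizer.
    """
    tokens_before = len(prompt_before) // chars_per_token
    tokens_after = len(prompt_after) // chars_per_token
    # Clamp at zero in case the "after" prompt was trimmed elsewhere.
    search_tokens = max(tokens_after - tokens_before, 0)
    # Share of the final prompt that is attributable to search context.
    share = round(search_tokens / tokens_after * 100, 1) if tokens_after else 0.0
    return {
        "tokens_before": tokens_before,
        "tokens_after": tokens_after,
        "search_tokens": search_tokens,
        "search_share_pct": share,
    }
```

A high `search_share_pct` on a call suggests that search context, rather than the conversation itself, dominates that call's input cost.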

Trigger

Cron schedule: every Monday at 7 AM UTC (cron expression 0 7 * * 1, evaluated in UTC)

Schedule

Every Monday at 7 AM UTC

Workflow Steps

1

Collect agent logs

Pull the past week's agent interaction logs including tool calls, context sizes, and search queries.

2

Measure search context tokens

For each search-related tool call, calculate the token count of the injected search context.

3

Benchmark against structured search

Re-run a sample of queries through Scavio's structured API. Compare the token count of the structured results with that of the original injected context.

4

Calculate savings potential

Estimate weekly token and cost savings if all search contexts used structured results.

5

Identify worst offenders

Rank queries by context size. Flag the top 10 most expensive search-grounded calls.

6

Generate audit report

Create a summary with total search token spend, savings potential, and specific queries to optimize.
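Steps 2 and 5 above can be sketched as follows. This is a minimal example under stated assumptions: the logs are JSON Lines, and each record has hypothetical `tool`, `query`, and `context` fields. Real agent logs will have a different schema, and the token count is the same characters-divided-by-4 approximation, not a tokenizer.

```python
import json
from typing import Dict, Iterable, List

CHARS_PER_TOKEN = 4  # rough heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate a token count from character length."""
    return len(text) // CHARS_PER_TOKEN


def rank_search_calls(log_lines: Iterable[str], top_n: int = 10) -> List[Dict]:
    """Return the top_n search tool calls ordered by estimated context tokens."""
    calls = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the log file
        record = json.loads(line)
        # Hypothetical schema: {"tool": ..., "query": ..., "context": ...}
        if record.get("tool") != "search":
            continue  # only search-related tool calls are audited
        calls.append({
            "query": record.get("query", ""),
            "context_tokens": estimate_tokens(record.get("context") or ""),
        })
    # Largest contexts first, so the worst offenders lead the list.
    calls.sort(key=lambda c: c["context_tokens"], reverse=True)
    return calls[:top_n]
```

Passing an open log file works directly, since a file object iterates line by line, for example `rank_search_calls(open("agent_logs.jsonl"))` with a hypothetical file name.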

Python Implementation

Python
import json
import os

import requests

API_URL = "https://api.scavio.dev/api/v1/search"
HEADERS = {"x-api-key": os.environ["SCAVIO_API_KEY"]}

# Assumed baseline, not a measurement: a 5-page raw fetch at ~8,000 tokens per page.
ESTIMATED_RAW_TOKENS = 8000 * 5

def audit_query(query, platform="google"):
    """Compare structured API context size vs an assumed raw page fetch."""
    resp = requests.post(API_URL, headers=HEADERS,
        json={"platform": platform, "query": query}, timeout=10)
    resp.raise_for_status()  # surface HTTP errors instead of auditing an error body
    results = resp.json().get("organic", [])[:5]
    # Context as an agent would see it: title and snippet of the top 5 results.
    structured_context = "".join(
        f"{o.get('title') or ''}: {o.get('snippet') or ''}\n" for o in results)
    structured_tokens = len(structured_context) // 4  # rough heuristic: ~4 chars per token
    return {
        "query": query,
        "structured_tokens": structured_tokens,
        "estimated_raw_tokens": ESTIMATED_RAW_TOKENS,
        "savings_pct": round((1 - structured_tokens / ESTIMATED_RAW_TOKENS) * 100, 1),
        "savings_tokens": ESTIMATED_RAW_TOKENS - structured_tokens
    }

SAMPLE_QUERIES = [
    "best search api for agents 2026",
    "how to reduce llm token cost",
    "search api pricing comparison"
]

if __name__ == "__main__":
    total_savings = 0
    for q in SAMPLE_QUERIES:
        result = audit_query(q)
        total_savings += result["savings_tokens"]
        print(json.dumps(result))
    print(f"\nTotal token savings across sample: {total_savings:,}")
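To turn the sample results into the weekly estimate described in step 4, one option is to extrapolate the average per-query savings to the week's search call volume and apply an input-token price. The sketch below assumes each result is a dict with a `savings_tokens` key, as returned by `audit_query`. The function name, the call volume, and the price per million tokens are hypothetical inputs supplied by the caller, not values from Scavio or any model provider.

```python
from typing import Dict, List


def estimate_weekly_savings(sample_results: List[Dict],
                            weekly_search_calls: int,
                            usd_per_million_input_tokens: float) -> Dict:
    """Extrapolate sampled per-query token savings to a weekly estimate."""
    if not sample_results:
        return {"avg_savings_tokens": 0.0,
                "weekly_savings_tokens": 0,
                "weekly_savings_usd": 0.0}
    # Average savings per audited query in the sample.
    avg = sum(r["savings_tokens"] for r in sample_results) / len(sample_results)
    # Assume every search call in the week saves the sample average.
    weekly_tokens = round(avg * weekly_search_calls)
    weekly_usd = round(weekly_tokens / 1_000_000 * usd_per_million_input_tokens, 2)
    return {"avg_savings_tokens": avg,
            "weekly_savings_tokens": weekly_tokens,
            "weekly_savings_usd": weekly_usd}
```

Because the raw-fetch baseline in the audit is itself an assumed figure, the output is best read as an upper-bound estimate rather than a measured saving.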

JavaScript Implementation

JavaScript
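const H = {"x-api-key": process.env.SCAVIO_API_KEY, "Content-Type": "application/json"};

// Assumed baseline, not a measurement: a 5-page raw fetch at ~8,000 tokens per page.
const ESTIMATED_RAW_TOKENS = 8000 * 5;

async function auditQuery(query, platform = "google") {
  const res = await fetch("https://api.scavio.dev/api/v1/search", {
    method: "POST", headers: H,
    body: JSON.stringify({platform, query})
  });
  // Surface HTTP errors instead of auditing an error body.
  if (!res.ok) throw new Error(`Scavio search failed with HTTP ${res.status}`);
  const r = await res.json();
  let ctx = "";
  for (const o of (r.organic || []).slice(0, 5)) {
    // Fall back to empty strings so missing fields do not add "undefined" to the context.
    ctx += (o.title ?? "") + ": " + (o.snippet ?? "") + "\n";
  }
  // Rough heuristic: ~4 characters per token, floored to match the Python version.
  const structuredTokens = Math.floor(ctx.length / 4);
  return {
    query, structuredTokens, estimatedRawTokens: ESTIMATED_RAW_TOKENS,
    savingsPct: Math.round((1 - structuredTokens / ESTIMATED_RAW_TOKENS) * 1000) / 10,
    savingsTokens: ESTIMATED_RAW_TOKENS - structuredTokens
  };
}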

Platforms Used

Google

Web search with knowledge graph, PAA, and AI overviews

Reddit

Community, posts & threaded comments from any subreddit

YouTube

Video search with transcripts and metadata

Frequently Asked Questions

This workflow runs on a cron schedule: every Monday at 7 AM UTC.

This workflow uses the following Scavio platforms: Google, Reddit, and YouTube. Each platform is called through the same unified API endpoint.

Scavio's free tier includes 500 credits per month with no credit card required, which is enough to test and validate this workflow before scaling it.
