
Agent Token Usage Optimization Flow

Reduce LLM agent token usage by pre-filtering search results with structured JSON extraction. Cut token costs by 60-80%.

Overview

LLM agents consume thousands of tokens per search result when they receive raw HTML or full snippets. This workflow pre-processes search results into structured JSON with only the fields the agent needs, reducing token consumption by 60-80% per search call.
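As a rough illustration of where those savings come from, payload sizes can be compared with the common ~4 characters per token heuristic (the exact ratio depends on the model's tokenizer; the sample payload below is illustrative, not a real API response):

```python
import json

# Rough token estimate: ~4 characters per token (tokenizer-dependent).
def estimate_tokens(text: str) -> int:
    return len(text) // 4

# Mock raw search response: 10 results with bulky HTML snippets and extras.
raw = {
    "organic": [{"title": "t" * 60, "snippet": "s" * 400, "link": "u" * 80,
                 "htmlSnippet": "<b>" * 200, "position": i} for i in range(10)],
    "relatedSearches": ["q" * 40] * 8,
}

# Compressed form: top 3 results, three fields, snippets capped at 100 chars.
compressed = [{"title": o["title"], "snippet": o["snippet"][:100], "url": o["link"]}
              for o in raw["organic"][:3]]

raw_tokens = estimate_tokens(json.dumps(raw))
slim_tokens = estimate_tokens(json.dumps(compressed))
print(f"~{raw_tokens} tokens -> ~{slim_tokens} tokens")
```

With payloads shaped like this, the compressed form lands well past the 60% reduction mark; real responses vary with snippet and metadata size.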

Trigger

On agent search request (middleware)

Schedule

Event-driven: runs on every agent search call rather than on a fixed schedule

Workflow Steps

1

Intercept agent search query

Catch the search request from the agent before it hits the API, extracting the query and the fields the agent actually needs.

2

Execute structured search

Send the query to the search API requesting only essential fields: title, snippet, link. Exclude raw HTML, metadata, and auxiliary data.

3

Compress results

Truncate snippets to 100 characters, limit to top 3 results, and format as a minimal JSON object.

4

Return to agent

Pass the compressed results back to the agent context window, logging the token savings for monitoring.
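The four steps above can be sketched as a middleware wrapper around whatever callable the agent uses for search (the wrapper pattern and the stubbed backend are assumptions for illustration; the field names match the implementations below):

```python
import json

def compress(raw: dict, max_results: int = 3, snippet_len: int = 100) -> list[dict]:
    """Step 3: keep only title/snippet/url, truncate snippets, cap result count."""
    return [{"title": o.get("title", ""),
             "snippet": o.get("snippet", "")[:snippet_len],
             "url": o.get("link", "")}
            for o in raw.get("organic", [])[:max_results]]

def search_middleware(search_fn):
    """Steps 1 and 4: intercept the agent's search tool, compress each
    response, and log the size reduction before returning it."""
    def wrapped(query: str) -> list[dict]:
        raw = search_fn(query)          # step 2: execute the search
        slim = compress(raw)            # step 3: compress the results
        saved = 1 - len(json.dumps(slim)) / len(json.dumps(raw))
        print(f"compressed search payload by {saved:.0%}")
        return slim                     # step 4: return to the agent
    return wrapped

# Demo with a stubbed backend instead of a live API call:
fake_api = lambda q: {"organic": [{"title": f"Result {i}", "snippet": "x" * 300,
                                   "link": f"https://example.com/{i}",
                                   "raw_html": "<div>" * 50} for i in range(10)]}
agent_search = search_middleware(fake_api)
hits = agent_search("n8n search API integration")
```

In production, `search_fn` would be the real API call shown in the implementations below; the middleware itself stays unchanged.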

Python Implementation

Python
import requests, os, json

H = {"x-api-key": os.environ["SCAVIO_API_KEY"], "Content-Type": "application/json"}

def optimized_search(query, max_results=3, snippet_len=100):
    # Step 2: execute the search, guarding against hangs and HTTP errors.
    resp = requests.post(
        "https://api.scavio.dev/api/v1/search", headers=H,
        json={"platform": "google", "query": query}, timeout=30,
    )
    resp.raise_for_status()
    r = resp.json()
    # Step 3: keep three fields per result, truncate snippets, cap the count.
    results = [{
        "title": o.get("title", ""),
        "snippet": o.get("snippet", "")[:snippet_len],
        "url": o.get("link", ""),
    } for o in r.get("organic", [])[:max_results]]
    # Step 4: log the size reduction before returning to the agent.
    compressed = json.dumps(results)
    raw_size = len(json.dumps(r))
    print(f"Token savings: {raw_size} -> {len(compressed)} chars "
          f"({round((1 - len(compressed) / raw_size) * 100)}% reduction)")
    return results

results = optimized_search("n8n search API integration 2026")
for r in results:
    print(f"  {r['title'][:50]}")
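Before the compressed results reach the agent's context window, they can be rendered as a compact plain-text block instead of JSON, which typically saves further tokens by dropping braces, quotes, and repeated keys (a sketch; the exact line format is an assumption):

```python
def to_context_block(results: list[dict]) -> str:
    # One numbered line per result: "1. Title - snippet (url)"
    return "\n".join(
        f"{i}. {r['title']} - {r['snippet']} ({r['url']})"
        for i, r in enumerate(results, 1)
    )

sample = [{"title": "n8n docs", "snippet": "Workflow automation...",
           "url": "https://docs.n8n.io"}]
print(to_context_block(sample))
```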

JavaScript Implementation

JavaScript
const H = {"x-api-key": process.env.SCAVIO_API_KEY, "Content-Type": "application/json"};

async function optimizedSearch(query, maxResults = 3, snippetLen = 100) {
  // Step 2: execute the search, failing fast on HTTP errors.
  const res = await fetch("https://api.scavio.dev/api/v1/search", {
    method: "POST", headers: H,
    body: JSON.stringify({platform: "google", query})
  });
  if (!res.ok) throw new Error(`Search failed: ${res.status}`);
  const r = await res.json();
  // Step 3: keep three fields per result, truncate snippets, cap the count.
  const results = (r.organic || []).slice(0, maxResults).map(o => ({
    title: o.title || "",
    snippet: (o.snippet || "").slice(0, snippetLen),
    url: o.link || "",
  }));
  // Step 4: log the size reduction before returning to the agent.
  const rawSize = JSON.stringify(r).length;
  const compressedSize = JSON.stringify(results).length;
  console.log(`Token savings: ${rawSize} -> ${compressedSize} chars (${Math.round((1 - compressedSize / rawSize) * 100)}% reduction)`);
  return results;
}

optimizedSearch("n8n search API integration 2026").then(r =>
  r.forEach(o => console.log(`  ${o.title.slice(0, 50)}`))
);

Platforms Used

Google

Web search with knowledge graph, PAA, and AI overviews

Frequently Asked Questions

What problem does this workflow solve?

LLM agents consume thousands of tokens per search result when they receive raw HTML or full snippets. This workflow pre-processes search results into structured JSON with only the fields the agent needs, reducing token consumption by 60-80% per search call.

When does this workflow run?

It runs as middleware, triggered on every agent search call rather than on a fixed schedule.

Which platforms does this workflow use?

This workflow uses the Scavio Google platform, called via the same unified API endpoint as every other platform.

Can I test this workflow for free?

Yes. Scavio's free tier includes 250 credits per month with no credit card required. That is enough to test and validate this workflow before scaling it.
