How long does the "Calculate the Real Cost Per Agent Search Query" tutorial take?

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

What do I need before starting?

You need Python 3.9+ with the requests library installed, a Scavio API key from scavio.dev (the free tier gives you 250 credits per month), and a basic understanding of LLM token pricing.

Can I run this tutorial with the free tier?

Yes. The free tier includes 250 credits per month, which is more than enough to complete this tutorial and prototype a working solution.

What frameworks does this work with?

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt it to your framework of choice.

True Cost Per AI Agent Search Query (2026)

The sticker price of a search API call ($5 per 1,000 calls, or $0.005 each, for Scavio and Brave; $5 to $12 per 1,000 for Perplexity Sonar) hides the real cost. Agents retry failed queries and expand searches when results are poor, and the LLM itself consumes tokens processing the search context. This tutorial builds a cost tracker that captures every API call an agent makes and computes the true per-query cost, including retries, fallbacks, and token overhead.
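
As a rough sketch of the arithmetic behind that claim, the true per-query cost can be modeled as total spend (search calls plus LLM tokens) divided by the number of primary queries only. The function name and the session figures below are illustrative assumptions, not measured values, and the sketch assumes fallback calls cost the same as primary calls.

```python
def true_cost_per_query(primary: int, retries: int, fallbacks: int,
                        price_per_call: float, llm_cost_per_call: float) -> float:
    """Total spend across all calls, divided by primary queries only."""
    if primary <= 0:
        return 0.0
    total_calls = primary + retries + fallbacks
    total_spend = total_calls * (price_per_call + llm_cost_per_call)
    return total_spend / primary

# Hypothetical session: 100 queries, 5 retries, 2 fallbacks at $0.005 per call,
# with an assumed $0.003 of LLM tokens spent on each call's results.
cost = true_cost_per_query(100, 5, 2, 0.005, 0.003)
print(f'True cost: ${cost:.5f} per query (sticker price: $0.00500)')
```

Under these assumptions, 107 calls at $0.008 each spread over 100 queries comes to about $0.0086 per query, roughly 70% above the sticker price.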

Prerequisites

Python 3.9+ installed
requests library installed
A Scavio API key from scavio.dev
Basic understanding of LLM token pricing

Walkthrough

Step 1: Build the cost tracking wrapper

Wrap your search function to log every call, its cost, and whether it was a retry or fallback.

Python

import time
from dataclasses import dataclass, field
from typing import List

@dataclass
class SearchCall:
    provider: str
    query: str
    cost: float
    latency_ms: int
    result_count: int
    call_type: str  # 'primary', 'retry', 'fallback'
    timestamp: float = field(default_factory=time.time)

class CostTracker:
    def __init__(self):
        self.calls: List[SearchCall] = []

    def log(self, call: SearchCall):
        self.calls.append(call)

    @property
    def total_cost(self) -> float:
        return sum(c.cost for c in self.calls)

    @property
    def total_queries(self) -> int:
        return len([c for c in self.calls if c.call_type == 'primary'])

    @property
    def cost_per_query(self) -> float:
        primary = self.total_queries
        return self.total_cost / primary if primary > 0 else 0

    def summary(self) -> str:
        retries = len([c for c in self.calls if c.call_type == 'retry'])
        fallbacks = len([c for c in self.calls if c.call_type == 'fallback'])
        return (f'Queries: {self.total_queries} | Total calls: {len(self.calls)} | '
                f'Retries: {retries} | Fallbacks: {fallbacks} | '
                f'Total: ${self.total_cost:.4f} | Per query: ${self.cost_per_query:.4f}')

tracker = CostTracker()
print('Cost tracker initialized')
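
Once calls are being logged, it can help to see where the money goes by provider. The helper below is a minimal sketch, not part of the original tracker: cost_by_provider is a hypothetical name, and it accepts any objects that expose provider and cost attributes (as SearchCall does). The sample data is made up so the block runs on its own.

```python
from collections import defaultdict
from types import SimpleNamespace

def cost_by_provider(calls) -> dict:
    """Sum logged cost per provider for objects with .provider and .cost."""
    totals = defaultdict(float)
    for call in calls:
        totals[call.provider] += call.cost
    return dict(totals)

# Stand-in records with the same two fields the tracker's SearchCall has.
sample_calls = [
    SimpleNamespace(provider='scavio', cost=0.005),
    SimpleNamespace(provider='scavio', cost=0.005),
    SimpleNamespace(provider='tavily', cost=0.03),
]
breakdown = cost_by_provider(sample_calls)
for provider, total in breakdown.items():
    print(f'{provider}: ${total:.4f}')
```

With the tutorial's tracker in scope, the same function could be called as cost_by_provider(tracker.calls).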

Step 2: Instrument your search function

Add cost tracking to your search calls. Each search logs its provider, cost, and type (primary, retry, or fallback).

Python

import requests, os

SCAVIO_KEY = os.environ['SCAVIO_API_KEY']

PROVIDER_COSTS = {
    'scavio': 0.005,
    'brave': 0.005,
    'tavily': 0.03,     # $30/1K on Researcher plan
    'perplexity': 0.005, # Sonar basic
    'google_cse': 0.005, # $5/1K paid tier
}

def tracked_search(query: str, provider: str = 'scavio',
                   call_type: str = 'primary') -> list:
    start = time.time()
    if provider == 'scavio':
        resp = requests.post('https://api.scavio.dev/api/v1/search',
            headers={'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json'},
            json={'query': query, 'country_code': 'us', 'num_results': 10})
        results = resp.json().get('organic_results', [])
    else:
        results = []  # placeholder for other providers
    latency = int((time.time() - start) * 1000)
    cost = PROVIDER_COSTS.get(provider, 0.005)
    tracker.log(SearchCall(
        provider=provider, query=query, cost=cost,
        latency_ms=latency, result_count=len(results), call_type=call_type
    ))
    return results

# Simulate a typical agent session
tracked_search('best CRM tools 2026')  # primary
tracked_search('CRM pricing comparison 2026')  # primary
tracked_search('CRM pricing comparison 2026', call_type='retry')  # retry
print(tracker.summary())
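
The session above labels its retry by hand. One way an agent might assign call types automatically is sketched below: try the primary provider, retry on empty results, then fall back to a second provider. The function search_with_fallback, its parameters, and the stub fake_search are all hypothetical; the stub stands in for a real search call so the example runs without network access.

```python
from typing import Callable, List, Tuple

def search_with_fallback(query: str,
                         search_fn: Callable[[str, str, str], list],
                         primary: str = 'scavio',
                         fallback: str = 'brave',
                         max_retries: int = 1) -> Tuple[list, List[Tuple[str, str]]]:
    """Return (results, attempts), where attempts lists (provider, call_type)."""
    attempts = []
    call_type = 'primary'
    for _ in range(1 + max_retries):
        results = search_fn(query, primary, call_type)
        attempts.append((primary, call_type))
        if results:
            return results, attempts
        call_type = 'retry'  # every further attempt on the primary is a retry
    results = search_fn(query, fallback, 'fallback')
    attempts.append((fallback, 'fallback'))
    return results, attempts

# Stub for illustration: the primary provider always comes back empty.
def fake_search(query: str, provider: str, call_type: str) -> list:
    return [] if provider == 'scavio' else [{'title': 'stub result'}]

results, attempts = search_with_fallback('CRM pricing comparison 2026', fake_search)
print(attempts)
```

Because search_fn takes (query, provider, call_type), the tutorial's tracked_search could be passed in directly, although its non-Scavio branch is still a placeholder that returns an empty list.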

Step 3: Add LLM token cost to the calculation

Search results are fed to an LLM as context. Calculate the token cost of processing search results to get the true total cost.

Python

def estimate_token_cost(results: list, model: str = 'claude-sonnet') -> float:
    """Estimate LLM token cost for processing search results."""
    TOKEN_COSTS = {
        'claude-sonnet': {'input': 3.0 / 1_000_000, 'output': 15.0 / 1_000_000},
        'claude-haiku': {'input': 0.25 / 1_000_000, 'output': 1.25 / 1_000_000},
        'gpt-4o': {'input': 2.5 / 1_000_000, 'output': 10.0 / 1_000_000},
        'gpt-4o-mini': {'input': 0.15 / 1_000_000, 'output': 0.6 / 1_000_000},
    }
    costs = TOKEN_COSTS.get(model, TOKEN_COSTS['claude-sonnet'])
    # Estimate tokens: ~1.3 tokens per word
    total_words = sum(len(r.get('snippet', '').split()) + len(r.get('title', '').split())
                      for r in results)
    input_tokens = int(total_words * 1.3)
    output_tokens = 200  # typical agent response
    input_cost = input_tokens * costs['input']
    output_cost = output_tokens * costs['output']
    return round(input_cost + output_cost, 6)

# Example
results = tracked_search('python web framework comparison 2026')
llm_cost = estimate_token_cost(results)
search_cost = PROVIDER_COSTS['scavio']  # same per-call price tracked_search logs
total = search_cost + llm_cost
print(f'Search cost: ${search_cost:.4f}')
print(f'LLM cost:    ${llm_cost:.6f}')
print(f'Total cost:  ${total:.4f}')
print(f'Search is {search_cost/total*100:.0f}% of total per-query cost')
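
To make the token estimate concrete, here is the same arithmetic worked through by hand for an assumed result set. The workload (10 results, about 8 title words and 30 snippet words each) is an assumption chosen for illustration; the rates are the claude-sonnet figures from TOKEN_COSTS above.

```python
# Assumed workload: 10 results, ~8 title words and ~30 snippet words each.
result_count = 10
words_per_result = 8 + 30
tokens_per_word = 1.3             # same heuristic as estimate_token_cost
input_rate = 3.0 / 1_000_000      # claude-sonnet input rate from TOKEN_COSTS
output_rate = 15.0 / 1_000_000    # claude-sonnet output rate from TOKEN_COSTS
output_tokens = 200               # same fixed response size as the tutorial

total_words = result_count * words_per_result
input_tokens = int(total_words * tokens_per_word)
input_cost = input_tokens * input_rate
output_cost = output_tokens * output_rate
llm_cost = input_cost + output_cost

print(f'Input tokens: {input_tokens}')
print(f'Input cost:   ${input_cost:.6f}')
print(f'Output cost:  ${output_cost:.6f}')
print(f'LLM cost:     ${llm_cost:.6f}')
```

In this scenario the fixed 200-token response accounts for about two thirds of the LLM cost, so the estimate is more sensitive to the assumed output length than to snippet size.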

Step 4: Generate a cost comparison report

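Compare the true per-query cost across providers, factoring in an assumed retry rate and free monthly allowance for each. The retry rates below are hard-coded estimates; replace them with the rates your own tracker measures.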

Python

def cost_report():
    providers = {
        'scavio': {'price': 0.005, 'retry_rate': 0.02, 'free_monthly': 250},
        'brave': {'price': 0.005, 'retry_rate': 0.05, 'free_monthly': 1000},
        'tavily': {'price': 0.03, 'retry_rate': 0.03, 'free_monthly': 1000},
        'perplexity_sonar': {'price': 0.005, 'retry_rate': 0.04, 'free_monthly': 0},
        'google_cse': {'price': 0.005, 'retry_rate': 0.01, 'free_monthly': 0},
    }
    monthly_queries = 5000
    print(f'Cost comparison for {monthly_queries:,} queries/month:\n')
    print(f'{"Provider":<20} {"Price/q":>8} {"Retries":>8} {"True/q":>8} {"Monthly":>10} {"After free":>10}')
    print('-' * 70)
    for name, p in providers.items():
        true_cost = p['price'] * (1 + p['retry_rate'])
        billable = max(0, monthly_queries - p['free_monthly'])
        monthly = billable * true_cost
        print(f'{name:<20} ${p["price"]:.3f}   {p["retry_rate"]*100:5.1f}%  '
              f'${true_cost:.4f}  ${monthly_queries * true_cost:>8.2f}  ${monthly:>8.2f}')

cost_report()
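
To check a single row of the report by hand, the per-row calculation can be pulled out into a pure function. This is a sketch that mirrors the formula in cost_report; the name monthly_cost is hypothetical, and like the report it applies the free allowance to queries rather than to individual calls, which is a simplification if retries also consume free credits.

```python
def monthly_cost(price: float, retry_rate: float, free_monthly: int,
                 monthly_queries: int) -> float:
    """Monthly spend after the free allowance, with retries priced in."""
    true_cost = price * (1 + retry_rate)
    billable = max(0, monthly_queries - free_monthly)
    return billable * true_cost

# Brave row from the report: $0.005/query, 5% retries, 1,000 free, 5,000 queries.
brave = monthly_cost(0.005, 0.05, 1000, 5000)
print(f'brave after free tier: ${brave:.2f}')

# Usage that stays inside the free allowance costs nothing.
small = monthly_cost(0.005, 0.02, 250, 200)
print(f'200 queries on a 250-query free tier: ${small:.2f}')
```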

Python Example

Python

import os
import requests

SCAVIO_KEY = os.environ['SCAVIO_API_KEY']
calls = []  # one dict per API call: cost, call type, result count

def search(query, call_type='primary'):
    """Run one Scavio search and record its cost and call type."""
    resp = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json'},
        json={'query': query, 'country_code': 'us', 'num_results': 10},
        timeout=10)  # avoid hanging the agent on a stalled request
    results = resp.json().get('organic_results', [])
    calls.append({'cost': 0.005, 'type': call_type, 'results': len(results)})
    return results

search('CRM tools 2026')
search('CRM pricing 2026')
search('CRM pricing 2026', 'retry')
total = sum(c['cost'] for c in calls)
primary = len([c for c in calls if c['type'] == 'primary'])
print(f'Total: ${total:.3f}, Per query: ${total/primary:.4f}')

JavaScript Example

JavaScript

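// Run as an ES module (for example a .mjs file) on Node 18+ so that
// top-level await and the global fetch are available.
const SCAVIO_KEY = process.env.SCAVIO_API_KEY;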
const calls = [];

async function search(query, callType = 'primary') {
  const resp = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST',
    headers: { 'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json' },
    body: JSON.stringify({ query, country_code: 'us', num_results: 10 })
  });
  const results = (await resp.json()).organic_results || [];
  calls.push({ cost: 0.005, type: callType, results: results.length });
  return results;
}

await search('CRM tools 2026');
await search('CRM pricing 2026');
await search('CRM pricing 2026', 'retry');
const total = calls.reduce((s, c) => s + c.cost, 0);
const primary = calls.filter(c => c.type === 'primary').length;
console.log(`Total: $${total.toFixed(3)}, Per query: $${(total/primary).toFixed(4)}`);

Expected Output

Text

Cost tracker initialized
Queries: 2 | Total calls: 3 | Retries: 1 | Fallbacks: 0 | Total: $0.0150 | Per query: $0.0075

Search cost: $0.0050
LLM cost:    $0.003200
Total cost:  $0.0082
Search is 61% of total per-query cost

Cost comparison for 5,000 queries/month:

Provider             Price/q  Retries  True/q    Monthly  After free
----------------------------------------------------------------------
scavio               $0.005     2.0%  $0.0051    $25.50      $24.23
brave                $0.005     5.0%  $0.0053    $26.25      $21.00
tavily               $0.030     3.0%  $0.0309   $154.50     $123.60
perplexity_sonar     $0.005     4.0%  $0.0052    $26.00      $26.00
google_cse           $0.005     1.0%  $0.0051    $25.25      $25.25
