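Agents that fetch raw web pages burn thousands of tokens parsing HTML to find a handful of facts. A structured search API returns only the relevant data (title, snippet, URL) in roughly 150-300 tokens per query. This tutorial shows how to replace page fetching with structured search to cut token cost by 80-90%.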
Prerequisites
- An existing agent that fetches web pages for information
- Python 3.8+ or Node.js 18+
- A Scavio API key
How-to guide
Step 1: Measure current token usage
Work out how many tokens your current web-fetch approach uses per search.
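To measure this on a real page rather than on assumed averages, a minimal sketch along the following lines may help. It strips HTML with the standard library and estimates tokens at roughly four characters per token, which is only a rule of thumb for English text. The helper names `strip_html`, `estimate_tokens`, and `waste_ratio` are illustrative and not part of any library.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def strip_html(html: str) -> str:
    """Return the visible text of an HTML document as one string."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def estimate_tokens(text: str) -> int:
    # Assumption: ~4 characters per token. This is a rough heuristic,
    # not a real tokenizer; swap in your model's tokenizer for exact counts.
    if not text:
        return 0
    return max(1, len(text) // 4)


def waste_ratio(page_text: str, useful_text: str) -> float:
    """Fraction of the page's estimated tokens that the LLM does not need."""
    page = estimate_tokens(page_text)
    useful = estimate_tokens(useful_text)
    return (page - useful) / page if page else 0.0
```

Running `waste_ratio(strip_html(page_html), facts_you_needed)` over a handful of pages your agent actually fetches gives a baseline to compare against the typical figures used in this step.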
Python
import tiktoken
# Typical raw web page fetch (e.g., via requests + BeautifulSoup or Fetch MCP):
raw_page_tokens = 5000 # Average web page after HTML stripping
useful_tokens = 200 # What the LLM actually needs from that page
waste_ratio = (raw_page_tokens - useful_tokens) / raw_page_tokens
print(f'Current waste: {waste_ratio:.0%} of tokens are unused context')
# Output: Current waste: 96% of tokens are unused context
# With structured search API:
structured_tokens = 250 # Average Scavio response (5 results)
print(f'Structured approach: {structured_tokens} tokens per search')
print(f'Savings: {(raw_page_tokens - structured_tokens) / raw_page_tokens:.0%}')
# Output: Savings: 95%
Step 2: Replace fetch-and-parse with search
Replace web page fetches with calls to a structured search API.
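It can help to keep the formatting step separate from the network call so it can be checked offline. The sketch below assumes the response is a JSON object with an `organic` list whose items carry `title` and `snippet` fields, as the code in this tutorial uses them. The `format_results` helper and the sample payload are made up for illustration.

```python
def format_results(payload: dict, max_results: int = 5) -> str:
    """Turn a search response into compact 'title: snippet' lines."""
    lines = []
    for item in payload.get("organic", [])[:max_results]:
        title = (item.get("title") or "").strip()
        snippet = (item.get("snippet") or "").strip()
        if not title and not snippet:
            continue  # Nothing useful to show the LLM
        lines.append(f"{title}: {snippet}" if snippet else title)
    return "\n".join(lines)


# Hypothetical payload, shaped like the fields this tutorial reads.
sample = {
    "organic": [
        {"title": "A", "snippet": "first"},
        {"title": "B"},
        {"title": "", "snippet": ""},
    ]
}
formatted = format_results(sample)
```

Because the function takes a plain dict, the same code works whether the payload comes from a live request or from a saved fixture in a test.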
Python
import requests, os
H = {'x-api-key': os.environ['SCAVIO_API_KEY'], 'Content-Type': 'application/json'}
# BEFORE: Fetch full page and extract info (5000+ tokens)
# page = requests.get(url).text
# soup = BeautifulSoup(page, 'html.parser')
# text = soup.get_text()[:3000] # Still 1000+ tokens
# AFTER: Get structured results (250 tokens)
def efficient_search(query: str) -> str:
    resp = requests.post('https://api.scavio.dev/api/v1/search', headers=H,
                         json={'platform': 'google', 'query': query}, timeout=10)
    resp.raise_for_status()  # Surface HTTP errors instead of parsing an error body
    results = resp.json().get('organic', [])[:5]
    # Return only what the LLM needs
    return '\n'.join(f"{r.get('title','')}: {r.get('snippet','')}" for r in results)
Step 3: Set a token budget per tool call
Configure your agent to enforce a token limit on search results.
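The budgeting logic itself does not depend on any particular tokenizer. As a sketch, it can be written against an injectable counting function, which also makes it testable without network access. The names `fit_to_budget` and `rough_count` are hypothetical, and the word-count default is only a stand-in for a real tokenizer.

```python
from typing import Callable, Iterable


def rough_count(text: str) -> int:
    # Assumption: whitespace-separated words stand in for tokens.
    # Real token counts are usually higher; use a real tokenizer in production.
    return len(text.split())


def fit_to_budget(
    lines: Iterable[str],
    max_tokens: int,
    count_tokens: Callable[[str], int] = rough_count,
) -> str:
    """Keep whole lines, in order, until the next one would exceed the budget."""
    kept = []
    used = 0
    for line in lines:
        cost = count_tokens(line)
        if used + cost > max_tokens:
            break
        kept.append(line)
        used += cost
    return "\n".join(kept)
```

With tiktoken installed, passing `count_tokens=lambda s: len(enc.encode(s))` for an encoder `enc` gives counts for OpenAI models; counts for other model families will differ somewhat.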
Python
import tiktoken  # Repeated from Step 1; requests and H are reused from Step 2
MAX_SEARCH_TOKENS = 500  # Hard limit per search tool call
def budget_search(query: str, max_tokens: int = MAX_SEARCH_TOKENS) -> str:
    resp = requests.post('https://api.scavio.dev/api/v1/search', headers=H,
                         json={'platform': 'google', 'query': query}, timeout=10)
    results = resp.json().get('organic', [])[:5]
    output_lines = []
    token_count = 0
    # gpt-4 encoding; counts are approximate for other model families
    enc = tiktoken.encoding_for_model('gpt-4')
    for r in results:
        line = f"{r.get('title','')}: {r.get('snippet','')}"
        line_tokens = len(enc.encode(line))
        # Stop before the first result that would exceed the budget
        if token_count + line_tokens > max_tokens:
            break
        output_lines.append(line)
        token_count += line_tokens
    return '\n'.join(output_lines)
Step 4: Calculate the cost savings
Compare the token cost of the two approaches.
Python
# Cost comparison (Claude Sonnet 4.6 pricing):
input_cost_per_1m = 3.0 # $3/M input tokens
# Old approach: 5000 tokens/search * 100 searches/day = 500K tokens/day
old_daily_cost = (500_000 / 1_000_000) * input_cost_per_1m
print(f'Old approach: ${old_daily_cost:.2f}/day ({500_000} tokens)')
# New approach: 250 tokens/search * 100 searches/day = 25K tokens/day
new_daily_cost = (25_000 / 1_000_000) * input_cost_per_1m
print(f'New approach: ${new_daily_cost:.4f}/day ({25_000} tokens)')
# Plus Scavio API cost: 100 searches * $0.005 = $0.50/day
scavio_cost = 100 * 0.005
print(f'Scavio API cost: ${scavio_cost:.2f}/day')
print(f'Total new: ${new_daily_cost + scavio_cost:.2f}/day')
print(f'Savings: ${old_daily_cost - new_daily_cost - scavio_cost:.2f}/day')
Python example
Python
import requests, os
H = {'x-api-key': os.environ['SCAVIO_API_KEY'], 'Content-Type': 'application/json'}
def efficient_search(query, max_results=3):
    r = requests.post('https://api.scavio.dev/api/v1/search', headers=H,
                      json={'platform': 'google', 'query': query}, timeout=10).json()
    # Keep only the title and snippet of the top results
    return '\n'.join(f"{x.get('title','')}: {x.get('snippet','')}"
                     for x in r.get('organic', [])[:max_results])
JavaScript example
JavaScript
async function efficientSearch(query, maxResults = 3) {
  const r = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST',
    headers: {'x-api-key': process.env.SCAVIO_API_KEY, 'Content-Type': 'application/json'},
    body: JSON.stringify({platform: 'google', query})
  });
  // Fall back to an empty list so the function always returns a string
  const organic = (await r.json()).organic ?? [];
  return organic.slice(0, maxResults)
    .map(x => `${x.title}: ${x.snippet ?? ''}`).join('\n');
}
Expected output
An agent that uses 80-95% fewer tokens per search by getting structured results instead of fetching full web pages.
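Token reduction and dollar savings are not the same number, because the search API adds a per-call fee. The sketch below generalizes the Step 4 arithmetic and computes a break-even point, using the prices from Step 4 ($3 per million input tokens, $0.005 per search) as inputs. The function names are illustrative.

```python
def daily_cost(tokens_per_search: float, searches_per_day: int,
               price_per_million: float, fee_per_search: float = 0.0) -> float:
    """LLM input cost plus any per-search API fee, in dollars per day."""
    token_cost = tokens_per_search * searches_per_day / 1_000_000 * price_per_million
    return token_cost + fee_per_search * searches_per_day


def break_even_tokens_saved(price_per_million: float, fee_per_search: float) -> float:
    """Tokens that must be saved per search for the API fee to pay for itself."""
    return fee_per_search / (price_per_million / 1_000_000)


# Figures taken from Step 4 of this tutorial.
old = daily_cost(5000, 100, 3.0)
new = daily_cost(250, 100, 3.0, fee_per_search=0.005)
dollar_savings = (old - new) / old
threshold = break_even_tokens_saved(3.0, 0.005)
```

With these inputs the token count drops by 95%, while the daily bill drops by about 62% ($1.50 to roughly $0.58). The fee is covered only when each search saves more than about 1,667 input tokens, so the switch pays off for heavy pages and less so for pages that were already small.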