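Migrate 60K monthly Google CSE queries to a SERP API by exporting existing query logs, mapping CSE parameters to API fields, running parallel validation tests, switching production traffic, and decommissioning the CSE project. Google CSE caps its paid tier at 10K queries per day and returns inconsistent JSON structures across schema versions. A dedicated search API removes these limits and returns normalized JSON for all query types. At Scavio's $0.005 per credit (one credit per query), 60K queries per month cost $300. That is the same as CSE at $5 per 1K queries ($300), but with structured output, multi-platform support, and no daily cap.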
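Prerequisites
- Python 3.8+ installed
- The requests library installed
- A Scavio API key from scavio.dev
- Access to your Google CSE query logs or analytics
Walkthrough
Step 1: Export existing CSE queries
Pull query logs from CSE analytics or your application database. Group queries by frequency to prioritize migration testing.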
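If the export is a raw request log with one row per call rather than pre-aggregated counts, the grouping step could look like the sketch below. The `query` column name and the sample rows are assumptions for illustration, not a format that CSE analytics is known to produce.

```python
import csv
import io
from collections import Counter

def aggregate_queries(rows):
    """Count how often each normalized query string appears."""
    counts = Counter()
    for row in rows:
        # Normalize case and whitespace so trivial variants share one bucket.
        query = " ".join(row["query"].lower().split())
        if query:
            counts[query] += 1
    # most_common() yields (query, count) pairs, highest count first.
    return [{"query": q, "frequency": n} for q, n in counts.most_common()]

# Hypothetical raw log: one row per search request.
raw_log = io.StringIO(
    "query\n"
    "best crm 2026\n"
    "Best CRM 2026\n"
    "python async tutorial\n"
)
ranked = aggregate_queries(csv.DictReader(raw_log))
print(ranked[0])  # {'query': 'best crm 2026', 'frequency': 2}
```

The output has the same `query` and `frequency` keys as the loader in this step, so either source can feed the validation run.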

import csv

# Export from your app DB or CSE analytics CSV
def load_cse_queries(csv_path: str) -> list:
    queries = []
    with open(csv_path, newline='', encoding='utf-8') as f:
        reader = csv.DictReader(f)
        for row in reader:
            queries.append({
                'query': row['query'],
                # Fall back to 1 when the count column is missing or empty
                'frequency': int(row.get('count') or 1),
                'cse_params': {'cx': row.get('cx', ''), 'num': row.get('num', '10')},
            })
    # Most frequent queries first, so they are validated first
    queries.sort(key=lambda q: q['frequency'], reverse=True)
    print(f'Loaded {len(queries)} unique queries ({sum(q["frequency"] for q in queries)} total calls)')
    return queries

# Example with mock data:
queries = [{'query': 'best crm 2026', 'frequency': 500},
           {'query': 'python async tutorial', 'frequency': 200}]
print(f'Top query: {queries[0]["query"]} ({queries[0]["frequency"]} calls/mo)')

Step 2: Map CSE fields to API parameters
Map Google CSE parameters such as cx, num, start, and lr to their Scavio API equivalents. Most CSE parameters either map directly or become unnecessary.
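One way to keep the mapping auditable is a pure function that returns both the new payload and the list of CSE parameters it did not forward, so nothing is dropped silently. This is a sketch under assumptions: the `language` field name is hypothetical (the guide only says country/language parameters exist), and pagination via `start` is deliberately left unmapped because its equivalent is not documented here.

```python
def map_cse_params(query, cse_params):
    """Translate CSE-style parameters into a Scavio-style payload (sketch)."""
    payload = {"platform": "google", "query": query}
    dropped = []
    for name, value in cse_params.items():
        if name == "lr" and value.startswith("lang_"):
            # CSE encodes a language restriction as 'lang_xx'.
            # ASSUMPTION: the target API accepts a 'language' field.
            payload["language"] = value[len("lang_"):]
        else:
            # cx, num, start and anything unrecognized are recorded, not sent.
            dropped.append(name)
    return payload, sorted(dropped)

payload, dropped = map_cse_params(
    "best crm 2026", {"cx": "abc123", "num": "10", "lr": "lang_de"}
)
print(payload)  # {'platform': 'google', 'query': 'best crm 2026', 'language': 'de'}
print(dropped)  # ['cx', 'num']
```

Reviewing the `dropped` list across the exported queries shows which CSE parameters the application actually relies on before any traffic moves.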

import os
import requests

API_KEY = os.environ['SCAVIO_API_KEY']

def map_cse_to_scavio(cse_query: dict) -> dict:
    """Map CSE parameters to Scavio API format."""
    return {
        'platform': 'google',
        'query': cse_query['query'],
        # CSE 'num' -> not needed, full results are returned by default
        # CSE 'cx' -> not needed, full Google index
        # CSE 'lr' -> use country/language params if needed
    }

def test_single(query: dict) -> dict:
    payload = map_cse_to_scavio(query)
    resp = requests.post('https://api.scavio.dev/api/v1/search',
                         headers={'x-api-key': API_KEY},
                         json=payload, timeout=15)
    resp.raise_for_status()
    results = resp.json().get('organic_results', [])
    return {'query': query['query'], 'results': len(results), 'status': 'ok'}

print(test_single({'query': 'best crm 2026'}))

Step 3: Run parallel validation tests
Test the top 100 queries in parallel to verify result quality and measure latency before switching production.
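Since latency is part of what this step measures, a per-query timing wrapper can help. The sketch below records the duration of each call, keeps the batch running when a single query raises, and reports nearest-rank p50 and p95. `fake_search` is a stand-in so the example runs offline; in practice it would be replaced by the real API call.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor

def timed_call(search_fn, query):
    """Run one search and return (query, result_count, seconds, error)."""
    start = time.monotonic()
    try:
        count, error = len(search_fn(query)), None
    except Exception as exc:  # keep the batch running if one query fails
        count, error = 0, repr(exc)
    return query, count, time.monotonic() - start, error

def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(len(sorted_values) * pct / 100))
    return sorted_values[rank - 1]

def summarize(search_fn, queries, max_workers=5):
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        rows = list(pool.map(lambda q: timed_call(search_fn, q), queries))
    latencies = sorted(r[2] for r in rows)
    return {
        "total": len(rows),
        "empty": [r[0] for r in rows if r[3] is None and r[1] == 0],
        "errors": [r[0] for r in rows if r[3] is not None],
        "p50_s": percentile(latencies, 50),
        "p95_s": percentile(latencies, 95),
    }

# Stand-in for the real API call so the sketch runs without network access.
def fake_search(query):
    if query == "boom":
        raise RuntimeError("simulated failure")
    return [] if query == "zzz" else [{"title": query}]

report = summarize(fake_search, ["best crm 2026", "zzz", "boom"])
print(report["empty"], report["errors"])  # ['zzz'] ['boom']
```

Separating empty results from errors matters here: an empty result set is a quality question, while an exception usually points to a request or quota problem.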

from concurrent.futures import ThreadPoolExecutor
import time

def validate_batch(queries: list, max_workers: int = 5) -> dict:
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Only the first 100 queries (the most frequent, if sorted) are tested
        results = list(pool.map(test_single, queries[:100]))
    elapsed = time.monotonic() - start
    passed = sum(1 for r in results if r['results'] > 0)
    failed = [r for r in results if r['results'] == 0]
    print(f'Validated {len(results)} queries in {elapsed:.1f}s')
    print(f'Passed: {passed}, Empty: {len(failed)}')
    if failed:
        print(f'Empty queries: {[f["query"] for f in failed[:5]]}')
    return {'passed': passed, 'failed': len(failed), 'elapsed': round(elapsed, 1)}

validate_batch([{'query': 'best crm 2026'}, {'query': 'python async'}])

Step 4: Switch production traffic
Replace the CSE HTTP calls in production code with the Scavio endpoint. Use a feature flag to roll out gradually.
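A boolean flag switches all traffic at once. For a gradual rollout, one common approach is to hash a stable key into 100 buckets and send only the first N buckets to the new backend. The sketch below illustrates that technique; the helper names and the choice of the query string as the key are assumptions, not part of either API.

```python
import hashlib

def in_rollout(key, percent):
    """Return True if `key` falls inside the first `percent` of 100 buckets."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # The first 4 bytes of the digest give a stable integer for this key.
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent

def choose_backend(key, percent):
    """Pick 'scavio' for keys inside the rollout, 'cse' otherwise."""
    return "scavio" if in_rollout(key, percent) else "cse"

print(choose_backend("best crm 2026", 0))    # cse
print(choose_backend("best crm 2026", 100))  # scavio
```

Because the bucket depends only on the key, a key that is inside the rollout at 10% stays inside at 50%, so raising the percentage never flips traffic back to the old backend.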

# Production switch with feature flag
def search(query: str, use_scavio: bool = True) -> list:
    if use_scavio:
        resp = requests.post('https://api.scavio.dev/api/v1/search',
                             headers={'x-api-key': API_KEY},
                             json={'platform': 'google', 'query': query}, timeout=10)
        resp.raise_for_status()  # surface HTTP errors instead of returning []
        return resp.json().get('organic_results', [])
    else:
        # Legacy CSE call (keep as fallback during migration)
        resp = requests.get('https://www.googleapis.com/customsearch/v1',
                            params={'key': os.environ.get('CSE_KEY', ''),
                                    'cx': os.environ.get('CSE_CX', ''), 'q': query},
                            timeout=10)
        resp.raise_for_status()
        return resp.json().get('items', [])

results = search('best crm 2026', use_scavio=True)
print(f'{len(results)} results from Scavio')

Step 5: Decommission the CSE project
After Scavio traffic has been stable for 7 days, disable CSE billing, delete the CSE credentials, and archive the migration logs.
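The 7-day condition can be checked mechanically before anything is deleted. The sketch below assumes the application logs can be reduced to a mapping from calendar day to the number of CSE calls on that day. That mapping and the function name are hypothetical.

```python
from datetime import date, timedelta

def ready_to_decommission(cse_calls_by_day, today, quiet_days=7):
    """True if each of the last `quiet_days` full days shows zero CSE calls.

    A day missing from the mapping counts as unknown, not as zero.
    """
    for offset in range(1, quiet_days + 1):
        day = today - timedelta(days=offset)
        if cse_calls_by_day.get(day) != 0:
            return False
    return True

today = date(2026, 1, 15)
quiet = {today - timedelta(days=n): 0 for n in range(1, 8)}
print(ready_to_decommission(quiet, today))  # True

noisy = dict(quiet)
noisy[today - timedelta(days=3)] = 12  # a stray fallback call
print(ready_to_decommission(noisy, today))  # False
```

Treating missing days as unknown is a conservative choice: a gap in logging should delay decommissioning rather than pass as a quiet day.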

# Post-migration checklist:
# 1. Verify 7 days of zero CSE calls in your logs
# 2. Disable CSE API key in Google Cloud Console
# 3. Remove CSE environment variables
# 4. Archive migration validation results
def migration_report(validation: dict, daily_queries: int = 2000) -> str:
    monthly = daily_queries * 30      # 2000/day -> 60K/month
    cse_cost = monthly * 0.005        # $5 per 1K queries
    scavio_cost = monthly * 0.005     # $0.005 per credit, one credit per query
    report = f'Migration complete: {monthly} queries/month\n'
    report += f'CSE cost was: ${cse_cost:.0f}/mo\n'
    report += f'Scavio cost: ${scavio_cost:.0f}/mo\n'
    report += f'Validation: {validation["passed"]} passed, {validation["failed"]} empty\n'
    report += 'CSE project can be decommissioned.'
    return report

print(migration_report({'passed': 98, 'failed': 2}))

Python example

import os
import requests

H = {'x-api-key': os.environ['SCAVIO_API_KEY']}

def search(query):
    resp = requests.post('https://api.scavio.dev/api/v1/search', headers=H,
                         json={'platform': 'google', 'query': query}, timeout=10)
    resp.raise_for_status()
    return resp.json().get('organic_results', [])

# Replaces: requests.get('https://www.googleapis.com/customsearch/v1', params={...})
results = search('best crm 2026')
print(f'{len(results)} results')

JavaScript example

const H = {'x-api-key': process.env.SCAVIO_API_KEY, 'Content-Type': 'application/json'};

async function search(query) {
  const r = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST', headers: H, body: JSON.stringify({platform: 'google', query})
  });
  return (await r.json()).organic_results || [];
}

// Replaces: fetch(`https://www.googleapis.com/customsearch/v1?key=...&cx=...&q=${query}`)
search('best crm 2026').then(r => console.log(r.length + ' results'));

Expected output
A fully migrated search pipeline serving 60K+ monthly queries through Scavio instead of Google CSE, with validated result quality and a decommission checklist.