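The free tier of Google Custom Search Engine (CSE) limits you to 50 specific domains. If your app needs to search the whole web, you either pay Google $5 per 1,000 queries or switch providers. This tutorial migrates a 50-domain Google CSE setup to Scavio, which searches the whole web at $0.005 per query (the same $5 per 1,000 queries) with no domain cap. The migration keeps your existing result format, so downstream code needs no changes.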
Prerequisites
- Python 3.9+ or Node.js 18+
- Your current Google CSE configuration (CX ID, domain list)
- A Scavio API key from scavio.dev
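The examples in this tutorial read the key from the SCAVIO_API_KEY environment variable, and a missing variable surfaces only as a bare KeyError. A minimal preflight sketch is shown here; `require_env` is a hypothetical helper name for illustration, not part of any Scavio client.

```python
import os


def require_env(name, environ=None):
    """Return the value of an environment variable or fail with a clear message.

    `environ` defaults to os.environ; passing a dict makes the check easy to try out.
    """
    env = os.environ if environ is None else environ
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the examples")
    return value


# Demonstration with an explicit mapping so the sketch runs without a real key.
print(require_env("SCAVIO_API_KEY", {"SCAVIO_API_KEY": "demo-key"}))
```

In a real script, calling `require_env("SCAVIO_API_KEY")` with no second argument checks the actual environment.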
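Walkthrough
Step 1: Export your Google CSE domain list
Pull the domains out of your Google CSE configuration. They become optional site filters in Scavio if you want to keep the search scope restricted.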
# Your 50 Google CSE domains (example)
google_cse_domains = [
    'docs.python.org', 'stackoverflow.com', 'github.com',
    'developer.mozilla.org', 'reactjs.org', 'nextjs.org',
    'vuejs.org', 'angular.io', 'nodejs.org', 'npmjs.com',
    # ... up to 50 domains
]
print(f'Migrating {len(google_cse_domains)} domains from Google CSE')
print(f'Google CSE: limited to these {len(google_cse_domains)} domains')
print('Scavio: searches entire web (no domain cap)')
print('Cost comparison:')
print(' Google CSE paid: $5.00 per 1,000 queries')
print(' Scavio: $5.00 per 1,000 queries ($0.005 each)')
print(' Scavio free: 250 queries/month included')

Step 2: Build the migration adapter
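Create a function with the same interface as your Google CSE call that uses Scavio instead. You can optionally restrict results to your domain list with site: operators in the query. Note that the adapter only applies the first five domains of the list to any single query.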
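As a standalone illustration of the site: filter described above, the sketch below builds the restricted query string and splits a longer domain list into batches of five, so every domain can be covered by some query. The helper names `chunk_domains` and `build_restricted_query` are hypothetical, and the batch size of five is simply taken from the adapter; whether longer filters are accepted is not stated here.

```python
MAX_SITES_PER_QUERY = 5  # assumption: mirrors the five-domain slice used by the adapter


def chunk_domains(domains, size=MAX_SITES_PER_QUERY):
    """Yield consecutive slices of at most `size` domains."""
    for start in range(0, len(domains), size):
        yield domains[start:start + size]


def build_restricted_query(query, domains):
    """Append an OR-joined group of site: operators to the query."""
    if not domains:
        return query
    sites = " OR ".join(f"site:{d}" for d in domains)
    return f"{query} ({sites})"


domains = [
    "docs.python.org", "stackoverflow.com", "github.com",
    "developer.mozilla.org", "reactjs.org", "nextjs.org", "vuejs.org",
]
for batch in chunk_domains(domains):
    print(build_restricted_query("python asyncio", batch))
```

Each batch costs one query, so covering all 50 domains this way would take ten queries per search term. The adapter that follows sends only the first batch.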
import os
from typing import Optional
from urllib.parse import urlparse

import requests

SCAVIO_KEY = os.environ['SCAVIO_API_KEY']

def search(query: str, num: int = 10, restrict_domains: Optional[list] = None) -> list:
    """Drop-in replacement for Google CSE. Same return shape."""
    search_query = query
    if restrict_domains:
        # Use the site: operator to limit results to specific domains.
        # Only the first five domains are applied to a single query.
        sites = ' OR '.join(f'site:{d}' for d in restrict_domains[:5])
        search_query = f'{query} ({sites})'
    resp = requests.post(
        'https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json'},
        json={'query': search_query, 'country_code': 'us', 'num_results': num},
        timeout=30,
    )
    resp.raise_for_status()
    # Return results in the Google CSE items[] format
    return [{'title': r['title'], 'link': r['link'],
             'snippet': r.get('snippet', ''),
             # urlparse gives '' for a link without a host instead of raising IndexError
             'displayLink': urlparse(r['link']).netloc}
            for r in resp.json().get('organic_results', [])]

# Full web search (no domain limit)
results = search('python asyncio tutorial', num=5)
print(f'Full web: {len(results)} results')
for r in results:
    print(f' {r["displayLink"]}: {r["title"][:50]}')

Step 3: Verify the migration with comparison tests
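Run the same queries with and without the domain restriction to confirm that result quality holds up before you switch over completely. The restricted call mimics the old Google CSE behavior, and the unrestricted call shows the new full-web capability. The script sends both calls through Scavio; to compare against Google directly, run the same queries through your existing CSE integration as well.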
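Before comparing result quality, it can help to confirm that every result carries the fields downstream code reads, since the "no changes" claim depends on that. The sketch below checks for the four keys the adapter emits (title, link, snippet, displayLink). `missing_cse_keys` is a hypothetical helper and the sample items are made up for illustration; real Google CSE items contain more fields than these four.

```python
REQUIRED_KEYS = ("title", "link", "snippet", "displayLink")


def missing_cse_keys(items):
    """Return (index, missing_keys) for each item that lacks a required key."""
    problems = []
    for index, item in enumerate(items):
        missing = [key for key in REQUIRED_KEYS if key not in item]
        if missing:
            problems.append((index, missing))
    return problems


# Hypothetical sample data: the second item is deliberately incomplete.
sample = [
    {"title": "asyncio docs", "link": "https://docs.python.org/3/library/asyncio.html",
     "snippet": "Asynchronous I/O", "displayLink": "docs.python.org"},
    {"title": "incomplete item", "link": "https://example.com/page"},
]
print(missing_cse_keys(sample))  # → [(1, ['snippet', 'displayLink'])]
```

An empty list means every item has the expected shape. Passing the output of `search(...)` from Step 2 instead of `sample` applies the same check to live results.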
queries = ['python asyncio', 'react hooks tutorial', 'docker compose networking']
for q in queries:
    # With domain restriction (mimics old Google CSE behavior)
    restricted = search(q, num=3, restrict_domains=['docs.python.org', 'stackoverflow.com', 'github.com'])
    # Full web (new capability)
    full = search(q, num=3)
    print(f'Query: {q}')
    print(f' Restricted ({len(restricted)} results): {restricted[0]["displayLink"] if restricted else "none"}')
    print(f' Full web ({len(full)} results): {full[0]["displayLink"] if full else "none"}')
    print()

Python example
import os

import requests

SCAVIO_KEY = os.environ['SCAVIO_API_KEY']

def search(query, num=10):
    resp = requests.post(
        'https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json'},
        json={'query': query, 'country_code': 'us', 'num_results': num},
        timeout=30,
    )
    # Raise on HTTP errors instead of parsing an error body as results
    resp.raise_for_status()
    return [{'title': r['title'], 'link': r['link'], 'snippet': r.get('snippet', '')}
            for r in resp.json().get('organic_results', [])]

results = search('python asyncio tutorial')
for r in results[:5]:
    print(f'{r["title"]}: {r["link"]}')

JavaScript example
const SCAVIO_KEY = process.env.SCAVIO_API_KEY;
async function search(query, num = 10) {
  const resp = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST',
    headers: { 'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json' },
    body: JSON.stringify({ query, country_code: 'us', num_results: num })
  });
  // Fail loudly on HTTP errors instead of parsing an error body as results
  if (!resp.ok) throw new Error(`Scavio request failed: ${resp.status}`);
  const data = await resp.json();
  return (data.organic_results || []).map(r => ({
    title: r.title, link: r.link, snippet: r.snippet || ''
  }));
}

search('python asyncio tutorial').then(r => r.slice(0, 5).forEach(x => console.log(x.title)));

Expected output
Full web: 5 results
docs.python.org: Python asyncio -- Asynchronous I/O
realpython.com: Async IO in Python: A Complete Walkt
stackoverflow.com: How to use asyncio in Python 3
Query: python asyncio
Restricted (3 results): docs.python.org
Full web (3 results): docs.python.org
Cost: $0.005/query, no domain cap
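To put the per-query price into monthly terms, the sketch below estimates a bill from the figures quoted in this tutorial ($0.005 per query, 250 free queries per month). It assumes the free queries are deducted first and the remainder is billed at the flat rate; the actual billing rules are not described here, so treat the result as a rough estimate. `estimated_monthly_cost` is a hypothetical helper name.

```python
FREE_QUERIES_PER_MONTH = 250   # figure quoted in Step 1
PRICE_PER_QUERY_USD = 0.005    # figure quoted in the introduction


def estimated_monthly_cost(queries):
    """Estimate the monthly cost in USD, assuming free queries are deducted first."""
    billable = max(0, queries - FREE_QUERIES_PER_MONTH)
    return round(billable * PRICE_PER_QUERY_USD, 2)


for volume in (200, 1_250, 10_000):
    print(f"{volume} queries/month: ${estimated_monthly_cost(volume):.2f}")
```

Under that assumption, 10,000 queries a month comes to 9,750 billable queries.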