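Google News aggregates the latest articles from thousands of publishers and ranks them by relevance and recency. This data is valuable for media monitoring, topic trend analysis, content curation, and alerting systems. RSS feeds cover individual publishers, whereas the Google News search interface reflects what Google ranks highest for any given query. The Scavio API returns news results as structured JSON, including the title, source, publication date, and snippet. This tutorial shows how to fetch news for any topic and build a simple news digest.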
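Prerequisites
- Python 3.9 or later (the examples use built-in generic type hints such as list[dict])
- The requests library installed
- A Scavio API key
- Basic familiarity with Python string formatting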
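Walkthrough
Step 1: Fetch news results for a topic
Query the Scavio endpoint with a news-style query. Either prefix the query with site:news.google.com or use a news-specific phrasing, such as appending "news" to the topic, to surface news articles.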
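The two query styles mentioned above can be sketched as a small payload builder. This is a minimal sketch, not part of the Scavio API: the helper name build_news_payload and its use_site_prefix flag are hypothetical, and it assumes the endpoint passes a site: operator through to the search engine unchanged. The query and country_code fields are the ones used throughout this tutorial.

```python
def build_news_payload(topic: str, use_site_prefix: bool = False,
                       country_code: str = "us") -> dict:
    """Build the JSON body for a news-style search (hypothetical helper)."""
    topic = topic.strip()
    if not topic:
        raise ValueError("topic must not be empty")
    if use_site_prefix:
        # Assumption: the search backend honours the site: operator.
        query = f"site:news.google.com {topic}"
    else:
        # Keyword style: append "news" to bias results toward news articles.
        query = f"{topic} news"
    return {"query": query, "country_code": country_code}


print(build_news_payload("artificial intelligence"))
print(build_news_payload("artificial intelligence", use_site_prefix=True))
```

The function that follows uses the keyword style; switching to the prefix style only changes the query string in the request body.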
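import requests

API_KEY = "your_scavio_api_key"  # replace with your own Scavio API key

def get_news(topic: str) -> list[dict]:
    response = requests.post(
        "https://api.scavio.dev/api/v1/search",
        headers={"x-api-key": API_KEY},
        json={"query": f"{topic} news", "country_code": "us"},
        timeout=30,  # avoid hanging indefinitely on a stalled connection
    )
    response.raise_for_status()
    data = response.json()  # parse the body once
    # Prefer dedicated news results; fall back to organic results.
    return data.get("news_results", data.get("organic_results", []))

Step 2: Extract article metadata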
Pull the title, source, date, snippet, and link out of each news result.
def parse_article(article: dict) -> dict:
    # Missing fields come back as None instead of raising KeyError.
    return {
        "title": article.get("title"),
        "source": article.get("source"),
        "date": article.get("date"),
        "snippet": article.get("snippet"),
        "link": article.get("link"),
    }

Step 3: Filter by recency
Keep only articles published within the last 24 hours, which suits breaking-news monitoring.
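import re
from datetime import timedelta

# Assumes relative dates such as "2 hours ago", as in the expected output.
AGE_RE = re.compile(r"(\d+)\s+(minute|hour|day|week)s?\s+ago", re.IGNORECASE)

def filter_recent(articles: list[dict], hours: int = 24) -> list[dict]:
    max_age = timedelta(hours=hours)
    recent = []
    for a in articles:
        m = AGE_RE.search(a.get("date") or "")
        # Keep articles whose date is missing or unparseable rather than drop them.
        if m is None:
            recent.append(a)
        elif timedelta(**{m.group(2).lower() + "s": int(m.group(1))}) <= max_age:
            recent.append(a)
    return recent

Step 4: Build the news digest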
Format the articles as a plain-text digest suitable for delivery by email or Slack.
def build_digest(topic: str, articles: list[dict]) -> str:
    lines = [f"News Digest: {topic}\n" + "=" * 40]
    for a in articles[:10]:  # cap the digest at ten articles
        # Use "or" so that fields present but set to None still get a default.
        lines.append(f"\n{a.get('title') or 'No title'}")
        lines.append(f"Source: {a.get('source') or 'Unknown'} | {a.get('date') or ''}")
        lines.append(a.get("snippet") or "")
    return "\n".join(lines)

Python example
import os
import requests

API_KEY = os.environ.get("SCAVIO_API_KEY", "your_scavio_api_key")
ENDPOINT = "https://api.scavio.dev/api/v1/search"

def get_news(topic: str) -> list[dict]:
    r = requests.post(ENDPOINT, headers={"x-api-key": API_KEY},
                      json={"query": f"{topic} news", "country_code": "us"},
                      timeout=30)
    r.raise_for_status()
    data = r.json()
    # Prefer dedicated news results; fall back to organic results.
    return data.get("news_results", data.get("organic_results", []))

def digest(topic: str) -> str:
    articles = get_news(topic)
    lines = [f"=== {topic} News ==="]
    for a in articles[:8]:
        lines.append(f"\n{a.get('title', 'No title')}")
        lines.append(f"{a.get('source', '')} | {a.get('date', '')}")
        lines.append(a.get("snippet", ""))
    return "\n".join(lines)

if __name__ == "__main__":
    print(digest("artificial intelligence"))

JavaScript example
const API_KEY = process.env.SCAVIO_API_KEY || "your_scavio_api_key";
const ENDPOINT = "https://api.scavio.dev/api/v1/search";

async function getNews(topic) {
  const res = await fetch(ENDPOINT, {
    method: "POST",
    headers: { "x-api-key": API_KEY, "Content-Type": "application/json" },
    body: JSON.stringify({ query: `${topic} news`, country_code: "us" })
  });
  // fetch does not reject on HTTP errors, so check the status explicitly.
  if (!res.ok) {
    throw new Error(`Request failed with status ${res.status}`);
  }
  const data = await res.json();
  // Prefer dedicated news results; fall back to organic results.
  return data.news_results || data.organic_results || [];
}

async function main() {
  const articles = await getNews("artificial intelligence");
  articles.slice(0, 8).forEach(a => {
    console.log(`\n${a.title}`);
    console.log(`${a.source || ""} | ${a.date || ""}`);
    console.log(a.snippet || "");
  });
}

main().catch(console.error);

Expected output
{
  "news_results": [
    {
      "title": "OpenAI Releases New Model Family in 2026",
      "source": "TechCrunch",
      "date": "2 hours ago",
      "snippet": "OpenAI announced a new series of foundation models targeting...",
      "link": "https://techcrunch.com/2026/04/openai-new-models"
    }
  ]
}
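To check the formatting logic without spending an API call, the sample response above can be fed through a formatter directly. The sketch below is illustrative only: format_digest is a hypothetical helper that mirrors the digest logic in this tutorial, and SAMPLE_RESPONSE is a copy of the expected output, not live data.

```python
import json

# Copy of the sample response shown above (not live data).
SAMPLE_RESPONSE = json.loads("""
{
  "news_results": [
    {
      "title": "OpenAI Releases New Model Family in 2026",
      "source": "TechCrunch",
      "date": "2 hours ago",
      "snippet": "OpenAI announced a new series of foundation models targeting...",
      "link": "https://techcrunch.com/2026/04/openai-new-models"
    }
  ]
}
""")


def format_digest(topic: str, data: dict, limit: int = 8) -> str:
    """Format a parsed API response as a plain-text digest (hypothetical helper)."""
    # Prefer news results; fall back to organic results, then to an empty list.
    articles = data.get("news_results") or data.get("organic_results") or []
    lines = [f"=== {topic} News ==="]
    for a in articles[:limit]:
        lines.append("")  # blank line between articles
        lines.append(a.get("title") or "No title")
        lines.append(f"{a.get('source') or 'Unknown'} | {a.get('date') or ''}")
        lines.append(a.get("snippet") or "")
    return "\n".join(lines)


print(format_digest("artificial intelligence", SAMPLE_RESPONSE))
```

Because the formatter takes the parsed response as an argument instead of calling the API itself, the same function can be exercised against saved responses in a unit test.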