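Reddit's public search is powerful, but its official JSON endpoints are rate-limited, unpaginated, and can occasionally drop out. For monitoring agents, research pipelines, and RAG systems that need fresh community data, a search API that handles the scraping layer is the difference between a weekend project and a production pipeline. This tutorial walks through authenticating, sending a Reddit search request, and iterating over cursor pages to collect posts, all in Python.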
Prerequisites
- Python 3.8 or later installed
- The requests library installed (pip install requests)
- A Scavio API key from scavio.dev
- A query to search for (a keyword or a subreddit-scoped phrase)
How-To Guide
Step 1: Install the requests library
requests is the only dependency this tutorial needs.
pip install requests

Step 2: Set your API key
Keep credentials out of your source code by reading the key from an environment variable.
import os
API_KEY = os.environ["SCAVIO_API_KEY"]

Step 3: Send a Reddit search request
POST your query and an optional sort order to /api/v1/reddit/search. Reddit requests take 5-15 seconds, so set a generous client timeout.
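Even with a generous timeout, a slow request can still time out now and then. One way to cope is to retry with exponential backoff. The sketch below is a minimal illustration, assuming the search request is safe to repeat; `call_with_retry` and its parameters are hypothetical names, not part of requests or the Scavio API.

```python
import time


def call_with_retry(send, retry_on, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send() and retry when it raises one of the retry_on exceptions.

    Hypothetical helper for illustration; not part of requests or Scavio.
    """
    for attempt in range(1, attempts + 1):
        try:
            return send()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

With requests, the call in this step could be wrapped as `call_with_retry(lambda: requests.post(..., timeout=30), retry_on=requests.exceptions.Timeout)`. The code that follows keeps to a single attempt for simplicity.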
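import requests

# Reddit searches can take 5-15 seconds, so allow up to 30 seconds.
response = requests.post(
    "https://api.scavio.dev/api/v1/reddit/search",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"query": "best python web frameworks 2026", "sort": "new"},
    timeout=30,
)
response.raise_for_status()  # stop early on HTTP errors
data = response.json()

Step 4: Iterate over posts and follow the cursor
Posts are under data.posts in the response. When data.nextCursor is not null, pass it as cursor in the next request to fetch the following page.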
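The cursor-following described above can also be sketched as a generator that is independent of the HTTP layer. This is an illustrative sketch, not the API's prescribed client: `iter_posts`, `fetch_page`, and `max_pages` are hypothetical names, and the de-duplication by `id` is a defensive assumption, since the tutorial does not say whether pages can overlap.

```python
def iter_posts(fetch_page, max_pages=10):
    """Yield posts page by page, following nextCursor until it runs out.

    fetch_page(cursor) is a hypothetical callable that returns the "data"
    object of one response: a dict with "posts" and, optionally, "nextCursor".
    """
    cursor, seen_ids = None, set()
    for _ in range(max_pages):  # hard cap so a repeating cursor cannot loop forever
        data = fetch_page(cursor)
        for post in data.get("posts", []):
            post_id = post.get("id")
            if post_id is not None:
                if post_id in seen_ids:
                    continue  # already yielded from an earlier page
                seen_ids.add(post_id)
            yield post
        cursor = data.get("nextCursor")
        if not cursor:
            break  # no further pages
```

Passing the fetch function in keeps the paging logic easy to test without network access. The snippet that follows shows the same idea inline for a single page.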
for post in data["data"]["posts"]:
    print(f"r/{post['subreddit']} -- {post['title']}")
next_cursor = data["data"].get("nextCursor")
if next_cursor:
    # Call the endpoint again with {"query": ..., "cursor": next_cursor}
    pass

Python example
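import os
import requests

API_KEY = os.environ["SCAVIO_API_KEY"]
ENDPOINT = "https://api.scavio.dev/api/v1/reddit/search"


def search_reddit(query: str, sort: str = "relevance"):
    """Collect posts for a query, following nextCursor until about 50 posts."""
    posts, cursor = [], None
    while True:
        body = {"query": query, "sort": sort}
        if cursor:
            body["cursor"] = cursor  # ask for the page after the previous one
        r = requests.post(
            ENDPOINT,
            headers={"Authorization": f"Bearer {API_KEY}"},
            json=body,
            timeout=30,  # Reddit requests can take 5-15 seconds
        )
        r.raise_for_status()  # fail loudly on HTTP errors
        data = r.json()["data"]
        posts.extend(data["posts"])
        cursor = data.get("nextCursor")
        # Stop when there are no more pages or enough posts were collected.
        if not cursor or len(posts) >= 50:
            break
    return posts


results = search_reddit("fastapi vs django 2026", sort="new")
for p in results[:10]:
    # The sample response in this tutorial shows no score field, so fall back to "-".
    print(f"{p.get('score', '-'):>6} r/{p['subreddit']} {p['title']}")

JavaScript example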
const API_KEY = process.env.SCAVIO_API_KEY;
const ENDPOINT = "https://api.scavio.dev/api/v1/reddit/search";

async function searchReddit(query, sort = "relevance") {
  const posts = [];
  let cursor;
  while (true) {
    const body = { query, sort };
    if (cursor) body.cursor = cursor;
    const r = await fetch(ENDPOINT, {
      method: "POST",
      headers: {
        Authorization: `Bearer ${API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    });
    // fetch does not reject on HTTP error statuses, so check explicitly.
    if (!r.ok) throw new Error(`Request failed with status ${r.status}`);
    const { data } = await r.json();
    posts.push(...data.posts);
    cursor = data.nextCursor;
    // Stop when there are no more pages or enough posts were collected.
    if (!cursor || posts.length >= 50) break;
  }
  return posts;
}

// Top-level await requires an ES module (for example, a .mjs file).
const posts = await searchReddit("fastapi vs django 2026", "new");
posts.slice(0, 10).forEach((p) =>
  console.log(`r/${p.subreddit} -- ${p.title}`)
);

Expected output
{
  "data": {
    "searchQuery": "fastapi vs django 2026",
    "totalResults": 14,
    "nextCursor": "eyJjYW5kaWRhdGVzX3JldH...",
    "posts": [
      {
        "position": 0,
        "id": "t3_1smb9du",
        "title": "FastAPI vs Django in 2026",
        "subreddit": "Python",
        "author": "python_dev",
        "timestamp": "2026-04-15T16:34:40+0000",
        "nsfw": false
      }
    ]
  }
}
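Once the posts are collected, the fields shown in the response above can be turned into something easier to work with. The sketch below parses `timestamp` into a timezone-aware datetime, skips posts flagged `nsfw`, and sorts newest first. `summarize` is a hypothetical helper, and it assumes every post uses the same timestamp format as the sample.

```python
from datetime import datetime, timezone


def summarize(posts, include_nsfw=False):
    """Return (created_utc, subreddit, title) tuples, newest first."""
    rows = []
    for post in posts:
        if post.get("nsfw") and not include_nsfw:
            continue  # drop NSFW posts unless explicitly requested
        # Assumes the sample's format, e.g. "2026-04-15T16:34:40+0000".
        created = datetime.strptime(post["timestamp"], "%Y-%m-%dT%H:%M:%S%z")
        rows.append((created.astimezone(timezone.utc), post["subreddit"], post["title"]))
    return sorted(rows, key=lambda row: row[0], reverse=True)


# The single post from the sample response above.
sample = [
    {
        "position": 0,
        "id": "t3_1smb9du",
        "title": "FastAPI vs Django in 2026",
        "subreddit": "Python",
        "author": "python_dev",
        "timestamp": "2026-04-15T16:34:40+0000",
        "nsfw": False,
    }
]

for created, subreddit, title in summarize(sample):
    print(f"{created:%Y-%m-%d %H:%M} r/{subreddit} {title}")
# prints: 2026-04-15 16:34 r/Python FastAPI vs Django in 2026
```

`strptime` with `%z` is used here because it accepts the `+0000` offset in the sample on every Python version this tutorial supports.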