How to Build a RAG Pipeline with Live Search Fallback

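Add a search API fallback to your RAG pipeline. When vector retrieval returns low-confidence results, the pipeline falls back automatically to live web search.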

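RAG pipelines fail silently when the vector store contains no relevant documents: the LLM receives poor context and generates an answer that sounds plausible but is wrong. A live search fallback catches these failures by checking retrieval confidence and routing to web search when the vector store falls short. This tutorial adds a fallback layer to any existing RAG pipeline; the layer detects low-quality retrieval and switches transparently to live SERP data. The fallback fires only when needed, so you pay the $0.005 per search call only on the queries that actually require it.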
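
To make the cost claim concrete, here is a back-of-the-envelope sketch: the expected search spend is the number of queries times the fallback rate times the per-call price. The $0.005 price comes from this tutorial; the function name and the 28% fallback rate are illustrative assumptions.

```python
COST_PER_SEARCH = 0.005  # USD per search call, as quoted in this tutorial

def expected_search_cost(num_queries: int, fallback_rate: float) -> float:
    """Estimate total search spend when only a fraction of queries fall back."""
    if not 0.0 <= fallback_rate <= 1.0:
        raise ValueError('fallback_rate must be between 0 and 1')
    return num_queries * fallback_rate * COST_PER_SEARCH

# Illustrative numbers: 100 queries, 28% of them trigger the fallback
total = expected_search_cost(100, 0.28)
print(f'Total: ${total:.2f}, per query: ${total / 100:.4f}')
```

A pipeline that searched on every query would pay the full per-call price each time, so the saving scales with how often the vector store is good enough on its own.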

Prerequisites

  • Python 3.9+ installed
  • An existing RAG pipeline with a vector store
  • The requests library installed
  • A Scavio API key from scavio.dev

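Step-by-step guide

Step 1: Build a confidence scoring function

Score how well the vector retrieval results match the query. A low score triggers the search fallback. Use the similarity scores from your vector store if it provides them, or a simple heuristic otherwise.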

Python
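from typing import Optional

def score_retrieval_quality(query: str, documents: list, scores: Optional[list] = None) -> float:
    """Score retrieval quality from 0 (terrible) to 1 (excellent)."""
    if not documents:
        return 0.0
    # If the vector store provides similarity scores (higher = better), use them.
    # Distance-style scores (lower = better) must be converted before this step.
    if scores:
        avg_score = sum(scores) / len(scores)
        return max(0.0, min(avg_score, 1.0))  # clamp to [0, 1]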
    # Heuristic: check keyword overlap between query and docs
    query_words = set(query.lower().split())
    total_overlap = 0
    for doc in documents:
        doc_words = set(doc.lower().split()[:200])  # first 200 words
        overlap = len(query_words & doc_words) / max(len(query_words), 1)
        total_overlap += overlap
    avg_overlap = total_overlap / len(documents)
    return min(avg_overlap, 1.0)

# Example:
score = score_retrieval_quality(
    'latest python version 2026',
    ['Python 3.12 was released in October 2023 with improved performance.']
)
print(f'Retrieval confidence: {score:.2f}')  # 0.25: only 'python' overlaps with the 4 query words
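
The scoring function above treats `scores` as similarities, where higher means a better match. Some vector stores report distances instead, where lower is better; passing those in unchanged would invert the confidence signal. A minimal sketch of one possible conversion, assuming non-negative distances. The helper name and the 1 / (1 + d) mapping are illustrative choices, not part of any particular vector store's API.

```python
from typing import List

def distances_to_similarities(distances: List[float]) -> List[float]:
    """Map non-negative distances (lower = closer) to similarities in (0, 1]."""
    # Negative inputs are treated as 0 so the result never exceeds 1.
    return [1.0 / (1.0 + max(d, 0.0)) for d in distances]

sims = distances_to_similarities([0.0, 1.0, 3.0])
print(sims)  # [1.0, 0.5, 0.25]
```

Because this mapping compresses the scale, a threshold tuned on raw similarity scores would likely need to be re-tuned after switching to converted distances.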

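Step 2: Build the search fallback function

When retrieval confidence drops below the threshold, fetch live search results and format them as documents for the LLM context.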

Python
import os

import requests

API_KEY = os.environ['SCAVIO_API_KEY']

def search_fallback(query: str, k: int = 5) -> list:
    """Fetch live search results as fallback documents."""
    resp = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': API_KEY, 'Content-Type': 'application/json'},
        json={'query': query, 'country_code': 'us'},
        timeout=10)  # requests has no default timeout; avoid hanging the pipeline
    resp.raise_for_status()
    results = resp.json().get('organic_results', [])[:k]
    return [{
        'content': f'{r["title"]}\n{r.get("snippet", "")}',
        'source': r['link'],
        'retriever': 'live_search'
    } for r in results]

# Test:
fallback_docs = search_fallback('latest python version 2026')
for doc in fallback_docs[:2]:
    print(f'[{doc["retriever"]}] {doc["content"][:80]}...')
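
The list comprehension above raises a `KeyError` if a result lacks `title` or `link`. A more defensive formatting step could look like the sketch below. It assumes the response shape used in this tutorial (`organic_results` entries with `title`, `snippet`, and `link`); the function name and the sample payload are made up for illustration and the code makes no network call.

```python
from typing import Any, Dict, List

def format_results(payload: Dict[str, Any], k: int = 5) -> List[Dict[str, str]]:
    """Turn a search response into fallback documents, skipping incomplete entries."""
    docs: List[Dict[str, str]] = []
    for r in payload.get('organic_results', []):
        if len(docs) >= k:
            break
        title, link = r.get('title'), r.get('link')
        if not title or not link:
            continue  # an entry without a title or URL cannot be cited
        docs.append({
            'content': f"{title}\n{r.get('snippet', '')}".strip(),
            'source': link,
            'retriever': 'live_search',
        })
    return docs

# Made-up payload for illustration only
sample = {'organic_results': [
    {'title': 'Result A', 'snippet': 'First snippet.', 'link': 'https://example.com/a'},
    {'title': 'Result B', 'snippet': 'No link here.'},
    {'title': 'Result C', 'link': 'https://example.com/c'},
]}
for doc in format_results(sample):
    print(doc['source'], '|', doc['content'].replace('\n', ' / '))
```

Keeping the formatting separate from the HTTP call also makes it testable without an API key.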

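Step 3: Build a fallback-aware retrieval function

Wrap your existing vector retrieval with the confidence check and the fallback logic. This is the only function you need to change in your pipeline.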

Python
CONFIDENCE_THRESHOLD = 0.3  # below this, trigger fallback

def retrieve_with_fallback(query: str, vector_store, k: int = 5) -> dict:
    """Retrieve from vector store; fall back to search if confidence is low."""
    # Step 1: Try vector retrieval (LangChain-style call; the scores are assumed
    # to be similarities in [0, 1], higher = better)
    vector_results = vector_store.similarity_search_with_score(query, k=k)
    docs = [doc.page_content for doc, score in vector_results]
    scores = [score for doc, score in vector_results]
    confidence = score_retrieval_quality(query, docs, scores)
    # Step 2: Decide retrieval strategy
    if confidence >= CONFIDENCE_THRESHOLD and docs:
        return {
            'documents': [{'content': d, 'retriever': 'vector'} for d in docs],
            'strategy': 'vector',
            'confidence': round(confidence, 3),
            'search_cost': 0
        }
    # Step 3: Fallback to live search
    search_docs = search_fallback(query, k=k)
    return {
        'documents': search_docs + [{'content': d, 'retriever': 'vector'} for d in docs[:2]],
        'strategy': 'search_fallback',
        'confidence': round(confidence, 3),
        'search_cost': 0.005
    }
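
The 0.3 threshold above is a starting point rather than a universal constant. One way to choose it is to label a handful of past queries by hand (was the vector-only answer good or not?) and pick the threshold that routes the most of them correctly. A minimal sketch of that idea; the function name and the labelled samples are hypothetical.

```python
from typing import List, Tuple

def pick_threshold(samples: List[Tuple[float, bool]], candidates: List[float]) -> float:
    """Return the candidate threshold that routes the most samples correctly.

    Each sample is (confidence, vector_answer_was_good). A sample is routed
    correctly when 'confidence >= threshold' agrees with the label.
    """
    def num_correct(threshold: float) -> int:
        return sum((conf >= threshold) == good for conf, good in samples)
    return max(candidates, key=num_correct)

# Hypothetical hand-labelled samples: (confidence score, was the vector answer good?)
labelled = [(0.05, False), (0.15, False), (0.25, False),
            (0.35, True), (0.50, True), (0.60, True)]
best = pick_threshold(labelled, [0.1, 0.2, 0.3, 0.4, 0.5])
print(f'Best threshold: {best}')  # Best threshold: 0.3
```

With real data the labels rarely separate this cleanly, so it is worth re-running the sweep whenever the vector store or the scoring function changes.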

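Step 4: Integrate into your existing RAG chain

Replace your current retrieval step with the fallback-aware version. The rest of the pipeline (prompt construction, LLM call, output parsing) stays the same.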

Python
def rag_with_fallback(query: str, vector_store, llm) -> dict:
    # Retrieve with fallback
    retrieval = retrieve_with_fallback(query, vector_store)
    documents = retrieval['documents']
    # Build context
    context = '\n\n'.join(d['content'] for d in documents)
    sources = [d['source'] for d in documents if d.get('source')]  # only web results carry a URL
    # Generate answer
    prompt = f"""Answer based on the following context. If using web results, cite the URLs.

Context:
{context}

Question: {query}
Answer:"""
    # Assuming llm is a callable that returns text
    answer = llm(prompt)
    return {
        'answer': answer,
        'strategy': retrieval['strategy'],
        'confidence': retrieval['confidence'],
        'sources': sources,
        'cost': retrieval['search_cost']
    }

# Usage stays the same as before:
# result = rag_with_fallback(user_query, my_vector_store, my_llm)
# print(result['answer'])
# print(f'Strategy: {result["strategy"]}, Cost: ${result["cost"]}')
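
The prompt above asks the model to cite URLs, but the context is joined from `content` only, so the URLs never reach the model. One possible fix is to number each document and put its source next to the text. This is a sketch under that assumption; the function name and the bracketed format are illustrative, and the sample documents are made up.

```python
from typing import Dict, List

def build_cited_context(documents: List[Dict[str, str]]) -> str:
    """Join documents into one context string, labelling each with its source."""
    blocks = []
    for i, doc in enumerate(documents, start=1):
        # Vector-store documents in this tutorial carry no 'source' key
        source = doc.get('source', 'vector store')
        blocks.append(f"[{i}] (source: {source})\n{doc['content']}")
    return '\n\n'.join(blocks)

# Made-up documents in the shape returned by the retrieval step
docs = [
    {'content': 'Python 3.14 released', 'source': 'https://example.com/py', 'retriever': 'live_search'},
    {'content': 'Internal note', 'retriever': 'vector'},
]
print(build_cited_context(docs))
```

Swapping this in for the plain `'\n\n'.join(...)` line gives the model something concrete to cite.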

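Step 5: Monitor fallback rate and cost

Track how often the fallback fires so you can find coverage gaps in your vector store and plan improvements.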

Python
from collections import defaultdict

fallback_stats = defaultdict(int)

def tracked_rag(query: str, vector_store, llm) -> dict:
    result = rag_with_fallback(query, vector_store, llm)
    fallback_stats['total'] += 1
    fallback_stats[result['strategy']] += 1
    fallback_stats['total_cost'] += result['cost']
    return result

def print_fallback_report():
    total = fallback_stats['total']
    if total == 0:
        print('No queries tracked yet.')
        return
    vector_pct = fallback_stats.get('vector', 0) / total * 100
    fallback_pct = fallback_stats.get('search_fallback', 0) / total * 100
    print('RAG Fallback Report:')
    print(f'  Total queries: {total}')
    print(f'  Vector retrieval: {vector_pct:.0f}%')
    print(f'  Search fallback: {fallback_pct:.0f}%')
    print(f'  Total search cost: ${fallback_stats["total_cost"]:.2f}')
    print(f'  Avg cost/query: ${fallback_stats["total_cost"] / total:.4f}')
    if fallback_pct > 30:
        print('  NOTE: High fallback rate. Consider adding more documents to your vector store.')

# After running many queries:
# print_fallback_report()
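
The counters above show how often the fallback fires, but not which topics are missing from the vector store. One simple way to surface coverage gaps is to keep the queries that triggered the fallback and count their most frequent terms. A minimal sketch; the function name, the stopword list, and the sample queries are all illustrative.

```python
from collections import Counter
from typing import List, Tuple

STOPWORDS = {'a', 'an', 'the', 'of', 'for', 'in', 'to', 'is', 'what', 'how'}

def top_gap_terms(fallback_queries: List[str], n: int = 3) -> List[Tuple[str, int]]:
    """Count in how many fallback queries each term appears; return the top n."""
    counts: Counter = Counter()
    for query in fallback_queries:
        # dict.fromkeys de-duplicates words within one query, preserving order
        words = dict.fromkeys(query.lower().split())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)

# Made-up queries that would have triggered the fallback
queries = [
    'python 3.14 release date',
    'latest python version',
    'python packaging release notes',
]
print(top_gap_terms(queries, n=2))  # [('python', 3), ('release', 2)]
```

Terms that keep showing up here are good candidates for the next batch of documents to index.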

Python example

Python
import os, requests

API_KEY = os.environ['SCAVIO_API_KEY']

def search_fallback(query, k=5):
    resp = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': API_KEY, 'Content-Type': 'application/json'},
        json={'query': query, 'country_code': 'us'}, timeout=10)
    resp.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error body
    return [{'content': f'{r["title"]}\n{r.get("snippet", "")}', 'source': r['link']}
            for r in resp.json().get('organic_results', [])[:k]]

def retrieve(query, vector_docs, confidence):
    if confidence >= 0.3 and vector_docs:
        return {'docs': vector_docs, 'strategy': 'vector', 'cost': 0}
    fallback = search_fallback(query)
    return {'docs': fallback + vector_docs[:2], 'strategy': 'fallback', 'cost': 0.005}

# Simulate low-confidence retrieval:
result = retrieve('Python 3.14 release date 2026', ['Python 3.12 docs...'], 0.15)
print(f'Strategy: {result["strategy"]}, docs: {len(result["docs"])}, cost: ${result["cost"]}')

JavaScript example

JavaScript
const API_KEY = process.env.SCAVIO_API_KEY;

async function searchFallback(query, k = 5) {
  const resp = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST',
    headers: { 'x-api-key': API_KEY, 'Content-Type': 'application/json' },
    body: JSON.stringify({ query, country_code: 'us' })
  });
  if (!resp.ok) throw new Error(`Search request failed: ${resp.status}`);
  const data = await resp.json();
  return (data.organic_results || []).slice(0, k)
    .map(r => ({ content: `${r.title}\n${r.snippet || ''}`, source: r.link }));
}

async function retrieve(query, vectorDocs, confidence) {
  if (confidence >= 0.3 && vectorDocs.length) {
    return { docs: vectorDocs, strategy: 'vector', cost: 0 };
  }
  const fallback = await searchFallback(query);
  return { docs: [...fallback, ...vectorDocs.slice(0, 2)], strategy: 'fallback', cost: 0.005 };
}

retrieve('Python 3.14 release 2026', ['old docs'], 0.1)
  .then(r => console.log(`Strategy: ${r.strategy}, docs: ${r.docs.length}`));

Expected output

Text
Retrieval confidence: 0.25
[live_search] Python Release Python 3.14.0 -- Python 3.14.0 was released on...
[live_search] What's New In Python 3.14 -- This article explains the new...

Strategy: search_fallback, docs: 7, cost: $0.005

RAG Fallback Report:
  Total queries: 100
  Vector retrieval: 72%
  Search fallback: 28%
  Total search cost: $0.14
  Avg cost/query: $0.0014

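FAQ

Most developers complete this tutorial in 15 to 30 minutes. You need a Scavio API key (the free tier is enough) and a working Python or JavaScript environment.

A Scavio API key comes with 50 free credits on sign-up, which is enough to complete this tutorial and build a working prototype.

Scavio offers a native LangChain package (langchain-scavio), an MCP server, and a REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt it to the framework of your choice.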

© 2026 Scavio. All rights reserved.
