Hybrid RAG: Local Index Plus Search API Fallback

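RAG systems that only search local documents cannot answer out-of-corpus questions. Adding a search API fallback covers those questions at about 30% of the cost of sending every query to the API, assuming 70% of queries are answered locally.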

RAG systems that only search a local document index hit a wall when users ask questions outside the corpus. The system either hallucinates from the LLM's training data or says "I don't know." Both responses push users back to manual search. The fix is a hybrid approach: local-first retrieval for known-domain questions, with automatic fallback to a live search API for everything else.

The confidence threshold pattern

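Search your local index first. If the top result's relevance score meets a threshold, use the local results; otherwise, fall back to the search API. A threshold of 0.7 is a typical starting point for scores normalized to the 0 to 1 range, but tune it against real queries, because score distributions differ between search engines. This keeps most queries fast and free of per-call fees (local search) while ensuring coverage for out-of-corpus questions (API search).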

Python
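import os

import requests

# The API key is read from the environment; a KeyError is raised if it is unset.
H = {'x-api-key': os.environ['SCAVIO_API_KEY']}
THRESHOLD = 0.7  # minimum top-hit relevance score to trust local results

def hybrid_retrieve(query, local_index):
    # Local-first: hits are assumed to expose _rankingScore and content attributes.
    local = local_index.search(query, limit=3)
    if local.hits and local.hits[0]._rankingScore >= THRESHOLD:
        return {'source': 'local', 'results': [h.content for h in local.hits]}
    # No hits or low confidence: fall back to the live search API.
    resp = requests.post('https://api.scavio.dev/api/v1/search', headers=H,
        json={'platform': 'google', 'query': query}, timeout=10)
    resp.raise_for_status()  # surface HTTP errors instead of parsing an error body
    web = resp.json()
    return {'source': 'web', 'results': [r['snippet'] for r in web.get('organic', [])[:3]]}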
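
The routing decision can be checked offline, without a real index or a network call. The sketch below is one way to do that, not part of any library: `Hit`, `StubIndex`, `route`, and `fake_web_search` are hypothetical names, the stub returns a plain list of hits with a `score` field rather than the result shape used above, and the web search is injected as a function so it can be replaced in tests.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

THRESHOLD = 0.7  # same cut-off as in the pattern above


@dataclass
class Hit:
    content: str
    score: float  # hypothetical stand-in for the index's relevance score


@dataclass
class StubIndex:
    """In-memory stand-in for a local index (an assumption, for testing only)."""
    hits: List[Hit] = field(default_factory=list)

    def search(self, query: str, limit: int = 3) -> List[Hit]:
        # Ignores the query text and returns the best-scoring hits first.
        return sorted(self.hits, key=lambda h: h.score, reverse=True)[:limit]


def route(query: str, index: StubIndex,
          web_search: Callable[[str], List[str]],
          threshold: float = THRESHOLD) -> Dict[str, object]:
    hits = index.search(query, limit=3)
    # Trust local results only when the top hit clears the threshold.
    if hits and hits[0].score >= threshold:
        return {'source': 'local', 'results': [h.content for h in hits]}
    # Empty or low-confidence local results fall back to the web search.
    return {'source': 'web', 'results': web_search(query)[:3]}


def fake_web_search(query: str) -> List[str]:
    # Placeholder for the real API call; returns a canned snippet.
    return [f'web snippet for {query!r}']


docs = StubIndex([Hit('Refunds are processed in 5 days.', 0.91),
                  Hit('Shipping policy.', 0.40)])
weak = StubIndex([Hit('Loosely related page.', 0.35)])

confident = route('refund policy', docs, fake_web_search)
fallback = route('competitor pricing', weak, fake_web_search)
no_hits = route('industry news', StubIndex(), fake_web_search)
```

Injecting the web search as a parameter keeps the threshold logic separate from the HTTP client, which makes it easier to tune the cut-off against a sample of recorded queries.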

Source attribution matters

Label the context with its source: "from internal docs" or "from web search." This lets the LLM attribute its answer correctly and lets users judge the reliability of the source. A user expects different confidence levels from your official documentation versus a general web search result.
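
As a rough illustration of the labeling step, the sketch below turns a retrieval result with `source` and `results` keys (the shape returned by the retrieval function above) into labeled context for a prompt. The function names `build_context` and `build_prompt`, the label strings, and the prompt wording are all assumptions; adapt them to your own prompt format.

```python
# Human-readable labels for each retrieval path (wording is an assumption).
SOURCE_LABELS = {
    'local': 'from internal docs',
    'web': 'from web search',
}


def build_context(retrieved):
    """Prefix every snippet with a number and its source label."""
    label = SOURCE_LABELS.get(retrieved['source'], 'from unknown source')
    lines = [f'[{i}] ({label}) {text}'
             for i, text in enumerate(retrieved['results'], start=1)]
    return '\n'.join(lines)


def build_prompt(question, retrieved):
    """Assemble a prompt that asks the model to name the source it used."""
    return (
        'Answer using only the context below. '
        'Say whether the answer comes from internal docs or from web search.\n\n'
        f'Context:\n{build_context(retrieved)}\n\n'
        f'Question: {question}'
    )


example = {'source': 'local', 'results': ['Plan A costs $10.', 'Plan B costs $25.']}
context = build_context(example)
prompt = build_prompt('How much is Plan B?', example)
```

Keeping the label next to each snippet, rather than once at the top, means the attribution survives if local and web results are ever mixed in one context.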

When to use which

Local index for: product docs, FAQ, pricing, internal policies, historical data. Search API for: competitor info, industry news, community sentiment, anything that changes faster than your indexing cycle. The split is not about quality but about staleness. If the information changes weekly, it belongs in the API path.

Cost efficiency

If 70% of queries are answered by the local index, only 30% hit the search API. At 1,000 queries per day, that is 300 API calls at $0.005 each, or $1.50 per day. The local index adds no per-query fee because it runs on your own infrastructure. Sending every query to the API would cost $5.00 per day at the same volume, so hybrid retrieval gives you the coverage of API search at 30% of the cost.
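
The arithmetic is easy to rerun with your own traffic numbers. The helper below is a minimal sketch: `daily_api_cost` is a hypothetical name, and the hit rate and per-call price are the example figures from this section, not measured values.

```python
def daily_api_cost(queries_per_day, local_hit_rate, price_per_call):
    """Estimated daily search API spend when only local misses reach the API."""
    api_calls = queries_per_day * (1 - local_hit_rate)
    return api_calls * price_per_call


# Example figures from the text: 1,000 queries/day, 70% local hits, $0.005/call.
hybrid = daily_api_cost(1000, 0.70, 0.005)
api_only = daily_api_cost(1000, 0.0, 0.005)  # every query goes to the API

print(f'hybrid:   ${hybrid:.2f} per day')
print(f'API only: ${api_only:.2f} per day')
print(f'hybrid share of API-only cost: {hybrid / api_only:.0%}')
```

Because the cost scales linearly with the miss rate, the measured local hit rate is the number worth tracking in production: a drop from 70% to 50% raises the API bill by two thirds.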