The Problem
RAG pipelines retrieve from static vector stores that were indexed days or weeks ago. When users ask about recent events, pricing changes, or new releases, the retriever returns stale or irrelevant chunks. Pure vector search misses information that was published after the last index build.
The Scavio Solution
Add a live web search step that runs alongside vector retrieval. When the top similarity score from the vector store is low, or the query contains time-sensitive signals (such as "pricing", "latest", or "2026"), fall back to Scavio search for fresh data. Merge both result sets before passing them to the LLM.
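The merge step can be sketched as follows. This is a minimal illustration, not Scavio's own logic: the function name merge_results, the max_chunks cap, and the choice to list web chunks first (on the assumption that they are fresher) are all hypothetical. It drops empty chunks and duplicates that differ only in case or whitespace.

```python
def merge_results(vector_chunks, web_chunks, max_chunks=8):
    """Merge two chunk lists, dropping empty and duplicate text.

    Hypothetical helper: the ordering and the cap are assumptions.
    """
    merged, seen = [], set()
    # Web chunks go first here on the assumption that they are fresher.
    for chunk in list(web_chunks) + list(vector_chunks):
        # Normalize case and whitespace so near-identical text collapses.
        key = " ".join(chunk.get("text", "").lower().split())
        if not key or key in seen:
            continue
        seen.add(key)
        merged.append(chunk)
    return merged[:max_chunks]


# Made-up sample data for illustration only.
vector_chunks = [{"text": "Plan A costs $10/month.", "source": "vector"}]
web_chunks = [
    {"text": "Plan A now costs $12/month.", "url": "https://example.com/pricing", "source": "web"},
    {"text": "plan a costs  $10/month.", "url": "https://example.com/old", "source": "web"},
]
merged = merge_results(vector_chunks, web_chunks)
print(len(merged), "chunks after deduplication")
```

Deduplicating before the LLM call keeps the context window from filling up with the same passage retrieved twice.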
Before
The RAG pipeline returns chunks from a two-week-old index about a product whose pricing changed yesterday. The LLM generates an answer with the wrong pricing.
After
The hybrid pipeline detects a time-sensitive query, fetches live pricing via the search API, and merges it with the vector results. The LLM then generates an accurate answer with a citation.
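One way to get an answer with a citation is to number the merged chunks in the prompt so the model can refer to them. The sketch below assumes chunks shaped like those in the Python example (a "text" and "source" key, plus "url" for web results). The function name build_prompt and the prompt wording are hypothetical, and the sample data is made up.

```python
def build_prompt(query, chunks):
    """Number each chunk so the model can cite it as [n]."""
    lines = []
    for i, chunk in enumerate(chunks, start=1):
        # Vector chunks may have no URL, so fall back to a generic label.
        origin = chunk.get("url") or "internal index"
        lines.append(f"[{i}] ({chunk['source']}, {origin}) {chunk['text']}")
    context = "\n".join(lines)
    return (
        "Answer the question using only the numbered sources below. "
        "Cite sources as [n].\n\n"
        f"Sources:\n{context}\n\nQuestion: {query}"
    )


# Made-up sample chunks for illustration only.
chunks = [
    {"text": "Plan A now costs $12/month.", "url": "https://example.com/pricing", "source": "web"},
    {"text": "Plan A includes 5 seats.", "source": "vector"},
]
prompt = build_prompt("How much does Plan A cost?", chunks)
print(prompt)
```

The resulting string can be sent to whichever LLM client the pipeline already uses.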
Who It Is For
AI engineers building RAG pipelines who need to handle time-sensitive queries without rebuilding the entire vector index.
Key Benefits
- Fallback to live search when vector confidence is low
- Time-sensitive query detection triggers web search
- Merge vector and web results for comprehensive context
- Cost: $0.005 per web search fallback
- Works with any vector store (Pinecone, Weaviate, Chroma)
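To budget for the fallback, the per-search cost above can be turned into a rough monthly estimate. The traffic figures below (10,000 queries per day, 20% of them triggering a fallback) are invented for illustration, and monthly_fallback_cost is a hypothetical helper.

```python
COST_PER_FALLBACK_USD = 0.005  # per-search cost from the benefits list


def monthly_fallback_cost(queries_per_day, fallback_rate, days=30):
    """Estimate monthly spend on web search fallbacks."""
    fallbacks = queries_per_day * fallback_rate * days
    return fallbacks * COST_PER_FALLBACK_USD


# Assumed traffic: 10,000 queries/day, 20% need a web fallback.
cost = monthly_fallback_cost(10_000, 0.2)
print(f"Estimated monthly fallback cost: ${cost:.2f}")
```

With these assumed numbers, 60,000 fallbacks per month come to about $300. Lowering the similarity threshold or narrowing the time-signal list reduces the fallback rate and therefore the cost.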
Python Example
import os

import requests

API_KEY = os.environ["SCAVIO_API_KEY"]
# Substrings that suggest the query needs fresh data.
TIME_SIGNALS = ["latest", "2026", "pricing", "current", "new release", "update"]


def needs_web_search(query: str, vector_score: float) -> bool:
    """Detect if the query needs live web data."""
    # A low similarity score means the index has no good match.
    if vector_score < 0.75:
        return True
    return any(signal in query.lower() for signal in TIME_SIGNALS)


def web_search_fallback(query: str) -> list:
    """Fetch up to five live results from the search API."""
    resp = requests.post(
        "https://api.scavio.dev/api/v1/search",
        headers={"x-api-key": API_KEY, "Content-Type": "application/json"},
        json={"query": query, "country_code": "us"},
        timeout=10,
    )
    resp.raise_for_status()  # fail loudly on HTTP errors
    data = resp.json()
    return [
        {"text": r.get("snippet", ""), "url": r.get("link", ""), "source": "web"}
        for r in data.get("organic_results", [])[:5]
    ]


def hybrid_retrieve(query: str, vector_results: list, vector_score: float) -> list:
    """Merge vector and web results for hybrid RAG."""
    results = [{"text": r["text"], "source": "vector"} for r in vector_results]
    if needs_web_search(query, vector_score):
        results.extend(web_search_fallback(query))
    return results


# Example usage
chunks = hybrid_retrieve("latest stripe pricing 2026", vector_results=[], vector_score=0.4)
web_count = sum(1 for c in chunks if c["source"] == "web")
print(f"Retrieved {len(chunks)} chunks ({web_count} from web)")
JavaScript Example
// Run as an ES module: top-level await is used below, and Node 18+ provides a global fetch.
const HEADERS = {'x-api-key': process.env.SCAVIO_API_KEY, 'Content-Type': 'application/json'};
const TIME_SIGNALS = ['latest', '2026', 'pricing', 'current', 'new release', 'update'];

// Fetch up to five live results from the search API.
async function webFallback(query) {
  const res = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST',
    headers: HEADERS,
    body: JSON.stringify({query, country_code: 'us'}),
  });
  if (!res.ok) throw new Error(`Search request failed: ${res.status}`);
  const data = await res.json();
  return (data.organic_results || [])
    .slice(0, 5)
    .map(item => ({text: item.snippet, url: item.link, source: 'web'}));
}

// Merge vector and web results for hybrid RAG.
async function hybridRetrieve(query, vectorResults, vectorScore) {
  const results = vectorResults.map(r => ({...r, source: 'vector'}));
  const timeSensitive = TIME_SIGNALS.some(s => query.toLowerCase().includes(s));
  if (vectorScore < 0.75 || timeSensitive) {
    results.push(...await webFallback(query));
  }
  return results;
}

// Example usage
const chunks = await hybridRetrieve('latest stripe pricing 2026', [], 0.4);
console.log(chunks.length + ' chunks retrieved');
Platforms Used
Web search with knowledge graph, People Also Ask (PAA), and AI overviews