The Problem
Production AI agents that rely on a single search API provider face downtime when that provider has outages, rate limits, or degraded performance. A single point of failure in the search layer means the entire agent pipeline stops producing grounded results.
The Scavio Solution
Implement a failover cluster with Scavio as the primary search backend and a circuit breaker that routes to a fallback provider after consecutive failures. Track response times and error rates per provider, trip the breaker after 3 consecutive errors or when p95 latency exceeds 5 seconds, and automatically restore the primary once health checks pass again.
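The two trip conditions above can be sketched as a small per-provider tracker. This is a minimal illustration rather than Scavio's implementation: the class name ProviderHealth, the 50-sample window, and the minimum of 20 samples before p95 is evaluated are all assumptions chosen for the example.

```python
import math
from collections import deque


class ProviderHealth:
    """Hypothetical per-provider tracker for the two trip conditions."""

    def __init__(self, max_errors=3, p95_limit=5.0, window=50, min_samples=20):
        self.max_errors = max_errors
        self.p95_limit = p95_limit          # seconds
        self.min_samples = min_samples      # assumed: skip p95 until the window has data
        self.latencies = deque(maxlen=window)  # latencies of recent successful calls
        self.consecutive_errors = 0

    def record_success(self, latency_s):
        # Any success resets the consecutive-error streak.
        self.consecutive_errors = 0
        self.latencies.append(latency_s)

    def record_error(self):
        self.consecutive_errors += 1

    def p95(self):
        """Nearest-rank 95th percentile of the window, or None with too few samples."""
        if len(self.latencies) < self.min_samples:
            return None
        ordered = sorted(self.latencies)
        rank = math.ceil(0.95 * len(ordered))  # 1-based nearest rank
        return ordered[rank - 1]

    def should_trip(self):
        if self.consecutive_errors >= self.max_errors:
            return True
        p95 = self.p95()
        return p95 is not None and p95 > self.p95_limit
```

Keeping one ProviderHealth instance per backend lets the router compare providers on the same terms. Because the percentile is computed over a window, an isolated slow response should not open the circuit by itself; sustained slowness does.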
Before
Before failover, the agent used a single SERP API. During a 45-minute provider outage, the agent fell back to ungrounded LLM responses, producing hallucinated pricing data that a customer noticed and reported.
After
After implementing failover, the circuit breaker detected the outage within 15 seconds (3 consecutive failures). Traffic switched to the secondary provider automatically. The agent continued producing grounded results with no customer-visible impact. When the primary recovered, traffic restored within 2 minutes.
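One way to get the automatic restoration described above is to probe the primary on an interval while the circuit is open, and close the circuit only after several consecutive healthy probes. The sketch below assumes this approach; RecoveryGate, the threshold of three successes, and the probe callable are hypothetical names, not part of any provider's API.

```python
class RecoveryGate:
    """Hypothetical helper: closes an open circuit after N consecutive healthy probes."""

    def __init__(self, probe, required_successes=3):
        self.probe = probe  # callable returning True when the primary looks healthy
        self.required_successes = required_successes
        self.successes = 0
        self.circuit_open = True

    def tick(self):
        """Run one health check; return True once the primary may take traffic again."""
        if not self.circuit_open:
            return True
        try:
            healthy = bool(self.probe())
        except Exception:
            healthy = False  # a probe that raises counts as unhealthy
        # A single unhealthy probe resets the streak.
        self.successes = self.successes + 1 if healthy else 0
        if self.successes >= self.required_successes:
            self.circuit_open = False
        return not self.circuit_open
```

Calling tick() from a scheduler every 30 seconds, for instance, would return traffic to the primary within about 90 seconds of it becoming healthy, while a flapping provider that fails any probe has to start its streak again.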
Who It Is For
DevOps engineers and backend developers building production AI agent pipelines that require high-availability search grounding.
Key Benefits
- Zero-downtime search for production agents
- Automatic detection and routing around provider failures
- Health-check-based restoration removes the need for manual intervention
- Response time monitoring catches degradation before full failure
- Provider-agnostic response normalization
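Provider-agnostic normalization means callers see one result shape regardless of which backend answered. The sketch below shows the idea under stated assumptions: the field names (organic_results, link, items, name, description) are invented for illustration and are not taken from Scavio's or any other provider's documented response schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchResult:
    title: str
    url: str
    snippet: str
    source: str  # which backend produced the result


def normalize_primary(payload: dict) -> list[SearchResult]:
    # Assumed shape: {"organic_results": [{"title", "link", "snippet"}, ...]}
    return [
        SearchResult(r.get("title", ""), r.get("link", ""), r.get("snippet", ""), "primary")
        for r in payload.get("organic_results", [])
    ]


def normalize_fallback(payload: dict) -> list[SearchResult]:
    # Assumed shape: {"items": [{"name", "url", "description"}, ...]}
    return [
        SearchResult(i.get("name", ""), i.get("url", ""), i.get("description", ""), "fallback")
        for i in payload.get("items", [])
    ]
```

With one adapter per provider, the agent code downstream only ever handles SearchResult objects, so a failover event changes the source field and nothing else.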
Python Example
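import os
import time
from collections import deque

import requests

SCAVIO_URL = 'https://api.scavio.dev/api/v1/search'
H = {'x-api-key': os.environ['SCAVIO_API_KEY']}
MAX_FAILURES = 3     # consecutive failures (errors or slow responses) before tripping
SLOW_THRESHOLD = 5   # seconds; a per-request stand-in for the p95 latency check
COOLDOWN = 60        # seconds before retrying primary
FAILURES = deque(maxlen=MAX_FAILURES)
CIRCUIT_OPEN = False
CIRCUIT_OPENED_AT = 0


def _record_failure(kind: str) -> bool:
    """Record a failure and open the circuit at the threshold. Returns True if it tripped."""
    global CIRCUIT_OPEN, CIRCUIT_OPENED_AT
    FAILURES.append(kind)
    if len(FAILURES) >= MAX_FAILURES:
        CIRCUIT_OPEN = True
        CIRCUIT_OPENED_AT = time.time()
        return True
    return False


def search_with_failover(query: str, platform: str = 'google') -> dict:
    global CIRCUIT_OPEN
    if CIRCUIT_OPEN and time.time() - CIRCUIT_OPENED_AT < COOLDOWN:
        return _fallback_search(query, platform)
    if CIRCUIT_OPEN:
        # Cooldown elapsed: close the circuit and give the primary another chance.
        CIRCUIT_OPEN = False
        FAILURES.clear()
    try:
        start = time.time()
        r = requests.post(SCAVIO_URL, headers=H,
                          json={'platform': platform, 'query': query}, timeout=10)
        r.raise_for_status()
        data = r.json()
    except (requests.RequestException, ValueError):
        # Below the threshold the error propagates; at the threshold we fail over.
        if _record_failure('error'):
            return _fallback_search(query, platform)
        raise
    latency = time.time() - start
    if latency > SLOW_THRESHOLD:
        # Slow but successful: return the result, but count it toward tripping.
        _record_failure('slow')
    else:
        FAILURES.clear()
    return data


def _fallback_search(query: str, platform: str) -> dict:
    # Replace with your fallback provider
    return {'source': 'fallback', 'query': query, 'results': []}
JavaScript Example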
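const failures = [];
let circuitOpen = false;
let circuitOpenedAt = 0;
const MAX_FAILURES = 3;       // consecutive failures (errors or slow responses) before tripping
const SLOW_THRESHOLD = 5_000; // ms; a per-request stand-in for the p95 latency check
const COOLDOWN = 60_000;      // ms before retrying primary

// Record a failure and open the circuit at the threshold. Returns true if it tripped.
function recordFailure(kind) {
  failures.push(kind);
  if (failures.length >= MAX_FAILURES) {
    circuitOpen = true;
    circuitOpenedAt = Date.now();
    return true;
  }
  return false;
}

async function searchWithFailover(query, platform = 'google') {
  if (circuitOpen && Date.now() - circuitOpenedAt < COOLDOWN) {
    return fallbackSearch(query, platform);
  }
  if (circuitOpen) {
    // Cooldown elapsed: close the circuit and give the primary another chance.
    circuitOpen = false;
    failures.length = 0;
  }
  let data;
  const start = Date.now();
  try {
    const r = await fetch('https://api.scavio.dev/api/v1/search', {
      method: 'POST',
      headers: { 'x-api-key': process.env.SCAVIO_API_KEY, 'Content-Type': 'application/json' },
      body: JSON.stringify({ platform, query }),
      signal: AbortSignal.timeout(10_000), // match the 10 s timeout in the Python example
    });
    if (!r.ok) throw new Error(`HTTP ${r.status}`);
    data = await r.json(); // await here so JSON parse errors reach the catch block
  } catch (e) {
    // Below the threshold the error propagates; at the threshold we fail over.
    if (recordFailure('error')) return fallbackSearch(query, platform);
    throw e;
  }
  if (Date.now() - start > SLOW_THRESHOLD) {
    // Slow but successful: return the result, but count it toward tripping.
    recordFailure('slow');
  } else {
    failures.length = 0;
  }
  return data;
}

async function fallbackSearch(query, platform) {
  // Replace with your fallback provider
  return { source: 'fallback', query, results: [] };
}
Platforms Used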
Google
Web search with knowledge graph, PAA, and AI overviews
Amazon
Product search with prices, ratings, and reviews
YouTube
Video search with transcripts and metadata
Reddit
Community, posts & threaded comments from any subreddit