Agent Search Quality Monitoring Workflow

Workflow that monitors the quality of search results served to AI agents. Detects degradation, empty results, and stale data before agents produce bad outputs.

Overview

AI agents silently degrade when search quality drops: empty results, stale data, or irrelevant matches. Teams only notice when users complain about bad agent outputs days later. This workflow runs hourly quality checks against a set of benchmark queries and alerts when search quality falls below threshold.

Trigger

Hourly cron, every hour on the hour.

Schedule

Hourly
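The hourly trigger can be expressed as a standard crontab entry. The script path and log location below are illustrative, not part of this workflow's spec:

```shell
# Run the quality check at minute 0 of every hour (paths are examples)
0 * * * * /usr/bin/python3 /opt/monitoring/search_quality_check.py >> /var/log/search_quality.log 2>&1
```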

Workflow Steps

1. Load Benchmark Query Set

Read the set of benchmark queries with expected minimum result counts and known-good URLs that should appear in results.
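One way to store the benchmark set is a JSON file with one entry per query. The file format and the `load_benchmarks` helper below are a sketch, not part of the Scavio API; the field names mirror the `BENCHMARKS` list used in the implementation further down:

```python
import json
import os
import tempfile

# Example benchmark entries: expected minimum result count plus a
# known-good URL that should appear in the results.
SAMPLE = [
    {"query": "python web framework 2026", "min_results": 5, "known_url": "docs.python.org"},
    {"query": "react documentation", "min_results": 5, "known_url": "react.dev"},
]

def load_benchmarks(path: str) -> list[dict]:
    """Load benchmark queries, validating the fields each check needs."""
    with open(path) as f:
        benchmarks = json.load(f)
    for b in benchmarks:
        missing = {"query", "min_results", "known_url"} - b.keys()
        if missing:
            raise ValueError(f"benchmark entry missing fields: {missing}")
    return benchmarks

# Round-trip the sample set through a temp file to demonstrate the format.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(SAMPLE, f)
    path = f.name
benchmarks = load_benchmarks(path)
os.unlink(path)
```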

2. Execute Benchmark Queries

Run each benchmark query via the search API. Record result count, latency, and whether known-good URLs appear.

3. Score Quality Metrics

Calculate quality score: result count vs expected, known-URL hit rate, average latency. Compare against thresholds.
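The per-query scoring rule used in both implementations below can be isolated into a small function, which makes the tiers easier to see than the inline conditional:

```python
def score_check(result_count: int, min_results: int, known_hit: bool) -> float:
    """Score a single benchmark query.

    1.0 -> enough results and the known-good URL appeared
    0.5 -> enough results but the known-good URL was missing
    0.0 -> too few results
    """
    if result_count < min_results:
        return 0.0
    return 1.0 if known_hit else 0.5
```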

4. Alert on Degradation

If quality score drops below threshold, alert the engineering team via Slack with specific failing queries and metrics.

Python Implementation

import os
import time
import requests
from datetime import datetime

API_KEY = os.environ["SCAVIO_API_KEY"]

BENCHMARKS = [
    {"query": "python web framework 2026", "min_results": 5, "known_url": "docs.python.org"},
    {"query": "react documentation", "min_results": 5, "known_url": "react.dev"},
]

def quality_check() -> dict:
    scores = []
    for bench in BENCHMARKS:
        start = time.time()
        try:
            resp = requests.post(
                "https://api.scavio.dev/api/v1/search",
                headers={"x-api-key": API_KEY, "Content-Type": "application/json"},
                json={"query": bench["query"], "country_code": "us"},
                timeout=10,
            )
            resp.raise_for_status()
            data = resp.json()
        except requests.RequestException:
            # A failed or errored request counts as a fully degraded check.
            scores.append({"query": bench["query"], "results": 0, "known_hit": False,
                           "latency": round(time.time() - start, 2), "score": 0.0})
            continue
        latency = time.time() - start
        results = data.get("organic_results", [])
        urls = [r.get("link", "") for r in results]
        known_hit = any(bench["known_url"] in u for u in urls)
        # 1.0 = enough results and known URL found; 0.5 = enough results
        # but known URL missing; 0.0 = too few results.
        if len(results) < bench["min_results"]:
            score = 0.0
        elif known_hit:
            score = 1.0
        else:
            score = 0.5
        scores.append({"query": bench["query"], "results": len(results),
                       "known_hit": known_hit, "latency": round(latency, 2), "score": score})
    avg_score = sum(s["score"] for s in scores) / len(scores)
    return {"timestamp": datetime.now().isoformat(), "avg_score": round(avg_score, 2),
            "checks": scores, "healthy": avg_score >= 0.7}

report = quality_check()
print(f"Quality: {report['avg_score']} ({'OK' if report['healthy'] else 'DEGRADED'})")
for c in report["checks"]:
    print(f"  {c['query']}: {c['results']} results, known_hit={c['known_hit']}, {c['latency']}s")

JavaScript Implementation

const HEADERS = {
  'x-api-key': process.env.SCAVIO_API_KEY,
  'Content-Type': 'application/json',
};

const BENCHMARKS = [
  { query: 'python web framework 2026', minResults: 5, knownUrl: 'docs.python.org' },
  { query: 'react documentation', minResults: 5, knownUrl: 'react.dev' },
];

async function qualityCheck() {
  const scores = [];
  for (const b of BENCHMARKS) {
    const start = Date.now();
    const resp = await fetch('https://api.scavio.dev/api/v1/search', {
      method: 'POST',
      headers: HEADERS,
      body: JSON.stringify({ query: b.query, country_code: 'us' }),
    });
    // Treat a non-2xx response as an empty result set (score 0.0).
    const data = resp.ok ? await resp.json() : {};
    const results = data.organic_results || [];
    const knownHit = results.some((item) => (item.link || '').includes(b.knownUrl));
    // 1.0 = enough results and known URL found; 0.5 = enough results only; 0.0 = too few.
    const score = results.length >= b.minResults ? (knownHit ? 1.0 : 0.5) : 0.0;
    scores.push({ query: b.query, results: results.length, knownHit, latency: Date.now() - start, score });
  }
  const avg = scores.reduce((sum, c) => sum + c.score, 0) / scores.length;
  return { avgScore: avg.toFixed(2), healthy: avg >= 0.7, checks: scores };
}

const report = await qualityCheck();
console.log(`Quality: ${report.avgScore} (${report.healthy ? 'OK' : 'DEGRADED'})`);
for (const c of report.checks) {
  console.log(`  ${c.query}: ${c.results} results, knownHit=${c.knownHit}, ${c.latency}ms`);
}

Platforms Used

Google

Web search with knowledge graph, People Also Ask (PAA), and AI Overviews

Frequently Asked Questions

What problem does this workflow solve?

AI agents silently degrade when search quality drops: empty results, stale data, or irrelevant matches. Teams often only notice when users complain about bad agent outputs days later. This workflow runs hourly quality checks against a set of benchmark queries and alerts when search quality falls below a threshold.

How is this workflow triggered?

It runs on an hourly cron, every hour on the hour.

Which platforms does this workflow use?

It uses the following Scavio platform: Google. Each platform is called via the same unified API endpoint.

Can I try this workflow for free?

Yes. Scavio's free tier includes 250 credits per month with no credit card required. That is enough to test and validate this workflow before scaling it.