Tutorial

How to Verify AI Search Results Programmatically

Build an automated pipeline that cross-checks AI search answers against multiple sources. Detect hallucinations before they reach users.

AI search engines like Perplexity and ChatGPT Search sometimes present fabricated or outdated information as fact. This tutorial builds a verification pipeline that takes an AI-generated answer, extracts its factual claims, and cross-checks each claim against fresh SERP results. The pipeline flags unverified claims and assigns a trust score. Cost: one search per claim at $0.005 each, so a typical answer with 1-3 claims costs $0.005-0.015 to verify.
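The per-answer cost scales linearly with the number of extracted claims. As a quick sanity check on the figures above (assuming the $0.005-per-search price; `verification_cost` is a helper name introduced here for illustration):

```python
COST_PER_SEARCH = 0.005  # per-search price quoted above

def verification_cost(num_claims: int) -> float:
    """One search per extracted claim."""
    return num_claims * COST_PER_SEARCH

print(verification_cost(1))  # single-claim answer
print(verification_cost(3))  # typical three-claim answer
```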

Prerequisites

  • Python 3.9+ installed
  • requests library installed
  • A Scavio API key from scavio.dev
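With those in place, setup is two commands (a sketch assuming pip and a POSIX shell; adjust for your environment):

```shell
# Install the HTTP client used throughout this tutorial
pip install requests

# Make your API key available to the example scripts
export SCAVIO_API_KEY="your-key-here"
```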

Walkthrough

Step 1: Extract factual claims from AI-generated text

Parse an AI answer to identify specific factual claims that can be verified. Focus on numbers, dates, names, and specific assertions.

Python
import re

def extract_claims(text: str) -> list:
    """Extract verifiable claims from AI-generated text."""
    claims = []
    sentences = re.split(r'[.!?]\s+', text)
    for sentence in sentences:
        sentence = sentence.strip()
        if not sentence or len(sentence) < 20:
            continue
        # Claims with numbers/dates are verifiable
        has_number = bool(re.search(r'\d+', sentence))
        # Claims with capitalized words (a rough proxy for proper nouns;
        # note this also matches ordinary sentence-initial capitals)
        has_proper = bool(re.search(r'[A-Z][a-z]+(?:\s[A-Z][a-z]+)*', sentence))
        # Claims with comparison words
        has_comparison = any(w in sentence.lower() for w in
            ['fastest', 'largest', 'best', 'most', 'first', 'only', 'latest'])
        if has_number or has_comparison:
            claims.append({'text': sentence, 'type': 'numeric' if has_number else 'comparative'})
        elif has_proper:
            claims.append({'text': sentence, 'type': 'factual'})
    return claims

# Example AI answer to verify
ai_answer = """Python 3.14 was released in October 2025 with a new JIT compiler.
It is 2x faster than Python 3.12 for numeric workloads.
Guido van Rossum announced the change at PyCon 2025."""

claims = extract_claims(ai_answer)
for c in claims:
    print(f'  [{c["type"]}] {c["text"]}')

Step 2: Cross-check claims against live search results

For each claim, search the web and check if the SERP results support or contradict it.

Python
import requests, os, re

SCAVIO_KEY = os.environ['SCAVIO_API_KEY']

def verify_claim(claim: str) -> dict:
    # Build a verification query from the claim
    query = claim[:100]  # truncate long claims
    resp = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json'},
        json={'query': query, 'country_code': 'us', 'num_results': 5},
        timeout=30)
    results = resp.json().get('organic_results', [])
    if not results:
        return {'claim': claim, 'status': 'unverified', 'confidence': 0, 'sources': []}
    # Check if results support the claim
    claim_lower = claim.lower()
    claim_keywords = set(re.findall(r'\b\w{4,}\b', claim_lower))
    support_count = 0
    sources = []
    for r in results:
        text = (r.get('title', '') + ' ' + r.get('snippet', '')).lower()
        result_words = set(re.findall(r'\b\w{4,}\b', text))
        overlap = len(claim_keywords & result_words) / max(len(claim_keywords), 1)
        if overlap > 0.3:
            support_count += 1
            sources.append({'title': r.get('title', '')[:50], 'url': r.get('link', '')})
    confidence = min(support_count / 3 * 100, 100)
    status = 'verified' if confidence >= 50 else 'disputed' if confidence >= 20 else 'unverified'
    return {'claim': claim[:60], 'status': status,
            'confidence': round(confidence), 'sources': sources[:2]}

Step 3: Run the full verification pipeline and compute trust score

Verify all claims from an AI answer and compute an overall trust score for the response.

Python
import time

def verify_answer(ai_text: str) -> dict:
    claims = extract_claims(ai_text)
    if not claims:
        return {'trust_score': 0, 'claims': [], 'note': 'No verifiable claims found'}
    verified_claims = []
    for claim in claims:
        result = verify_claim(claim['text'])
        result['type'] = claim['type']
        verified_claims.append(result)
        time.sleep(0.3)
    # Compute trust score
    verified = len([c for c in verified_claims if c['status'] == 'verified'])
    disputed = len([c for c in verified_claims if c['status'] == 'disputed'])
    total = len(verified_claims)
    trust_score = round(verified / total * 100) if total > 0 else 0
    cost = total * 0.005
    print(f'Trust Score: {trust_score}/100')
    print(f'Claims: {total} total, {verified} verified, {disputed} disputed')
    print(f'Verification cost: ${cost:.3f}\n')
    for c in verified_claims:
        icon = 'PASS' if c['status'] == 'verified' else 'WARN' if c['status'] == 'disputed' else 'FAIL'
        print(f'  [{icon}] {c["claim"]}')
        if c['sources']:
            print(f'         Source: {c["sources"][0]["title"]}')
    return {'trust_score': trust_score, 'claims': verified_claims, 'cost': cost}

result = verify_answer(ai_answer)
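Once you have a trust score, you need a policy for acting on it before the answer reaches users. A minimal sketch (`gate_answer` and its thresholds are illustrative choices, not part of the Scavio API):

```python
def gate_answer(trust_score: int, disputed: int, threshold: int = 60) -> str:
    """Decide how to present an AI answer given its verification results."""
    if trust_score >= threshold and disputed == 0:
        return 'show'               # every claim checked out
    if trust_score >= threshold:
        return 'show_with_warning'  # mostly verified; flag the disputed claims
    return 'hold_for_review'        # too many unverified claims to show as-is

# A 67/100 answer with one disputed claim, as in the expected output:
print(gate_answer(67, disputed=1))  # → show_with_warning
```

Tune the threshold to your risk tolerance; a support chatbot can afford a lower bar than a medical or financial product.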

Python Example

Python
import requests, os, re, time

SCAVIO_KEY = os.environ['SCAVIO_API_KEY']

def verify(claim):
    resp = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json'},
        json={'query': claim[:100], 'country_code': 'us', 'num_results': 5},
        timeout=30)
    results = resp.json().get('organic_results', [])
    keywords = set(re.findall(r'\b\w{4,}\b', claim.lower()))
    support = sum(1 for r in results
        if len(keywords & set(re.findall(r'\b\w{4,}\b', (r.get('snippet','')).lower()))) > len(keywords)*0.3)
    return 'verified' if support >= 2 else 'unverified'

claims = ['Python 3.14 released October 2025', 'Guido van Rossum at PyCon 2025']
for c in claims:
    print(f'{verify(c)}: {c}')
    time.sleep(0.3)

JavaScript Example

JavaScript
const SCAVIO_KEY = process.env.SCAVIO_API_KEY;

async function verify(claim) {
  const resp = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST',
    headers: { 'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json' },
    body: JSON.stringify({ query: claim.slice(0, 100), country_code: 'us', num_results: 5 })
  });
  const results = (await resp.json()).organic_results || [];
  const keywords = new Set(claim.toLowerCase().match(/\b\w{4,}/g) || []);
  const support = results.filter(r => {
    const words = new Set((r.snippet || '').toLowerCase().match(/\b\w{4,}/g) || []);
    return [...keywords].filter(k => words.has(k)).length > keywords.size * 0.3;
  }).length;
  return support >= 2 ? 'verified' : 'unverified';
}

verify('Python 3.14 released October 2025').then(r => console.log(r));

Expected Output

Text
  [numeric] Python 3.14 was released in October 2025 with a new JIT compiler
  [numeric] It is 2x faster than Python 3.12 for numeric workloads
  [factual] Guido van Rossum announced the change at PyCon 2025

Trust Score: 67/100
Claims: 3 total, 2 verified, 1 disputed
Verification cost: $0.015

  [PASS] Python 3.14 was released in October 2025 with a new JIT comp
         Source: Python 3.14 Release Notes - docs.python.org
  [WARN] It is 2x faster than Python 3.12 for numeric workloads
  [PASS] Guido van Rossum announced the change at PyCon 2025
         Source: PyCon 2025 Keynote - Guido van Rossum

Frequently Asked Questions

How long does this tutorial take?

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

What do I need before starting?

Python 3.9+ with the requests library installed, and a Scavio API key from scavio.dev. The free tier includes 250 credits per month.

Can I complete this tutorial on the free tier?

Yes. The free tier includes 250 credits per month, which is more than enough to complete this tutorial and prototype a working solution.

Does Scavio integrate with AI frameworks?

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt it to your framework of choice.
