Tutorial

How to Extract Google AI Overviews Programmatically at Scale

Monitor Google AI Overviews across hundreds of keywords using the Scavio API. Track which queries trigger AI summaries and what sources Google cites.

Google AI Overviews now appear for a significant share of informational queries, and their presence directly affects organic click-through rates. SEO teams need to know which of their target keywords trigger an AI Overview, what the overview says, and which sources it cites. This tutorial builds a batch monitoring script that checks hundreds of keywords for AI Overview presence, extracts the summary text and cited sources, and prints a coverage report. Running it on a schedule and comparing reports shows how coverage changes over time.

Prerequisites

  • Python 3.10 or higher
  • The requests library (pip install requests)
  • A Scavio API key
  • A keyword list to monitor (CSV or Python list)
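
Every request in this tutorial depends on the API key, so it can help to fail fast when the key is missing. The following is a minimal sketch, assuming the key lives in an environment variable named SCAVIO_API_KEY (the name used in the full Python example in this tutorial). The helper load_api_key is hypothetical and not part of any Scavio SDK.

```python
import os
from typing import Mapping

# Placeholder string used in this tutorial's examples; never a valid key.
PLACEHOLDER = "your_scavio_api_key"

def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the API key from the environment, or raise if it is unset."""
    key = env.get("SCAVIO_API_KEY", "").strip()
    if not key or key == PLACEHOLDER:
        raise RuntimeError("Set the SCAVIO_API_KEY environment variable first.")
    return key
```

Passing the environment as a parameter keeps the helper easy to test without touching the real process environment.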

Walkthrough

Step 1: Load keywords to monitor

Read keywords from a CSV file or define them inline. Each keyword will be checked for AI Overview presence.

Python
import csv

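def load_keywords(path: str) -> list[str]:
    # Read the first column of each row, skipping blank rows.
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        return [row[0].strip() for row in reader if row and row[0].strip()]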

# Or define inline
KEYWORDS = ["what is rag", "how to fine tune llm", "python vs rust 2026"]
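
Keyword lists exported from other tools often contain blank entries and duplicates that differ only in case or spacing. Since each keyword triggers one API request, it may be worth cleaning the list first. The sketch below is one possible approach; clean_keywords is a hypothetical helper, and treating case variants as duplicates is an assumption that may not suit every keyword set.

```python
def clean_keywords(raw: list[str]) -> list[str]:
    """Strip whitespace, drop blanks, and remove case-insensitive duplicates.

    The first spelling seen for each keyword is kept, and order is preserved.
    """
    seen: set[str] = set()
    cleaned: list[str] = []
    for kw in raw:
        # Collapse runs of whitespace so "what  is rag" matches "what is rag".
        normalized = " ".join(kw.split())
        key = normalized.lower()
        if normalized and key not in seen:
            seen.add(key)
            cleaned.append(normalized)
    return cleaned

print(clean_keywords([" What is RAG ", "what is rag", "", "python vs rust 2026"]))
# ['What is RAG', 'python vs rust 2026']
```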

Step 2: Check AI Overview for each keyword

Query each keyword with ai_overview enabled and extract the overview text and sources if present.

Python
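import os
import requests

# Read the key from the environment; the fallback is only a placeholder.
API_KEY = os.environ.get("SCAVIO_API_KEY", "your_scavio_api_key")

def check_ai_overview(keyword: str) -> dict:
    r = requests.post(
        "https://api.scavio.dev/api/v1/search",
        headers={"x-api-key": API_KEY},
        json={"query": keyword, "country_code": "us", "ai_overview": True},
        timeout=30,  # do not let one stalled request hang the whole batch
    )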
    r.raise_for_status()
    data = r.json()
    ai = data.get("ai_overview")
    return {
        "keyword": keyword,
        "has_overview": ai is not None,
        "text": ai.get("text", "") if ai else "",
        "sources": ai.get("sources", []) if ai else [],
    }
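
Separating response parsing from the HTTP call makes the extraction logic testable without sending any requests. The sketch below assumes the response shape used in this step: a top-level ai_overview object with text and sources fields, where each source carries a link. The sample payload is invented for illustration and is not real API output, and parse_overview is a hypothetical helper name.

```python
def parse_overview(keyword: str, data: dict) -> dict:
    """Build the per-keyword record from an already-decoded JSON response."""
    ai = data.get("ai_overview")
    if not isinstance(ai, dict):
        # Missing, null, or unexpectedly typed field: treat as "no overview".
        ai = None
    return {
        "keyword": keyword,
        "has_overview": ai is not None,
        "text": (ai or {}).get("text") or "",
        "sources": (ai or {}).get("sources") or [],
    }

# Invented sample payload, shaped like the fields this tutorial reads.
sample = {
    "ai_overview": {
        "text": "Retrieval-augmented generation combines search with an LLM.",
        "sources": [{"link": "https://example.com/rag"}],
    }
}
print(parse_overview("what is rag", sample)["has_overview"])        # True
print(parse_overview("best vector database", {})["has_overview"])   # False
```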

Step 3: Batch process all keywords

Run the check across all keywords, pausing for one second after every tenth request as simple rate limiting. A failed request is recorded with its error message instead of aborting the batch.

Python
import time

def batch_check(keywords: list[str]) -> list[dict]:
    results = []
    for i, kw in enumerate(keywords):
        try:
            result = check_ai_overview(kw)
            results.append(result)
        except Exception as e:
            results.append({"keyword": kw, "has_overview": None, "error": str(e)})
        if i % 10 == 9:
            time.sleep(1)
    return results
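
The loop above records a failure the first time a request raises. For errors that are likely transient, retrying with exponential backoff is a common alternative. The following is a minimal sketch; with_retries is a hypothetical helper, and the attempt count and delays are assumptions rather than documented Scavio limits.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def with_retries(fn: Callable[[], T], attempts: int = 3, base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on any exception with delays of 1x, 2x, 4x base_delay."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

Inside batch_check, the call could then be written as with_retries(lambda: check_ai_overview(kw)), leaving the existing except branch to record keywords that still fail after the final attempt.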

Step 4: Generate the coverage report

Summarize what percentage of keywords trigger AI Overviews and which domains are most frequently cited.

Python
from collections import Counter
from urllib.parse import urlparse

def report(results: list[dict]) -> None:
    total = len(results)
    with_overview = sum(1 for r in results if r.get("has_overview"))
    # Guard against dividing by zero when the keyword list is empty.
    pct = with_overview / total * 100 if total else 0
    print(f"AI Overview coverage: {with_overview}/{total} ({pct:.0f}%)")
    all_sources = []
    for r in results:
        for s in r.get("sources", []):
            # netloc is empty for missing or malformed links.
            domain = urlparse(s.get("link") or "").netloc or "unknown"
            all_sources.append(domain)
    print("\nTop cited domains:")
    for domain, count in Counter(all_sources).most_common(10):
        print(f"  {domain}: {count} citations")
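
The report above describes a single run. To follow coverage over time, one option is to append a small snapshot after each run and read the history back later. The sketch below assumes the results list produced by batch_check, where failed keywords have has_overview set to None. The JSON Lines file layout and the helper names append_snapshot and coverage_trend are hypothetical choices, not part of the Scavio API.

```python
import json
from datetime import date
from pathlib import Path
from typing import Optional

def append_snapshot(path: str, results: list[dict], run_date: Optional[str] = None) -> dict:
    """Append one summary line for this run to a JSON Lines history file."""
    # Skip keywords whose check failed (has_overview is None).
    checked = [r for r in results if r.get("has_overview") is not None]
    snapshot = {
        "date": run_date or date.today().isoformat(),
        "checked": len(checked),
        "with_overview": sum(1 for r in checked if r["has_overview"]),
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(snapshot) + "\n")
    return snapshot

def coverage_trend(path: str) -> list[tuple[str, float]]:
    """Return (date, coverage percentage) pairs in the order they were recorded."""
    trend = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        snap = json.loads(line)
        pct = 100 * snap["with_overview"] / snap["checked"] if snap["checked"] else 0.0
        trend.append((snap["date"], pct))
    return trend
```

Calling append_snapshot("history.jsonl", results) after each scheduled run builds the history, and coverage_trend("history.jsonl") returns the series for plotting or alerting.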

Python Example

Python
import os
import time
import requests
from collections import Counter
from urllib.parse import urlparse

API_KEY = os.environ.get("SCAVIO_API_KEY", "your_scavio_api_key")
ENDPOINT = "https://api.scavio.dev/api/v1/search"
KEYWORDS = ["what is rag", "how to fine tune llm", "python vs rust 2026", "best vector database"]

def check(kw: str) -> dict:
    r = requests.post(ENDPOINT, headers={"x-api-key": API_KEY},
                      json={"query": kw, "country_code": "us", "ai_overview": True},
                      timeout=30)
    r.raise_for_status()
    ai = r.json().get("ai_overview")
    return {"keyword": kw, "has_overview": ai is not None,
            "text": (ai or {}).get("text", "")[:200],  # keep a short preview only
            "sources": (ai or {}).get("sources", [])}

if __name__ == "__main__":
    results = []
    for kw in KEYWORDS:
        results.append(check(kw))
        time.sleep(0.5)  # simple pacing between requests
    with_ai = sum(1 for r in results if r["has_overview"])
    pct = with_ai / len(results) * 100 if results else 0
    print(f"AI Overview coverage: {with_ai}/{len(results)} ({pct:.0f}%)")
    for r in results:
        status = "YES" if r["has_overview"] else "NO"
        print(f"  [{status}] {r['keyword']}")
    # Count cited domains across all overviews.
    domains = Counter(urlparse(s.get("link") or "").netloc or "unknown"
                      for r in results for s in r["sources"])
    print("\nTop cited domains:")
    for domain, count in domains.most_common(10):
        print(f"  {domain}: {count} citations")

JavaScript Example

JavaScript
const API_KEY = process.env.SCAVIO_API_KEY || "your_scavio_api_key";
const ENDPOINT = "https://api.scavio.dev/api/v1/search";

async function check(kw) {
  const res = await fetch(ENDPOINT, {
    method: "POST",
    headers: { "x-api-key": API_KEY, "Content-Type": "application/json" },
    body: JSON.stringify({ query: kw, country_code: "us", ai_overview: true })
  });
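  // Surface HTTP errors instead of silently reporting "no overview".
  if (!res.ok) throw new Error(`Request failed for "${kw}": HTTP ${res.status}`);
  const data = await res.json();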
  return { keyword: kw, hasOverview: !!data.ai_overview, text: data.ai_overview?.text?.slice(0, 200) || "" };
}

async function main() {
  const keywords = ["what is rag", "how to fine tune llm", "python vs rust 2026"];
  const results = [];
  for (const kw of keywords) {
    results.push(await check(kw));
  }
  const count = results.filter(r => r.hasOverview).length;
  console.log(`AI Overview coverage: ${count}/${results.length}`);
  results.forEach(r => console.log(`  [${r.hasOverview ? "YES" : "NO"}] ${r.keyword}`));
}
main().catch(console.error);

Expected Output

Text
AI Overview coverage: 3/4 (75%)
  [YES] what is rag
  [YES] how to fine tune llm
  [YES] python vs rust 2026
  [NO] best vector database

Top cited domains:
  aws.amazon.com: 3 citations
  docs.python.org: 2 citations
  huggingface.co: 2 citations


Frequently Asked Questions

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

You need Python 3.10 or higher, the requests library, a Scavio API key, and a keyword list to monitor (CSV or Python list). A Scavio API key gives you 500 free credits per month.

Yes, the free tier is enough. It includes 500 credits per month, which is more than enough to complete this tutorial and prototype a working solution.

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt it to your framework of choice.
