Tutorial

How to Build a GEO Content Optimizer

Learn how to build a tool that analyzes what content gets cited in Google AI Overviews and optimizes your pages to appear as sources.

Generative Engine Optimization (GEO) is about getting your content cited as a source in AI-generated answers. Google's AI Overviews are the most measurable GEO surface: they show explicit citations that you can track programmatically. This tutorial builds a tool that analyzes which pages get cited for your target keywords, identifies patterns in cited content, and suggests optimizations for your own pages.

Prerequisites

  • Python 3.8+ installed
  • requests library installed
  • A Scavio API key from scavio.dev
  • A list of target keywords
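The code below reads your API key from the SCAVIO_API_KEY environment variable. Set it once per shell session before running any of the examples (the key value here is a placeholder):

```shell
export SCAVIO_API_KEY="sk-your-key-here"
```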

Walkthrough

Step 1: Analyze AI Overview sources for target keywords

For each keyword, check which domains and page types get cited.

Python
import requests, os
H = {'x-api-key': os.environ['SCAVIO_API_KEY']}

def analyze_citations(keyword: str) -> dict:
    resp = requests.post('https://api.scavio.dev/api/v1/search', headers=H,
        json={'platform': 'google', 'query': keyword}, timeout=10)
    resp.raise_for_status()  # fail fast on bad keys or exhausted quota
    data = resp.json()
    ai_overview = data.get('ai_overview') or {}  # field may be null when no Overview is shown
    sources = ai_overview.get('sources', [])
    return {
        'keyword': keyword,
        'has_ai_overview': bool(ai_overview),
        'source_count': len(sources),
        'sources': [{'domain': s.get('link', '').split('/')[2] if '//' in s.get('link', '') else '',
                     'title': s.get('title', ''), 'url': s.get('link', '')} for s in sources],
    }

Step 2: Identify citation patterns

Aggregate source data across all keywords to find what types of pages get cited most.

Python
import re
from collections import Counter

def find_patterns(analyses: list) -> dict:
    all_domains = []
    all_titles = []
    for a in analyses:
        for s in a['sources']:
            all_domains.append(s['domain'])
            all_titles.append(s['title'].lower())
    domain_counts = Counter(all_domains).most_common(10)
    # Classify content types by title keywords. Word-boundary regexes
    # avoid false hits (e.g. the substring 'vs' inside 'devs').
    patterns = {'comparison': 0, 'review': 0, 'guide': 0, 'list': 0, 'how_to': 0}
    for title in all_titles:
        if re.search(r'\bvs\.?\b|\bcomparison\b', title): patterns['comparison'] += 1
        if re.search(r'\breviews?\b', title): patterns['review'] += 1
        if re.search(r'\bguide\b', title): patterns['guide'] += 1
        if re.search(r'\bbest\b|\btop\b', title): patterns['list'] += 1
        if 'how to' in title: patterns['how_to'] += 1
    return {'top_domains': domain_counts, 'content_patterns': patterns}

Step 3: Generate optimization suggestions

Based on citation patterns, suggest content types and structures that are more likely to be cited.

Python
def suggest_optimizations(patterns: dict, my_domain: str) -> list:
    suggestions = []
    content_patterns = patterns['content_patterns']
    top_type = max(content_patterns, key=content_patterns.get)
    if content_patterns[top_type] > 0:  # skip when no titles matched any pattern
        suggestions.append(f'Most cited content type: {top_type} ({content_patterns[top_type]} citations). Prioritize publishing {top_type} content.')
    top_domains = [d for d, _ in patterns['top_domains']]
    if my_domain in top_domains:
        rank = top_domains.index(my_domain) + 1
        suggestions.append(f'Your domain ranks #{rank} in citation frequency. Focus on keywords where you are not yet cited.')
    else:
        suggestions.append('Your domain does not appear in the top 10 cited domains. Focus on structured content with clear headings, tables, and FAQ sections.')
    suggestions.append('Add FAQ schema markup to improve extraction by AI systems.')
    suggestions.append('Include comparison tables with clear column headers for product/feature comparisons.')
    return suggestions

Step 4: Run the full analysis

Process all keywords and generate a complete GEO optimization report.

Python
KEYWORDS = ['best crm 2026', 'project management tool comparison', 'invoice software for freelancers']

def geo_report(keywords: list, my_domain: str) -> dict:
    analyses = [analyze_citations(kw) for kw in keywords]
    patterns = find_patterns(analyses)
    suggestions = suggest_optimizations(patterns, my_domain)
    return {
        'keywords_analyzed': len(keywords),
        'with_ai_overview': sum(1 for a in analyses if a['has_ai_overview']),
        'patterns': patterns,
        'suggestions': suggestions,
        'details': analyses
    }

report = geo_report(KEYWORDS, 'mydomain.com')
for s in report['suggestions']: print(f'- {s}')

Python Example

Python
import requests, os
H = {'x-api-key': os.environ['SCAVIO_API_KEY']}

def geo_analyze(keywords):
    for kw in keywords:
        data = requests.post('https://api.scavio.dev/api/v1/search', headers=H,
            json={'platform': 'google', 'query': kw}, timeout=10).json()
        sources = (data.get('ai_overview') or {}).get('sources', [])
        domains = [s.get('link', '').split('/')[2] for s in sources if '//' in s.get('link', '')]
        print(f'{kw}: {len(sources)} AI Overview sources: {domains}')

JavaScript Example

JavaScript
async function geoAnalyze(keywords) {
  for (const kw of keywords) {
    const data = await fetch('https://api.scavio.dev/api/v1/search', {
      method: 'POST', headers: {'x-api-key': process.env.SCAVIO_API_KEY, 'Content-Type': 'application/json'},
      body: JSON.stringify({platform: 'google', query: kw})
    }).then(r => r.json());
    const sources = data.ai_overview?.sources || [];
    console.log(`${kw}: ${sources.length} sources`);
  }
}

Expected Output

A GEO analysis report showing citation patterns across target keywords with actionable optimization suggestions.
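The exact numbers depend on your keywords and when you run the analysis; the shape of the report looks roughly like this (all values below are illustrative):

```json
{
  "keywords_analyzed": 3,
  "with_ai_overview": 2,
  "patterns": {
    "top_domains": [["example-reviews.com", 4], ["example-blog.com", 2]],
    "content_patterns": {"comparison": 3, "review": 2, "guide": 1, "list": 4, "how_to": 0}
  },
  "suggestions": [
    "Most cited content type: list (4 citations). Prioritize publishing list content."
  ],
  "details": ["..."]
}
```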

Frequently Asked Questions

How long does this tutorial take?

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

What are the prerequisites?

Python 3.8+, the requests library, a Scavio API key from scavio.dev, and a list of target keywords. A Scavio API key gives you 500 free credits per month.

Can I complete this on the free tier?

Yes. The free tier includes 500 credits per month, which is more than enough to complete this tutorial and prototype a working solution.

Does Scavio work with my framework?

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt it to your framework of choice.
