Tutorial

How to Build an Intent-Based Lead Pipeline with Search Data

Build a lead generation pipeline that finds prospects showing buying intent instead of scraping volume lists. Search-powered intent signals for B2B sales.

Volume-based lead gen fills your CRM with contacts who are not buying. Intent-based lead gen finds people actively searching for solutions like yours. By monitoring search results for buying-intent queries, you can identify companies publishing comparison articles, asking for recommendations on Reddit, or researching competitors. This tutorial builds an intent signal pipeline using the Scavio API at $0.005 per search across Google and Reddit.

Prerequisites

  • Python 3.9+ installed
  • requests library installed
  • A Scavio API key from scavio.dev
  • A clear ICP (ideal customer profile) for targeting

Walkthrough

Step 1: Define intent signal queries

Create search queries that reveal buying intent in your market. These are queries your ideal customers type when they are actively evaluating solutions.

Python
import os, requests, time

SCAVIO_KEY = os.environ['SCAVIO_API_KEY']
URL = 'https://api.scavio.dev/api/v1/search'
H = {'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json'}

def build_intent_queries(product_category: str, competitors: list) -> list:
    """Generate buying-intent search queries."""
    queries = []
    # Direct comparison queries
    for comp in competitors:
        queries.append({'query': f'{comp} alternative 2026', 'intent': 'switching', 'signal': 'high'})
        queries.append({'query': f'{comp} vs', 'intent': 'comparing', 'signal': 'high'})
        queries.append({'query': f'{comp} pricing too expensive', 'intent': 'price_sensitive', 'signal': 'high'})
    # Category queries
    queries.append({'query': f'best {product_category} 2026', 'intent': 'researching', 'signal': 'medium'})
    queries.append({'query': f'{product_category} for enterprise', 'intent': 'enterprise', 'signal': 'medium'})
    # Reddit intent queries
    queries.append({'query': f'site:reddit.com recommend {product_category}', 'intent': 'asking', 'signal': 'high'})
    queries.append({'query': f'site:reddit.com {product_category} frustrating', 'intent': 'pain_point', 'signal': 'high'})
    return queries

queries = build_intent_queries('search API', ['SerpAPI', 'Tavily'])
for q in queries:
    print(f'[{q["signal"]:6s}] {q["intent"]:15s} | {q["query"]}')

Step 2: Search and extract intent signals

Run the intent queries and extract leads from the results. Companies writing comparison articles, posting on Reddit, or publishing reviews are showing active buying intent.

Python
import re

def extract_leads_from_results(results: list, query_info: dict) -> list:
    """Extract potential leads from search results."""
    leads = []
    for r in results:
        title = r.get('title', '')
        snippet = r.get('snippet', '')
        url = r.get('link', '')
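        # Extract domain as potential lead
        domain_match = re.search(r'https?://(?:www\.)?([\w.-]+)', url)
        domain = domain_match.group(1).lower() if domain_match else ''
        if not domain:
            continue  # no usable URL, nothing to attribute the signal to
        # Skip aggregator/directory sites. Match the exact domain or a subdomain,
        # so 'g2.com' does not also exclude an unrelated host such as 'blog2.com'.
        skip_domains = ['reddit.com', 'quora.com', 'wikipedia.org', 'youtube.com',
                       'g2.com', 'capterra.com', 'medium.com']
        if any(domain == d or domain.endswith('.' + d) for d in skip_domains):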
            # Reddit posts are leads for community outreach, not company leads
            if 'reddit.com' in domain:
                leads.append({
                    'type': 'community',
                    'platform': 'reddit',
                    'title': title,
                    'url': url,
                    'intent': query_info['intent'],
                    'signal': query_info['signal'],
                })
            continue
        leads.append({
            'type': 'company',
            'domain': domain,
            'title': title,
            'snippet': snippet[:200],
            'url': url,
            'intent': query_info['intent'],
            'signal': query_info['signal'],
        })
    return leads

def scan_intent_signals(queries: list) -> list:
    all_leads = []
    for q in queries:
        resp = requests.post(URL, headers=H, timeout=30,
            json={'query': q['query'], 'country_code': 'us', 'num_results': 5})
        resp.raise_for_status()  # fail loudly on auth or quota errors
        results = resp.json().get('organic_results', [])
        leads = extract_leads_from_results(results, q)
        all_leads.extend(leads)
        time.sleep(0.3)
    return all_leads

leads = scan_intent_signals(queries[:5])  # Test with first 5 queries
print(f'Found {len(leads)} intent signals')
for l in leads[:5]:
    print(f'  [{l["signal"]}] {l["type"]}: {l.get("domain", l.get("platform", ""))} - {l["intent"]}')
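Step 3 deduplicates by domain, so blog.example.com and docs.example.com would count as two separate companies. One possible refinement, sketched below, collapses a hostname to a company-level domain before the lead is stored. The `company_domain` helper and the `TWO_PART_SUFFIXES` set are hypothetical names, not part of the pipeline above, and the suffix set is a tiny illustrative sample: full coverage of suffixes such as co.uk needs the Public Suffix List.

```python
from urllib.parse import urlparse

# Illustrative sample only (assumption); real coverage needs the Public Suffix List.
TWO_PART_SUFFIXES = {'co.uk', 'com.au', 'co.jp'}

def company_domain(url: str) -> str:
    """Collapse a URL's hostname to its company-level domain.

    Returns '' when the URL has no hostname. IP-address hosts are not handled.
    """
    host = (urlparse(url).hostname or '').lower()
    labels = host.split('.')
    if len(labels) <= 2:
        return host
    # Keep three labels for known two-part suffixes, otherwise two.
    keep = 3 if '.'.join(labels[-2:]) in TWO_PART_SUFFIXES else 2
    return '.'.join(labels[-keep:])

print(company_domain('https://blog.example.com/why-we-switched'))  # example.com
print(company_domain('https://www.example.co.uk/pricing'))         # example.co.uk
print(company_domain('https://startup.io/blog'))                   # startup.io
```

Storing this value in the lead's domain field would let the deduplication in Step 3 merge signals from several subdomains of the same company.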

Step 3: Score and rank leads by intent strength

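Assign each lead a score from its intent type and signal strength: a base score per intent, multiplied by 1.5 for high-signal queries, 1.0 for medium, and 0.5 for low. Leads that are actively switching away from a competitor rank highest, followed by price-sensitive, pain-point, and comparison signals.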

Python
def score_leads(leads: list) -> list:
    """Score leads by intent strength."""
    intent_scores = {
        'switching': 10, 'price_sensitive': 9, 'pain_point': 8,
        'comparing': 7, 'asking': 6, 'researching': 4, 'enterprise': 5,
    }
    signal_multiplier = {'high': 1.5, 'medium': 1.0, 'low': 0.5}
    for lead in leads:
        base = intent_scores.get(lead['intent'], 3)
        mult = signal_multiplier.get(lead['signal'], 1.0)
        lead['score'] = base * mult
    # Sort by score descending
    leads.sort(key=lambda x: x['score'], reverse=True)
    return leads

def deduplicate_leads(leads: list) -> list:
    """Deduplicate by domain, keeping highest-scored entry."""
    seen = {}
    for lead in leads:
        key = lead.get('domain', lead.get('url', ''))
        if key not in seen or lead['score'] > seen[key]['score']:
            seen[key] = lead
    return sorted(seen.values(), key=lambda x: x['score'], reverse=True)

scored = score_leads(leads)
unique = deduplicate_leads(scored)
print(f'Unique leads: {len(unique)}')
print(f'\nTop intent signals:')
for l in unique[:10]:
    domain = l.get('domain', l.get('platform', ''))
    print(f'  Score: {l["score"]:5.1f} | {l["intent"]:15s} | {domain}')
    print(f'         {l.get("title", "")[:60]}')
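Recency is not part of the scoring code above. If your leads carry a publication date, a recency multiplier could be layered on top of the existing score. The sketch below assumes a hypothetical `published` field holding an ISO date string (YYYY-MM-DD); the tutorial does not show the search response including one, so you would need to populate it yourself. The 30-day and 180-day cutoffs and the 1.2 and 0.7 weights are illustrative assumptions, not tuned values.

```python
from datetime import date
from typing import Optional

def recency_multiplier(published: Optional[str], today: date) -> float:
    """Weight a lead by age; undated or unparseable dates stay neutral (1.0)."""
    if not published:
        return 1.0
    try:
        age_days = (today - date.fromisoformat(published)).days
    except ValueError:
        return 1.0  # unparseable date: treat as undated
    if age_days <= 30:
        return 1.2  # assumed boost for content from the last month
    if age_days <= 180:
        return 1.0
    return 0.7      # assumed penalty for content older than about six months

def apply_recency(leads: list, today: date) -> list:
    """Return copies of the leads with adjusted scores, highest first."""
    adjusted = [
        {**lead, 'score': round(
            lead['score'] * recency_multiplier(lead.get('published'), today), 2)}
        for lead in leads
    ]
    return sorted(adjusted, key=lambda l: l['score'], reverse=True)

sample = [
    {'domain': 'a.example', 'score': 15.0, 'published': '2024-01-10'},
    {'domain': 'b.example', 'score': 13.5, 'published': '2026-02-20'},
    {'domain': 'c.example', 'score': 12.0},  # no date: score unchanged
]
for lead in apply_recency(sample, today=date(2026, 3, 1)):
    print(lead['domain'], lead['score'])
# b.example 16.2
# c.example 12.0
# a.example 10.5
```

Passing `today` explicitly keeps the function deterministic, which makes the weighting easy to check against known dates.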

Step 4: Export the lead pipeline output

Export scored leads to CSV for CRM import. Include intent type, signal strength, and source URL for sales context.

Python
import csv

def export_leads(leads: list, searches_run: int, filename: str = 'intent_leads.csv'):
    """Write scored leads to CSV and print a summary.

    searches_run is the number of API calls actually made; it drives the cost line.
    """
    if not leads:
        print('No leads to export')
        return
    fieldnames = ['score', 'type', 'domain', 'intent', 'signal', 'title', 'url']
    # utf-8 keeps non-ASCII characters in page titles intact
    with open(filename, 'w', newline='', encoding='utf-8') as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, extrasaction='ignore')
        writer.writeheader()
        writer.writerows(leads)
    high = sum(1 for l in leads if l['score'] >= 10)
    medium = sum(1 for l in leads if 5 <= l['score'] < 10)
    low = sum(1 for l in leads if l['score'] < 5)
    print(f'Exported {len(leads)} leads to {filename}')
    print(f'  High intent (10+): {high}')
    print(f'  Medium intent (5-9): {medium}')
    print(f'  Low intent (<5): {low}')
    print(f'  API cost: {searches_run} searches x $0.005 = ${searches_run * 0.005:.3f}')

export_leads(unique, searches_run=5)  # Step 2 ran queries[:5]; pass len(queries) for a full run
print(f'\nVolume lead gen: scrapes 1000 emails, 2% conversion = 20 leads')
# Use the same 10+ threshold that export_leads reports as high intent
print(f'Intent pipeline: finds {len([l for l in unique if l["score"] >= 10])} high-intent signals')
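To budget a run before spending credits, the cost can be worked out from the query builder in Step 1, which emits three queries per competitor plus four fixed ones (two category queries and two Reddit queries). A minimal sketch, assuming every query is one billable search at the $0.005 price quoted in this tutorial; `query_count` and `run_cost` are hypothetical helper names:

```python
PRICE_PER_SEARCH = 0.005  # USD per search, as quoted in this tutorial

def query_count(num_competitors: int) -> int:
    """Queries built in Step 1: 3 per competitor plus 4 fixed queries."""
    return 3 * num_competitors + 4

def run_cost(num_competitors: int, runs: int = 1) -> float:
    """Estimated USD cost, assuming one billable search per query."""
    return round(query_count(num_competitors) * runs * PRICE_PER_SEARCH, 3)

print(query_count(2))        # 10
print(run_cost(2))           # 0.05
print(run_cost(2, runs=30))  # 1.5
```

For the two competitors used here, that is 10 searches and $0.05 per full run, or $1.50 for a daily run over 30 days.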

Python Example

Python
import os, requests, csv, time

SCAVIO_KEY = os.environ['SCAVIO_API_KEY']
H = {'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json'}

def find_intent_leads(product, competitors):
    leads = []
    queries = [f'{c} alternative 2026' for c in competitors] + [f'best {product} 2026']
    for q in queries:
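        resp = requests.post('https://api.scavio.dev/api/v1/search', headers=H, timeout=30,
            json={'query': q, 'country_code': 'us', 'num_results': 5})
        resp.raise_for_status()  # stop on auth or quota errors
        for r in resp.json().get('organic_results', []):
            # .get avoids a KeyError if a result lacks a title or link
            leads.append({'query': q, 'title': r.get('title', ''), 'url': r.get('link', '')})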
        time.sleep(0.3)
    print(f'Found {len(leads)} intent signals from {len(queries)} queries')
    print(f'Cost: ${len(queries) * 0.005:.3f}')
    return leads

find_intent_leads('search API', ['SerpAPI', 'Tavily'])

JavaScript Example

JavaScript
const SCAVIO_KEY = process.env.SCAVIO_API_KEY;

async function findIntentLeads(product, competitors) {
  const queries = [...competitors.map(c => `${c} alternative 2026`), `best ${product} 2026`];
  const leads = [];
  for (const q of queries) {
    const resp = await fetch('https://api.scavio.dev/api/v1/search', {
      method: 'POST',
      headers: { 'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json' },
      body: JSON.stringify({ query: q, country_code: 'us', num_results: 5 })
    });
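    // Surface auth or quota errors instead of silently returning no leads
    if (!resp.ok) throw new Error(`Search failed for "${q}": HTTP ${resp.status}`);
    for (const r of ((await resp.json()).organic_results || [])) {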
      leads.push({ query: q, title: r.title, url: r.link });
    }
  }
  console.log(`Found ${leads.length} intent signals from ${queries.length} queries`);
  return leads;
}

findIntentLeads('search API', ['SerpAPI', 'Tavily']).catch(console.error);

Expected Output

Text
Found 18 intent signals
Unique leads: 12

Top intent signals:
  Score:  15.0 | switching        | blog.example.com
         Why We Switched From SerpAPI to a Cheaper Alternati
  Score:  13.5 | price_sensitive   | startup.io
         SerpAPI Pricing: Is It Worth $25/month in 2026?
  Score:  12.0 | pain_point        | reddit
         r/webdev - search API pricing is getting ridiculous

Exported 12 leads to intent_leads.csv
  High intent (10+): 4
  Medium intent (5-9): 5
  Low intent (<5): 3
  API cost: 10 searches x $0.005 = $0.050

Frequently Asked Questions

How long does this tutorial take?

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

What do I need before starting?

You need Python 3.9+ with the requests library installed, a Scavio API key from scavio.dev, and a clear ICP (ideal customer profile) for targeting. A Scavio API key gives you 250 free credits per month.

Can I complete this tutorial on the free tier?

Yes. The free tier includes 250 credits per month, which is more than enough to complete this tutorial and prototype a working solution.

Does Scavio work with my framework?

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt it to your framework of choice.
