Tutorial

How to Build a Sales Prospecting Pipeline with a Search API

Build an automated sales prospecting pipeline that finds, qualifies, and enriches prospects using search API data. Full Python tutorial.

A sales prospecting pipeline automates finding potential customers, qualifying them on signals such as company size and tech stack, and enriching their profiles for outreach. Tools like Apollo ($49+/user/month) or ZoomInfo sell access to pre-built databases, which can be expensive and go stale between updates. With a search API, you can instead query the live web for buying signals such as job postings and funding announcements, so the leads you find are fresh. This tutorial builds the full pipeline on Scavio at $0.005 per search.

Prerequisites

  • Python 3.9+ installed
  • requests library installed
  • A Scavio API key from scavio.dev
  • A target customer profile (industry, size, tech stack)

Walkthrough

Step 1: Define your ideal customer profile searches

Create search queries that target companies matching your ICP. Use buying signals like job postings, technology mentions, and funding announcements.

Python
ICP_SEARCHES = [
    # Companies hiring for roles that signal they need your product
    '"hiring" AND "head of marketing" AND "series A" 2026',
    # Companies using competitor products (potential switchers)
    '"switching from HubSpot" OR "HubSpot alternative" 2026',
    # Companies in target verticals with growth signals
    'SaaS startup raised funding 2026 marketing automation',
    # Job boards as lead source
    'site:linkedin.com/jobs marketing director saas startup',
]

for i, q in enumerate(ICP_SEARCHES, 1):
    print(f'{i}. {q}')
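
If your ICP has several roles and funding stages, writing every query by hand gets repetitive. The sketch below generates the hiring-signal queries from a small profile dictionary. `ICP_PROFILE`, its field names, and `build_hiring_queries` are hypothetical helpers for illustration, not part of the Scavio API. Each generated query costs one search, so keep the lists short.

```python
from itertools import product

# Hypothetical ICP profile; the field names are illustrative only.
ICP_PROFILE = {
    'roles': ['head of marketing', 'marketing director'],
    'stages': ['series A', 'series B'],
    'year': 2026,
}

def build_hiring_queries(profile: dict) -> list:
    """Return one hiring-signal query per (role, stage) combination."""
    return [
        f'"hiring" AND "{role}" AND "{stage}" {profile["year"]}'
        for role, stage in product(profile['roles'], profile['stages'])
    ]

for q in build_hiring_queries(ICP_PROFILE):
    print(q)
```

Two roles and two stages produce four queries, the first of which matches the hand-written hiring query in ICP_SEARCHES.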

Step 2: Search and extract prospect companies

Run each ICP search and extract company names and URLs from the results. Deduplicate across searches.

Python
import os
from urllib.parse import urlparse

import requests

API_KEY = os.environ['SCAVIO_API_KEY']

def search(query: str) -> list:
    resp = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': API_KEY, 'Content-Type': 'application/json'},
        json={'query': query, 'country_code': 'us'},
        timeout=30)
    # Surface auth or quota errors instead of silently returning no results
    resp.raise_for_status()
    return resp.json().get('organic_results', [])

def extract_prospects(queries: list) -> list:
    seen_domains = set()
    prospects = []
    for query in queries:
        for r in search(query):
            # Strip only a leading "www." so the rest of the hostname stays intact
            domain = urlparse(r.get('link', '')).netloc.lower().removeprefix('www.')
            if domain and domain not in seen_domains:
                seen_domains.add(domain)
                prospects.append({
                    'domain': domain,
                    'title': r.get('title', ''),
                    'snippet': r.get('snippet', ''),
                    'source_query': query[:50]
                })
    return prospects

prospects = extract_prospects(ICP_SEARCHES)
print(f'Found {len(prospects)} unique prospects')
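
Funding and hiring searches often return news sites and job boards rather than the companies themselves, so the domain list may need filtering before scoring. The following is a minimal sketch, assuming a hand-maintained blocklist. `NON_PROSPECT_DOMAINS` is an illustrative, non-exhaustive set you would tune for your market.

```python
# Assumed, non-exhaustive list of publisher and aggregator hosts.
NON_PROSPECT_DOMAINS = {
    'linkedin.com', 'techcrunch.com', 'crunchbase.com',
    'reddit.com', 'medium.com', 'youtube.com',
}

def is_prospect_domain(domain: str) -> bool:
    """Return False for a listed host or any subdomain of one."""
    domain = domain.lower()
    return not any(domain == d or domain.endswith('.' + d)
                   for d in NON_PROSPECT_DOMAINS)

def drop_non_prospects(prospects: list) -> list:
    """Keep only prospects whose domain is not on the blocklist."""
    return [p for p in prospects if is_prospect_domain(p['domain'])]

sample = [
    {'domain': 'acmesaas.com'},
    {'domain': 'uk.linkedin.com'},
    {'domain': 'techcrunch.com'},
]
print([p['domain'] for p in drop_non_prospects(sample)])
```

A filtered-out result can still be useful as a signal: the article title may name the company you actually want, which you could look up in a follow-up search.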

Step 3: Qualify prospects with scoring

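Score each prospect by matching keywords in its result title and snippet. Each matched keyword adds its points once, and higher totals indicate a better fit with your ICP. Prospects below the minimum score (10 by default) are dropped.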

Python
SCORING_RULES = {
    'series a': 10, 'series b': 15, 'raised': 10,
    'hiring': 8, 'growing': 5, 'saas': 5,
    'marketing': 5, 'automation': 5, 'startup': 3,
    '2026': 3, 'remote': 2
}

def score_prospect(prospect: dict) -> int:
    text = f'{prospect["title"]} {prospect["snippet"]}'.lower()
    score = sum(points for keyword, points in SCORING_RULES.items() if keyword in text)
    return score

def qualify_prospects(prospects: list, min_score: int = 10) -> list:
    scored = []
    for p in prospects:
        p['score'] = score_prospect(p)
        scored.append(p)
    qualified = sorted([p for p in scored if p['score'] >= min_score],
                       key=lambda x: x['score'], reverse=True)
    print(f'Qualified: {len(qualified)}/{len(prospects)} (score >= {min_score})')
    return qualified

qualified = qualify_prospects(prospects)
for p in qualified[:5]:
    print(f'  [{p["score"]}] {p["domain"]}: {p["title"][:50]}')
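
Plain substring matching can over-count: 'series a' also matches a title like "Series Analytics". The sketch below is one possible variant that matches whole words or phrases and returns the per-keyword breakdown, so you can see why a prospect scored as it did. `explain_score` is a hypothetical helper, and the weights are copied from SCORING_RULES so the block runs on its own.

```python
import re

# Same weights as SCORING_RULES, repeated so this sketch is self-contained.
SCORING_RULES = {
    'series a': 10, 'series b': 15, 'raised': 10,
    'hiring': 8, 'growing': 5, 'saas': 5,
    'marketing': 5, 'automation': 5, 'startup': 3,
    '2026': 3, 'remote': 2
}

def explain_score(text: str) -> dict:
    """Return {keyword: points} for each rule matched as a whole word or phrase."""
    text = text.lower()
    return {kw: pts for kw, pts in SCORING_RULES.items()
            if re.search(rf'\b{re.escape(kw)}\b', text)}

hits = explain_score('Acme SaaS Raises $12M Series A for Marketing Automation')
print(hits, sum(hits.values()))
# {'series a': 10, 'saas': 5, 'marketing': 5, 'automation': 5} 25
```

Note that "Raises" does not match the 'raised' rule in either approach. If inflected forms matter for your signals, list them as separate keywords.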

Step 4: Enrich qualified prospects

For qualified prospects only, run two additional searches each: one for contact, team, and about pages, and one for social profiles. Limiting enrichment to the top-scoring prospects keeps costs low: 20 prospects at two searches each cost $0.20.

Python
import time

def enrich_prospect(prospect: dict) -> dict:
    domain = prospect['domain']
    # Find contact/team page
    contact_results = search(f'site:{domain} contact OR team OR about')
    contact_urls = [r['link'] for r in contact_results[:3] if r.get('link')]
    # Find social profiles
    social_results = search(f'{domain} linkedin OR twitter company')
    social_links = [r['link'] for r in social_results
                    if 'linkedin.com' in r.get('link', '') or 'twitter.com' in r.get('link', '')]
    prospect['contact_pages'] = contact_urls
    prospect['social_profiles'] = social_links[:3]
    return prospect

def enrich_batch(prospects: list) -> list:
    enriched = []
    for i, p in enumerate(prospects):
        enriched.append(enrich_prospect(p))
        if (i + 1) % 5 == 0:
            print(f'  Enriched {i + 1}/{len(prospects)}')
        time.sleep(0.3)
    return enriched

enriched = enrich_batch(qualified[:20])  # top 20 only
print(f'Enriched {len(enriched)} prospects')
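
Because every search is billed, re-running enrichment during development can charge you repeatedly for the same query. One way to avoid that is an in-memory cache around the search function, sketched below. `make_cached_search` is a hypothetical wrapper, and `fake_search` is an offline stand-in for the real search() so the example runs without an API key.

```python
def make_cached_search(search_fn):
    """Wrap a search function so each distinct query is fetched only once."""
    cache = {}

    def cached(query: str) -> list:
        key = query.strip()
        if key not in cache:
            cache[key] = search_fn(query)
        return cache[key]

    return cached

# Offline stand-in for the real search(); it records each call it receives.
calls = []

def fake_search(query: str) -> list:
    calls.append(query)
    return [{'link': 'https://example.com/about'}]

cached_search = make_cached_search(fake_search)
first = cached_search('site:example.com contact OR team OR about')
second = cached_search('site:example.com contact OR team OR about')
print(len(calls))  # 1: the second call was served from the cache
```

In the pipeline you would wrap the real function once, for example `search = make_cached_search(search)`, before calling enrich_batch. The cache lives only for the current process, so persist it to disk if you need it across runs.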

Step 5: Export the pipeline output

Save the qualified, enriched prospects to CSV, sorted by score. The export also prints the total search cost and the cost per lead, so you can compare prospecting runs.

Python
import csv

def export_prospects(prospects: list, output: str = 'prospects.csv') -> None:
    if not prospects:
        return
    fields = ['score', 'domain', 'title', 'snippet', 'contact_pages', 'social_profiles']
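    # Explicit UTF-8 so non-ASCII titles and snippets are written consistently on every platform
    with open(output, 'w', newline='', encoding='utf-8') as f: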
        writer = csv.DictWriter(f, fieldnames=fields, extrasaction='ignore')
        writer.writeheader()
        for p in prospects:
            row = {**p}
            row['contact_pages'] = ' | '.join(p.get('contact_pages', []))
            row['social_profiles'] = ' | '.join(p.get('social_profiles', []))
            writer.writerow(row)
    # Cost calculation:
    # ICP searches: 4 queries = $0.02
    # Enrichment: 20 prospects x 2 searches = $0.20
    total_cost = (len(ICP_SEARCHES) + len(prospects) * 2) * 0.005
    print(f'Saved {len(prospects)} prospects to {output}')
    print(f'Pipeline cost: ${total_cost:.2f}')
    print(f'Cost per qualified lead: ${total_cost / max(len(prospects), 1):.3f}')

export_prospects(enriched)
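
The same arithmetic can be run before a prospecting run to size it against a budget. This is a small sketch using the $0.005 per-search price quoted in this tutorial and the two enrichment searches per prospect from Step 4. `estimate_cost` and `max_enrichable` are hypothetical helper names.

```python
import math

PRICE_PER_SEARCH = 0.005      # USD, the per-search price quoted in this tutorial
SEARCHES_PER_ENRICHMENT = 2   # contact-page search + social-profile search

def estimate_cost(num_icp_queries: int, num_enriched: int) -> float:
    """Total cost in USD of the ICP searches plus enrichment searches."""
    total_searches = num_icp_queries + num_enriched * SEARCHES_PER_ENRICHMENT
    return round(total_searches * PRICE_PER_SEARCH, 3)

def max_enrichable(budget: float, num_icp_queries: int) -> int:
    """How many prospects fit in the budget once ICP searches are paid for."""
    # The small epsilon guards against floating-point error in the division.
    affordable = math.floor(budget / PRICE_PER_SEARCH + 1e-9)
    remaining = affordable - num_icp_queries
    return max(remaining // SEARCHES_PER_ENRICHMENT, 0)

print(estimate_cost(4, 20))      # 0.22
print(max_enrichable(1.00, 4))   # 98
```

With 4 ICP queries and 20 enriched prospects the run uses 44 searches, or $0.22. A $1.00 budget covers 200 searches, which leaves room to enrich 98 prospects after the 4 ICP queries.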

Python Example

Python
import os, requests, time

API_KEY = os.environ['SCAVIO_API_KEY']

def search(query):
    resp = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': API_KEY, 'Content-Type': 'application/json'},
        json={'query': query, 'country_code': 'us'},
        timeout=30)
    # Surface auth or quota errors instead of silently returning no results
    resp.raise_for_status()
    return resp.json().get('organic_results', [])

def prospect(queries):
    prospects = []
    for q in queries:
        for r in search(q):
            prospects.append({'title': r['title'], 'url': r['link'], 'snippet': r.get('snippet', '')})
        time.sleep(0.3)
    return prospects

queries = ['SaaS startup hiring marketing 2026', 'series A marketing automation']
results = prospect(queries)
print(f'Found {len(results)} prospects at ${len(queries) * 0.005:.3f}')
for r in results[:5]:
    print(f'  {r["title"][:60]}')

JavaScript Example

JavaScript
const API_KEY = process.env.SCAVIO_API_KEY;

async function search(query) {
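  const resp = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST',
    headers: { 'x-api-key': API_KEY, 'Content-Type': 'application/json' },
    body: JSON.stringify({ query, country_code: 'us' })
  });
  // Fail loudly on auth or quota errors instead of returning an empty list
  if (!resp.ok) throw new Error(`Search failed: HTTP ${resp.status}`);
  return (await resp.json()).organic_results || [];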
}

async function main() {
  const queries = ['SaaS startup hiring marketing 2026', 'series A marketing automation'];
  const prospects = [];
  for (const q of queries) {
    const results = await search(q);
    results.forEach(r => prospects.push({ title: r.title, url: r.link }));
  }
  console.log(`Found ${prospects.length} prospects`);
  prospects.slice(0, 5).forEach(p => console.log(`  ${p.title.slice(0, 60)}`));
}

main().catch(console.error);

Expected Output

Text
Found 38 unique prospects
Qualified: 15/38 (score >= 10)
  [25] acmesaas.com: Acme SaaS Raises $12M Series A for Marketing...
  [20] growthco.io: GrowthCo Hiring Head of Marketing, Series B...
  [18] betastart.com: BetaStart Switches from HubSpot to...
  Enriched 5/15
  Enriched 10/15
  Enriched 15/15
Enriched 15 prospects
Saved 15 prospects to prospects.csv
Pipeline cost: $0.17
Cost per qualified lead: $0.011

Frequently Asked Questions

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

You need Python 3.9+, the requests library, a Scavio API key from scavio.dev, and a target customer profile (industry, size, tech stack). A Scavio API key gives you 250 free credits per month.

Yes. The free tier includes 250 credits per month, which is more than enough to complete this tutorial and prototype a working solution.

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt to your framework of choice.
