Multi-Source Business Enrichment Playbook

Layer Apollo, Clearbit, and Scavio for business enrichment using a waterfall pattern: try the primary source, fall back to SERP data, then validate. Layering sources this way can lift the match rate from roughly 40% to 75%.

No single data source gives you a complete business profile. Apollo covers contacts and company basics. SERP APIs reveal web presence, tech stack signals, and recent news. Social APIs surface engagement and brand sentiment. The enrichment playbook layers these sources in a waterfall pattern: try the primary source, fall back to secondary, merge results.

The three-layer enrichment stack

  • Layer 1 - Contact data: Apollo, Clearbit, or Hunter for email, title, company info
  • Layer 2 - Web presence: SERP API for Google results, site authority signals, recent mentions
  • Layer 3 - Social signals: TikTok/YouTube/Reddit presence, engagement metrics, sentiment
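
The waterfall over these layers can be sketched as a loop that tries sources in priority order and fills only the fields that are still missing. This is a minimal illustration, not any vendor's API: `waterfall_enrich`, `lookup_primary`, and `lookup_serp` are hypothetical names, and the two lookups are stubs standing in for real contact-database and SERP calls.

```python
from typing import Callable, Optional

# A source takes a domain and returns a dict of fields, or None on a miss.
Source = Callable[[str], Optional[dict]]

def waterfall_enrich(domain: str, sources: list,
                     required: tuple = ("email", "title")) -> dict:
    """Try each (name, lookup) source in order; earlier sources win on conflicts."""
    profile: dict = {"domain": domain, "sources_used": []}
    for name, lookup in sources:
        try:
            data = lookup(domain) or {}
        except Exception:
            continue  # a failing source should not abort the waterfall
        # Keep only non-empty fields that no earlier source has filled.
        new_fields = {k: v for k, v in data.items() if v and k not in profile}
        if new_fields:
            profile.update(new_fields)
            profile["sources_used"].append(name)
        if all(field in profile for field in required):
            break  # required fields are filled, skip remaining sources
    return profile

# Hypothetical stubs; replace with real API calls.
def lookup_primary(domain: str) -> Optional[dict]:
    return {"email": "founder@" + domain, "title": None}

def lookup_serp(domain: str) -> Optional[dict]:
    return {"email": "other@" + domain, "title": "CEO"}

profile = waterfall_enrich(
    "example.com", [("primary", lookup_primary), ("serp", lookup_serp)])
print(profile)
```

Here the primary source supplies the email but has no title, so the fallback fills the title while leaving the primary's email untouched.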

Waterfall enrichment pattern

Python
import os

import requests

SCAVIO_URL = "https://api.scavio.dev/api/v1/search"
SCAVIO_H = {"x-api-key": os.environ["SCAVIO_API_KEY"]}

def scavio_search(query: str, platform: str) -> list:
    """Run one search and return its organic results (empty list if none)."""
    resp = requests.post(SCAVIO_URL, headers=SCAVIO_H,
        json={"query": query, "platform": platform}, timeout=30)
    resp.raise_for_status()  # raise on HTTP errors instead of parsing an error body
    return resp.json().get("organic_results", [])

def enrich_company(company_name: str, domain: str):
    """Web and social enrichment (layers 2 and 3 of the stack).

    Layer 1 contact data comes from Apollo, Clearbit, or Hunter and is merged separately.
    """
    profile = {"company": company_name, "domain": domain}

    # Layer 2: web presence via Google
    google_results = scavio_search(f"{company_name} {domain}", "google")
    profile["web_mentions"] = len(google_results)
    profile["top_pages"] = [r.get("link") for r in google_results[:5]]

    # Layer 3: social presence via Reddit
    reddit_results = scavio_search(f"{company_name} reddit", "reddit")
    profile["reddit_mentions"] = len(reddit_results)
    profile["reddit_sentiment_sample"] = [
        (r.get("snippet") or "")[:100] for r in reddit_results[:3]]

    # Layer 3: social presence via YouTube
    youtube_results = scavio_search(company_name, "youtube")
    profile["youtube_videos"] = len(youtube_results)

    return profile

# 3 API calls per company x $0.005 = $0.015 per enrichment
company = enrich_company("Linear", "linear.app")
print(f"Web mentions: {company['web_mentions']}")
print(f"Reddit mentions: {company['reddit_mentions']}")
print(f"YouTube videos: {company['youtube_videos']}")
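
The validation step of the waterfall is not shown in the code above. One plausible check, sketched here as an assumption rather than a prescribed method, is to confirm that at least one of the top result links is hosted on the company's own domain, which guards against name collisions. `validate_profile` is a hypothetical helper that reads the `domain` and `top_pages` fields of the profile.

```python
from urllib.parse import urlparse

def validate_profile(profile: dict) -> bool:
    """Return True if any top page is on the company's domain or a subdomain of it."""
    domain = profile["domain"].lower()
    for link in profile.get("top_pages", []):
        if not link:
            continue  # skip results that had no link
        host = (urlparse(link).hostname or "").lower()
        if host == domain or host.endswith("." + domain):
            return True
    return False

# Hand-written sample profile, not real API output.
sample = {
    "domain": "linear.app",
    "top_pages": ["https://linear.app/", "https://www.example.com/review", None],
}
print(validate_profile(sample))
```

Profiles that fail the check can be flagged for manual review or re-queried with a more specific search string.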

Batch enrichment for lead lists

Python
import time

def batch_enrich(companies: list, delay: float = 0.5):
    """Enrich a list of companies with rate limiting."""
    enriched = []
    for comp in companies:
        try:
            profile = enrich_company(comp["name"], comp["domain"])
            enriched.append(profile)
        except Exception as e:
            enriched.append({"company": comp["name"], "error": str(e)})
        time.sleep(delay)
    return enriched

# 100 companies x 3 calls each x $0.005 = $1.50 total
leads = [
    {"name": "Linear", "domain": "linear.app"},
    {"name": "Notion", "domain": "notion.so"},
    {"name": "Coda", "domain": "coda.io"},
]
results = batch_enrich(leads)
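
Because `batch_enrich` records failures as dicts with an `error` key, the output can be split into usable profiles and companies worth retrying. A small sketch follows; `split_results` is a hypothetical helper and the sample data is hand-written for illustration.

```python
def split_results(results: list) -> tuple:
    """Separate successful profiles from error records."""
    ok = [r for r in results if "error" not in r]
    failed = [r for r in results if "error" in r]
    return ok, failed

# Hand-written sample in the shape batch_enrich returns.
sample_results = [
    {"company": "Linear", "domain": "linear.app", "web_mentions": 10},
    {"company": "Notion", "error": "timeout"},
]
ok, failed = split_results(sample_results)
print(f"{len(ok)} enriched, {len(failed)} to retry")
```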

Scoring enriched profiles

Once enriched, score each company on the signals that matter for your use case. High web mentions combined with active Reddit discussions suggest an established brand with community engagement. Many YouTube videos point to investment in content marketing. Low mentions across all channels usually mean the company is either early-stage or niche. The right scoring model depends on your goal: sales prioritization, competitive analysis, or partnership targeting.
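
A simple weighted score over the mention counts might look like the sketch below. The weights and the cap of 10 are arbitrary assumptions for illustration, and `score_profile` is a hypothetical helper; adjust both to your goal.

```python
from typing import Optional

# Illustrative weights only; they are not derived from any data.
DEFAULT_WEIGHTS = {"web_mentions": 0.4, "reddit_mentions": 0.4, "youtube_videos": 0.2}

def score_profile(profile: dict, weights: Optional[dict] = None, cap: int = 10) -> float:
    """Return a 0-100 score; each count is capped at `cap` before weighting."""
    weights = weights or DEFAULT_WEIGHTS
    total_weight = sum(weights.values())
    score = 0.0
    for field, weight in weights.items():
        count = min(profile.get(field, 0), cap)  # missing fields count as 0
        score += weight * (count / cap)
    return round(100 * score / total_weight, 1)

# Hand-written sample profiles, not real results.
profiles = [
    {"company": "B", "web_mentions": 2, "reddit_mentions": 0, "youtube_videos": 1},
    {"company": "A", "web_mentions": 10, "reddit_mentions": 5, "youtube_videos": 0},
]
ranked = sorted(profiles, key=score_profile, reverse=True)
for p in ranked:
    print(p["company"], score_profile(p))
```

Capping each count keeps one very noisy channel from dominating the score, and error records from a failed enrichment simply score 0.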

Cost at scale

  • 100 companies/day: 300 API calls at $0.005 each = $1.50/day = $45/mo (30 days)
  • 500 companies/day: 1,500 API calls at $0.005 each = $7.50/day = $225/mo (30 days)
  • Compare: Apollo + Clearbit alone can cost $200-500/mo for similar enrichment volumes
  • The SERP/social layers complement contact databases; they do not replace them
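
The arithmetic behind these figures can be reproduced for other volumes. The sketch below assumes 3 calls per company, $0.005 per call, and a 30-day month, matching the numbers above; `monthly_cost` is a hypothetical helper.

```python
def monthly_cost(companies_per_day: int, calls_per_company: int = 3,
                 price_per_call: float = 0.005, days: int = 30) -> dict:
    """Estimate daily and monthly API spend for a given enrichment volume."""
    calls = companies_per_day * calls_per_company
    daily = calls * price_per_call
    return {
        "calls_per_day": calls,
        "daily_usd": round(daily, 2),
        "monthly_usd": round(daily * days, 2),
    }

for volume in (100, 500):
    print(volume, monthly_cost(volume))
```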