Multi-Source Business Enrichment Playbook
Layer Apollo, Clearbit, and Scavio for business enrichment. Waterfall pattern: try primary, fall back to SERP data, validate. 40% to 75% match rate.
No single data source gives you a complete business profile. Apollo covers contacts and company basics. SERP APIs reveal web presence, tech stack signals, and recent news. Social APIs surface engagement and brand sentiment. The enrichment playbook layers these sources in a waterfall pattern: try the primary source, fall back to secondary, merge results.
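The waterfall described above can be sketched in a few lines. This is a minimal illustration under assumptions, not any vendor's API: `waterfall_enrich`, `primary_contacts`, and `serp_fallback` are hypothetical names, and the two stand-in sources return hard-coded data where real ones would call Apollo or a SERP API. Sources are tried in priority order, and a later source only fills fields that earlier ones left missing or empty.

```python
from typing import Callable, Optional

# Hypothetical source signature: takes a domain and returns a dict of fields,
# or None when the source has no match.
Source = Callable[[str], Optional[dict]]

EMPTY_VALUES = (None, "", [], {})


def waterfall_enrich(domain: str, sources: list[Source]) -> dict:
    """Try each source in priority order and merge what they return.

    Earlier sources win: a later source only fills fields still missing.
    """
    profile: dict = {"domain": domain, "sources_used": []}
    for source in sources:
        try:
            data = source(domain) or {}
        except Exception:
            # A failing source should not abort the rest of the waterfall.
            continue
        filled = False
        for key, value in data.items():
            if value in EMPTY_VALUES:
                continue  # ignore blanks so a later source can fill them
            if key not in profile:
                profile[key] = value
                filled = True
        if filled:
            profile["sources_used"].append(source.__name__)
    return profile


# Stand-in sources for illustration only.
def primary_contacts(domain: str) -> Optional[dict]:
    return {"employee_count": 50, "industry": ""}


def serp_fallback(domain: str) -> Optional[dict]:
    return {"industry": "software", "employee_count": 80,
            "top_pages": ["https://example.com"]}


profile = waterfall_enrich("example.com", [primary_contacts, serp_fallback])
print(profile)
```

Here the primary source keeps its `employee_count`, while the fallback supplies the `industry` the primary left blank. Recording `sources_used` makes it easier to validate later which provider a field came from.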
The three-layer enrichment stack
- Layer 1 - Contact data: Apollo, Clearbit, or Hunter for email, title, company info
- Layer 2 - Web presence: SERP API for Google results, site authority signals, recent mentions
- Layer 3 - Social signals: TikTok/YouTube/Reddit presence, engagement metrics, sentiment
Waterfall enrichment pattern
import os
import requests

SCAVIO_H = {"x-api-key": os.environ["SCAVIO_API_KEY"]}

def enrich_company(company_name: str, domain: str):
    """Multi-source enrichment across the web and social layers."""
    profile = {"company": company_name, "domain": domain}

    # Layer 1 (contact data from Apollo, Clearbit, or Hunter) is not queried here.
    # Layer 2: web presence via Google
    google = requests.post("https://api.scavio.dev/api/v1/search",
                           headers=SCAVIO_H,
                           json={"query": f"{company_name} {domain}", "platform": "google"},
                           timeout=30)
    google.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error body
    google_results = google.json().get("organic_results", [])
    profile["web_mentions"] = len(google_results)
    profile["top_pages"] = [r.get("link") for r in google_results[:5]]

    # Layer 3: social signals via Reddit
    reddit = requests.post("https://api.scavio.dev/api/v1/search",
                           headers=SCAVIO_H,
                           json={"query": f"{company_name} reddit", "platform": "reddit"},
                           timeout=30)
    reddit.raise_for_status()
    reddit_results = reddit.json().get("organic_results", [])
    profile["reddit_mentions"] = len(reddit_results)
    # Keep the first 100 characters of up to three snippets as a sentiment sample
    profile["reddit_sentiment_sample"] = [
        (r.get("snippet") or "")[:100] for r in reddit_results[:3]]

    # Layer 3 (continued): YouTube presence
    youtube = requests.post("https://api.scavio.dev/api/v1/search",
                            headers=SCAVIO_H,
                            json={"query": company_name, "platform": "youtube"},
                            timeout=30)
    youtube.raise_for_status()
    youtube_results = youtube.json().get("organic_results", [])
    profile["youtube_videos"] = len(youtube_results)
    return profile

# 3 API calls per company x $0.005 = $0.015 per enrichment
company = enrich_company("Linear", "linear.app")
print(f"Web mentions: {company['web_mentions']}")
print(f"Reddit mentions: {company['reddit_mentions']}")
print(f"YouTube videos: {company['youtube_videos']}")

Batch enrichment for lead lists
import time

def batch_enrich(companies: list, delay: float = 0.5):
    """Enrich a list of companies with rate limiting."""
    enriched = []
    for comp in companies:
        try:
            profile = enrich_company(comp["name"], comp["domain"])
            enriched.append(profile)
        except Exception as e:
            # Record the failure and keep going so one bad lead does not stop the batch
            enriched.append({"company": comp["name"], "error": str(e)})
        time.sleep(delay)  # pause between companies to stay under rate limits
    return enriched

# 100 companies x 3 calls each x $0.005 = $1.50 total
leads = [
    {"name": "Linear", "domain": "linear.app"},
    {"name": "Notion", "domain": "notion.so"},
    {"name": "Coda", "domain": "coda.io"},
]
results = batch_enrich(leads)

Scoring enriched profiles
Once enriched, score each company on signals that matter for your use case. High web mentions + active Reddit discussions = established brand with community engagement. Many YouTube videos = content marketing investment. Low mentions across all channels = either early-stage or niche. The scoring model depends on your goal: sales prioritization, competitive analysis, or partnership targeting.
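As one possible starting point, the signals above can be folded into a single 0-100 score. The weights and the cap below are hypothetical values chosen for illustration, not recommendations, and `score_profile` is an assumed helper name. It reads the count fields that `enrich_company` writes (`web_mentions`, `reddit_mentions`, `youtube_videos`) and treats a missing field as zero, so profiles that failed enrichment simply rank last.

```python
from typing import Optional

# Hypothetical weights; adjust them for sales prioritization,
# competitive analysis, or partnership targeting.
DEFAULT_WEIGHTS = {
    "web_mentions": 0.4,
    "reddit_mentions": 0.35,
    "youtube_videos": 0.25,
}
CAP = 10  # assumption: counts at or above this count as full signal


def score_profile(profile: dict, weights: Optional[dict] = None) -> float:
    """Return a 0-100 score from the count fields of an enriched profile."""
    weights = weights or DEFAULT_WEIGHTS
    total = 0.0
    for field, weight in weights.items():
        count = profile.get(field, 0) or 0  # missing or None counts as zero
        total += weight * min(count, CAP) / CAP
    return round(100 * total / sum(weights.values()), 1)


examples = [
    {"company": "Early Co", "web_mentions": 2, "reddit_mentions": 0, "youtube_videos": 1},
    {"company": "Established Co", "web_mentions": 10, "reddit_mentions": 8, "youtube_videos": 10},
    {"company": "Failed Co", "error": "timeout"},
]

# Highest-scoring companies first
ranked = sorted(examples, key=score_profile, reverse=True)
for p in ranked:
    print(p["company"], score_profile(p))
```

Capping each count keeps one very large channel from dominating the score. Whether a linear weighted sum is the right model depends on the goal; it is only the simplest option to reason about.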
Cost at scale
- 100 companies/day: 300 API calls = $1.50/day = $45/mo
- 500 companies/day: 1,500 API calls = $7.50/day = $225/mo
- Compare: Apollo + Clearbit alone can cost $200-500/mo for similar enrichment volumes
- The SERP/social layers complement contact databases; they do not replace them
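The arithmetic behind these figures can be reproduced for any volume. The sketch below uses the $0.005 per-call price and three calls per company from the examples above; the 30-day month is an assumption implied by the monthly totals, and `monthly_cost` is a hypothetical helper name.

```python
PRICE_PER_CALL = 0.005   # USD per API call, as used in the examples above
CALLS_PER_COMPANY = 3    # Google + Reddit + YouTube
DAYS_PER_MONTH = 30      # assumption implied by the monthly figures


def monthly_cost(companies_per_day: int) -> dict:
    """Estimate call volume and spend for a given daily enrichment volume."""
    calls = companies_per_day * CALLS_PER_COMPANY
    daily = calls * PRICE_PER_CALL
    return {
        "calls_per_day": calls,
        "daily_usd": round(daily, 2),
        "monthly_usd": round(daily * DAYS_PER_MONTH, 2),
    }


for volume in (100, 500):
    print(volume, monthly_cost(volume))
```

Adding a fourth call per company (for example, a TikTok search) only requires changing `CALLS_PER_COMPANY`, which makes it easy to see how each extra layer affects the budget.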