Workflow

Multi-Source Enrichment Daily

Enrich new CRM leads daily with SERP data. Fall back from Apollo to Google search for missing company and contact details.

Overview

This workflow enriches new CRM leads every morning by first checking primary data sources and then falling back to Scavio SERP searches for any missing fields. For each new lead, it searches Google for the company name to gather firmographic data, website details, and recent news. The SERP-based enrichment catches leads that traditional databases miss because it uses live search data rather than pre-compiled records.
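
The primary-then-fallback order described above can be sketched as follows. This is a minimal illustration, not the workflow's actual code: `REQUIRED_FIELDS`, `missing_fields`, and `enrich_with_fallback` are hypothetical names, and the two lookups are passed in as plain callables so that any provider client can be plugged in.

```python
from typing import Callable

# Hypothetical list of fields to fill; the real list depends on your CRM schema.
REQUIRED_FIELDS = ("domain", "industry", "description")

Lookup = Callable[[str], dict]


def missing_fields(lead: dict) -> list[str]:
    """Return the required fields that are absent or empty on the lead."""
    return [field for field in REQUIRED_FIELDS if not lead.get(field)]


def enrich_with_fallback(lead: dict, primary: Lookup, fallback: Lookup) -> dict:
    """Try the primary source first; call the SERP fallback only if gaps remain."""
    result = dict(lead)
    sources = {}
    for name, lookup in (("primary", primary), ("serp", fallback)):
        gaps = missing_fields(result)
        if not gaps:
            break  # nothing left to fill, so the next source is never called
        found = lookup(result["company"])
        for field in gaps:
            if found.get(field):
                result[field] = found[field]
                sources[field] = name
    result["field_sources"] = sources
    return result


if __name__ == "__main__":
    # Stub lookups standing in for real provider calls.
    def stub_primary(company: str) -> dict:
        return {"industry": "Software"}

    def stub_serp(company: str) -> dict:
        return {"domain": "example.com", "description": "Example company"}

    lead = {"id": "lead_001", "company": "Acme Corp", "domain": None, "industry": None}
    print(enrich_with_fallback(lead, stub_primary, stub_serp))
```

Recording which source filled each field keeps the later merge and review steps simple, and skipping the fallback when nothing is missing avoids spending search credits on complete leads.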

Trigger

Cron schedule (daily at 9:00 AM UTC)

Schedule

Runs daily at 9:00 AM UTC
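
For reference, a daily 9:00 AM run corresponds to the standard five-field cron expression `0 9 * * *`. Classic cron evaluates that expression in the host's local time, so it only means 9:00 AM UTC on a machine whose clock is set to UTC. The sketch below shows one way to compute the next run time in code; `next_run` is a hypothetical helper, not part of any scheduler.

```python
from datetime import datetime, time, timedelta, timezone

CRON_EXPRESSION = "0 9 * * *"  # minute hour day-of-month month day-of-week
RUN_AT = time(hour=9, minute=0, tzinfo=timezone.utc)


def next_run(now: datetime) -> datetime:
    """Return the next 9:00 AM UTC run strictly after `now` (timezone-aware)."""
    now_utc = now.astimezone(timezone.utc)
    candidate = datetime.combine(now_utc.date(), RUN_AT)
    if candidate <= now_utc:
        candidate += timedelta(days=1)
    return candidate


if __name__ == "__main__":
    print(next_run(datetime.now(timezone.utc)).isoformat())
```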

Workflow Steps

1

Pull new leads from CRM

Query the CRM API for leads added in the last 24 hours that have incomplete profiles.

2

Attempt primary enrichment

Try enriching each lead from the primary data provider (e.g., Apollo, Clearbit) for standard fields.

3

SERP fallback for missing data

For leads that still have missing fields, search Google through the Scavio API for the company name to find its website, description, and context.

4

Validate and merge data

Merge enriched data from all sources, validate email formats, and flag low-confidence records.

5

Update CRM records

Push the enriched data back to the CRM and log the enrichment results for reporting.
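
Step 4 (validate and merge) can be sketched roughly as follows. Everything here is an assumption made for illustration: the source names in `SOURCE_PRIORITY`, the loose email pattern, and the rule for what counts as low confidence are all hypothetical and should be replaced by your own criteria.

```python
import re

# Deliberately loose pattern: checks the shape of an address, not deliverability.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Hypothetical trust order: earlier sources win when several supply the same field.
SOURCE_PRIORITY = ("crm", "primary", "serp")


def merge_sources(records: dict[str, dict]) -> dict:
    """Merge per-source dicts, keeping the first non-empty value by priority."""
    merged, provenance = {}, {}
    for source in SOURCE_PRIORITY:
        for field, value in records.get(source, {}).items():
            if value and field not in merged:
                merged[field] = value
                provenance[field] = source
    return {"data": merged, "provenance": provenance}


def validate(merged: dict) -> dict:
    """Attach a list of issues and a low-confidence flag to a merged record."""
    data = merged["data"]
    issues = []
    email = data.get("email")
    if email and not EMAIL_RE.match(email):
        issues.append("invalid_email")
    if not data.get("domain"):
        issues.append("missing_domain")
    # Assumed rule: flag records with issues or a domain backed only by SERP data.
    low_confidence = bool(issues) or merged["provenance"].get("domain") == "serp"
    return {**merged, "issues": issues, "low_confidence": low_confidence}


if __name__ == "__main__":
    record = merge_sources({
        "crm": {"company": "Acme Corp", "domain": None},
        "primary": {"email": "jane@acme"},
        "serp": {"domain": "acme.example"},
    })
    print(validate(record))
```

Keeping a provenance map alongside the merged data makes it possible to report which source supplied each field when the results are logged in step 5.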

Python Implementation

Python
import json
from datetime import datetime, timezone
from pathlib import Path
from urllib.parse import urlparse

import requests

API_KEY = "your_scavio_api_key"

# Simulated new leads from CRM
NEW_LEADS = [
    {"id": "lead_001", "company": "Acme Corp", "domain": None, "industry": None},
    {"id": "lead_002", "company": "TechStart Inc", "domain": None, "industry": None},
    {"id": "lead_003", "company": "DataFlow Systems", "domain": None, "industry": None},
]

def enrich_via_serp(company_name: str) -> dict:
    """Search Google through Scavio and extract basic company details."""
    res = requests.post(
        "https://api.scavio.dev/api/v1/search",
        headers={"x-api-key": API_KEY},
        json={"platform": "google", "query": f"{company_name} company"},
        timeout=15,
    )
    res.raise_for_status()
    data = res.json()

    enriched = {"source": "serp"}
    organic = data.get("organic") or []
    if organic:
        top = organic[0]
        # urlparse copes with links that have no path; a link without a
        # scheme yields an empty host, in which case no domain is recorded
        host = urlparse(top.get("link", "")).netloc
        if host:
            enriched["domain"] = host
        enriched["description"] = top.get("snippet", "")

    # Check knowledge graph for structured company data
    kg = data.get("knowledge_graph") or {}
    if kg:
        enriched["kg_title"] = kg.get("title", "")
        enriched["kg_type"] = kg.get("type", "")
        enriched["kg_description"] = kg.get("description", "")

    return enriched

def run():
    # datetime.utcnow() is deprecated since Python 3.12; use an aware UTC datetime
    date = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    enriched_leads = []

    for lead in NEW_LEADS:
        serp_data = enrich_via_serp(lead["company"])
        # SERP values overwrite lead fields that share the same key
        enriched = {**lead, **serp_data, "enriched_at": date}
        enriched_leads.append(enriched)

    output = {"date": date, "leads_processed": len(NEW_LEADS), "leads": enriched_leads}
    Path(f"enriched_leads_{date}.json").write_text(json.dumps(output, indent=2))

    enriched_count = sum(1 for lead in enriched_leads if lead.get("domain"))
    print(f"Lead enrichment {date}: {enriched_count}/{len(NEW_LEADS)} leads enriched with domain data")
    for lead in enriched_leads:
        # "domain" may be present but None, so fall back with `or`
        print(f"  {lead['company']}: {lead.get('domain') or 'no domain found'}")

if __name__ == "__main__":
    run()
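
In the listing above, a single failed request raises and stops the whole batch. One possible way to keep going and still log results for reporting is sketched below. `enrich_batch` is a hypothetical helper, and the enrichment function is passed in as a parameter so the sketch can be exercised without network access.

```python
import logging
from typing import Callable

logger = logging.getLogger("enrichment")


def enrich_batch(leads: list[dict], enrich: Callable[[str], dict]) -> dict:
    """Enrich every lead, recording failures instead of aborting the batch."""
    enriched, failed = [], []
    for lead in leads:
        try:
            enriched.append({**lead, **enrich(lead["company"])})
        except Exception as exc:  # e.g. an HTTP error or a timeout
            logger.warning("enrichment failed for %s: %s", lead["id"], exc)
            failed.append({"id": lead["id"], "error": str(exc)})
    return {"enriched": enriched, "failed": failed}


if __name__ == "__main__":
    def flaky_lookup(company: str) -> dict:
        # Stub that fails for one company to demonstrate the failure path.
        if company == "TechStart Inc":
            raise RuntimeError("simulated timeout")
        return {"domain": "example.com"}

    demo = [
        {"id": "lead_001", "company": "Acme Corp"},
        {"id": "lead_002", "company": "TechStart Inc"},
    ]
    print(enrich_batch(demo, flaky_lookup))
```

With the names from the listing above, the call would be `enrich_batch(NEW_LEADS, enrich_via_serp)`.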

JavaScript Implementation

JavaScript
const API_KEY = "your_scavio_api_key";
const LEADS = [
  { id: "lead_001", company: "Acme Corp" },
  { id: "lead_002", company: "TechStart Inc" },
];

async function enrichViaSERP(company) {
  const res = await fetch("https://api.scavio.dev/api/v1/search", {
    method: "POST",
    headers: { "x-api-key": API_KEY, "content-type": "application/json" },
    body: JSON.stringify({ platform: "google", query: `${company} company` }),
  });
  if (!res.ok) throw new Error(`scavio ${res.status}`);
  const data = await res.json();
  const top = (data.organic ?? [])[0];
  return {
    domain: top?.link ? new URL(top.link).hostname : null,
    description: top?.snippet ?? "",
    kgTitle: data.knowledge_graph?.title ?? "",
  };
}

// Top-level await requires an ES module (for example a .mjs file); global fetch needs Node 18+
for (const lead of LEADS) {
  const enriched = await enrichViaSERP(lead.company);
  console.log(`${lead.company}: ${enriched.domain ?? "no domain"}`);
}

Platforms Used

Google

Web search with knowledge graph, People Also Ask (PAA), and AI overviews

Frequently Asked Questions

This workflow is triggered by a cron schedule and runs daily at 9:00 AM UTC.

This workflow uses one Scavio platform: Google. Each platform is called via the same unified API endpoint.

Scavio's free tier includes 250 credits per month with no credit card required, which is enough to test and validate this workflow before scaling it.
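
As a rough capacity check, the 250-credit figure can be turned into a per-run lead budget. The sketch below assumes one credit per search, one search per lead, and 30 runs per month. None of these are stated here, so treat them as placeholders and check Scavio's pricing for the real cost per search.

```python
FREE_CREDITS_PER_MONTH = 250  # free-tier allowance stated above
CREDITS_PER_SEARCH = 1        # ASSUMPTION: not confirmed, check Scavio's pricing


def leads_per_run(
    credits: int = FREE_CREDITS_PER_MONTH,
    credits_per_search: int = CREDITS_PER_SEARCH,
    runs_per_month: int = 30,
    searches_per_lead: int = 1,
) -> int:
    """Return how many leads each daily run can enrich within the monthly credits."""
    total_searches = credits // credits_per_search
    return total_searches // (runs_per_month * searches_per_lead)


if __name__ == "__main__":
    print(f"Leads per daily run on the free tier: {leads_per_run()}")
```

Under these assumptions the free tier covers about 8 SERP fallbacks per daily run, which is why calling the fallback only for leads with missing fields matters.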
