How is this workflow triggered?

This workflow is triggered by a cron schedule and runs daily at 9:00 AM UTC.

Which Scavio platforms does this workflow use?

This workflow uses one Scavio platform: Google. Every Scavio platform is called through the same unified API endpoint.
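As a rough sketch of what the unified endpoint means in practice, the helper below builds the request arguments and varies only the `platform` field. The URL, header name, and body fields are taken from the implementation later on this page; the helper name `build_search_request` is hypothetical, and `google` is the only platform value this page confirms.

```python
SCAVIO_SEARCH_URL = "https://api.scavio.dev/api/v1/search"

def build_search_request(api_key: str, platform: str, query: str) -> dict:
    """Return keyword arguments that can be passed as requests.post(**kwargs)."""
    return {
        "url": SCAVIO_SEARCH_URL,
        "headers": {"x-api-key": api_key},
        # Only the "platform" value changes between platforms (assumption
        # based on the "same unified API endpoint" statement above).
        "json": {"platform": platform, "query": query},
        "timeout": 15,
    }

req = build_search_request("your_scavio_api_key", "google", "Acme Corp company")
print(req["json"])
```

Keeping request construction separate from the network call also makes the payload easy to inspect before any credits are spent.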

Can I run this workflow on the free tier?

Yes. Scavio's free tier includes 50 credits on signup with no credit card required. That is enough to test and validate this workflow before scaling it.

Daily Lead Enrichment Pipeline Workflow

Overview

This workflow enriches new CRM leads every morning by first checking primary data sources and then falling back to Scavio SERP searches for any missing fields. For each new lead, it searches Google for the company name to gather firmographic data, website details, and recent news. The SERP-based enrichment catches leads that traditional databases miss because it uses live search data rather than pre-compiled records.

Trigger

Cron schedule (daily at 9:00 AM UTC)

Schedule

Runs daily at 9:00 AM UTC
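On a scheduler whose clock is set to UTC, this schedule corresponds to the standard five-field cron expression `0 9 * * *`. Where no cron daemon is available, a sketch like the following computes how long to wait until the next 9:00 AM UTC run. The function name `seconds_until_next_run` is hypothetical and not part of Scavio.

```python
from datetime import datetime, time, timedelta, timezone

RUN_AT_UTC = time(hour=9, minute=0)  # 9:00 AM UTC, i.e. cron "0 9 * * *"

def seconds_until_next_run(now: datetime) -> float:
    """Seconds from `now` (timezone-aware) until the next 09:00 UTC."""
    now_utc = now.astimezone(timezone.utc)
    next_run = datetime.combine(now_utc.date(), RUN_AT_UTC, tzinfo=timezone.utc)
    if next_run <= now_utc:
        # Today's slot has already passed (or is right now), so use tomorrow's.
        next_run += timedelta(days=1)
    return (next_run - now_utc).total_seconds()

# One hour before the scheduled time
print(seconds_until_next_run(datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc)))  # 3600.0
```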

Workflow Steps

Pull new leads from CRM

Query the CRM API for leads added in the last 24 hours that have incomplete profiles.
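The CRM API itself is not specified here, so the sketch below only illustrates the selection rule, applied to records already fetched. The `created_at` field (an ISO 8601 string with a UTC offset), the `REQUIRED_FIELDS` tuple, and the function name are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

REQUIRED_FIELDS = ("domain", "industry")  # assumed definition of a complete profile

def select_new_incomplete(leads: list[dict], now: datetime) -> list[dict]:
    """Keep leads created in the last 24 hours that have an empty required field."""
    cutoff = now - timedelta(hours=24)
    selected = []
    for lead in leads:
        created = datetime.fromisoformat(lead["created_at"])
        is_new = created >= cutoff
        is_incomplete = any(not lead.get(field) for field in REQUIRED_FIELDS)
        if is_new and is_incomplete:
            selected.append(lead)
    return selected

now = datetime(2024, 5, 2, 9, 0, tzinfo=timezone.utc)
leads = [
    {"id": "lead_001", "company": "Acme Corp", "domain": None, "industry": None,
     "created_at": "2024-05-01T15:00:00+00:00"},
    {"id": "lead_002", "company": "TechStart Inc", "domain": "techstart.example",
     "industry": "Software", "created_at": "2024-05-02T08:00:00+00:00"},
    {"id": "lead_003", "company": "DataFlow Systems", "domain": None, "industry": None,
     "created_at": "2024-04-29T10:00:00+00:00"},
]
print([lead["id"] for lead in select_new_incomplete(leads, now)])  # ['lead_001']
```

In a real CRM the same filter would usually be pushed into the API query so that only matching records are transferred.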

Attempt primary enrichment

Try enriching each lead from the primary data provider (e.g., Apollo, Clearbit) for standard fields.
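A minimal sketch of this step, assuming the provider is wrapped in a callable that takes a company name and returns a dict of fields or `None`. The `STANDARD_FIELDS` set, `apply_primary`, and `fake_provider` are hypothetical; no real Apollo or Clearbit client calls are shown.

```python
from typing import Callable, Optional

STANDARD_FIELDS = ("domain", "industry", "employee_count")  # assumed field set

def apply_primary(
    lead: dict, lookup: Callable[[str], Optional[dict]]
) -> tuple[dict, list[str]]:
    """Fill empty standard fields from the provider; return (lead, still_missing)."""
    result = lookup(lead["company"]) or {}
    updated = dict(lead)  # leave the caller's record untouched
    for field in STANDARD_FIELDS:
        if not updated.get(field) and result.get(field):
            updated[field] = result[field]
    still_missing = [f for f in STANDARD_FIELDS if not updated.get(f)]
    return updated, still_missing

def fake_provider(company: str) -> Optional[dict]:
    # Stand-in for a real provider client; knows partial data for one company.
    records = {"Acme Corp": {"domain": "acme.example", "industry": "Manufacturing"}}
    return records.get(company)

lead, missing = apply_primary(
    {"id": "lead_001", "company": "Acme Corp", "domain": None, "industry": None},
    fake_provider,
)
print(lead["domain"], missing)  # acme.example ['employee_count']
```

Returning the list of fields that are still empty gives the next step a clear signal for whether a fallback search is needed.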

SERP fallback for missing data

For leads that still have missing fields, query Scavio's Google platform for the company name to find the company's website, description, and surrounding context.
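The fallback can be gated so that a search is issued only when something is actually missing. The sketch below assumes the search is wrapped in a callable that returns the parsed response, with the `organic`, `link`, and `snippet` keys used by the implementation on this page. The names `serp_fallback` and `fake_search` are hypothetical.

```python
from typing import Callable
from urllib.parse import urlparse

def serp_fallback(lead: dict, missing: list[str], search: Callable[[str], dict]) -> dict:
    """Fill only the `missing` fields from a SERP response; skip the call if none."""
    if not missing:
        return dict(lead)  # nothing to fill, so no search is issued
    data = search(f"{lead['company']} company")
    organic = data.get("organic") or []
    top = organic[0] if organic else {}
    found = {
        "domain": urlparse(top.get("link", "")).netloc or None,
        "description": top.get("snippet") or None,
    }
    updated = dict(lead)
    for field in missing:
        if found.get(field):
            updated[field] = found[field]
    return updated

calls = []

def fake_search(query: str) -> dict:
    # Stand-in for the real API call; records each query it receives.
    calls.append(query)
    return {"organic": [{"link": "https://www.acme.example/about",
                         "snippet": "Acme Corp makes anvils."}]}

filled = serp_fallback(
    {"company": "Acme Corp", "domain": None, "industry": "Manufacturing"},
    ["domain"],
    fake_search,
)
print(filled["domain"])  # www.acme.example
```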

Validate and merge data

Merge enriched data from all sources, validate email formats, and flag low-confidence records.
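One way to sketch this step is shown below. The precedence order (CRM over primary provider over SERP), the email pattern, and the low-confidence rule are all assumptions chosen for illustration, not rules stated by this workflow. The regex checks format only and says nothing about deliverability.

```python
import re

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # format check only

def merge_and_validate(crm: dict, primary: dict, serp: dict) -> dict:
    """Merge sources with CRM > primary > SERP precedence and flag weak records."""
    merged, sources = {}, {}
    # Iterate from lowest to highest priority so later sources overwrite earlier ones.
    for name, record in (("serp", serp), ("primary", primary), ("crm", crm)):
        for field, value in record.items():
            if value not in (None, ""):
                merged[field] = value
                sources[field] = name
    email = merged.get("email")
    email_valid = bool(email and EMAIL_RE.match(email))
    # Assumed rule: flag the record if the domain is known only from SERP
    # or if an email is present but malformed.
    low_confidence = sources.get("domain") == "serp" or (
        email is not None and not email_valid
    )
    return {**merged, "email_valid": email_valid,
            "low_confidence": low_confidence, "field_sources": sources}

record = merge_and_validate(
    crm={"id": "lead_001", "company": "Acme Corp", "email": "jane@acme", "domain": None},
    primary={"industry": "Manufacturing"},
    serp={"domain": "acme.example", "description": "Acme Corp makes anvils."},
)
print(record["email_valid"], record["low_confidence"])  # False True
```

Recording which source supplied each field makes it possible to audit flagged records later.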

Update CRM records

Push the enriched data back to the CRM and log the enrichment results for reporting.
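As an illustration of this step, the sketch below builds an update containing only the fields that changed and a one-line JSON log entry per lead. The payload shape and the names `build_update` and `log_line` are hypothetical; the actual CRM call depends on the CRM in use and is not shown.

```python
import json
from datetime import datetime, timezone

def build_update(original: dict, enriched: dict) -> dict:
    """Return only the fields whose values changed, keyed by lead id."""
    changed = {k: v for k, v in enriched.items() if original.get(k) != v}
    return {"id": original["id"], "fields": changed}

def log_line(update: dict, when: datetime) -> str:
    """One JSON line per lead, suitable for appending to a log file."""
    return json.dumps({
        "lead_id": update["id"],
        "updated_fields": sorted(update["fields"]),
        "logged_at": when.isoformat(),
    })

original = {"id": "lead_001", "company": "Acme Corp", "domain": None}
enriched = {"id": "lead_001", "company": "Acme Corp",
            "domain": "acme.example", "description": "Acme Corp makes anvils."}
update = build_update(original, enriched)
print(log_line(update, datetime(2024, 5, 2, 9, 0, tzinfo=timezone.utc)))
```

Sending only changed fields avoids overwriting values that someone edited in the CRM after the leads were pulled.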

Python Implementation

Python

import json
from datetime import datetime, timezone
from pathlib import Path
from urllib.parse import urlparse

import requests

API_KEY = "your_scavio_api_key"

# Simulated new leads from CRM
NEW_LEADS = [
    {"id": "lead_001", "company": "Acme Corp", "domain": None, "industry": None},
    {"id": "lead_002", "company": "TechStart Inc", "domain": None, "industry": None},
    {"id": "lead_003", "company": "DataFlow Systems", "domain": None, "industry": None},
]

def enrich_via_serp(company_name: str) -> dict:
    """Search Google through Scavio and extract domain, snippet, and knowledge graph fields."""
    res = requests.post(
        "https://api.scavio.dev/api/v1/search",
        headers={"x-api-key": API_KEY},
        json={"platform": "google", "query": f"{company_name} company"},
        timeout=15,
    )
    res.raise_for_status()
    data = res.json()

    enriched = {"source": "serp"}
    organic = data.get("organic") or []
    if organic:
        top = organic[0]
        # Use the host of the top organic result as the candidate company domain
        domain = urlparse(top.get("link", "")).netloc
        if domain:
            enriched["domain"] = domain
        enriched["description"] = top.get("snippet", "")

    # Check knowledge graph for structured company data
    kg = data.get("knowledge_graph") or {}
    if kg:
        enriched["kg_title"] = kg.get("title", "")
        enriched["kg_type"] = kg.get("type", "")
        enriched["kg_description"] = kg.get("description", "")

    return enriched

def run():
    # datetime.utcnow() is deprecated since Python 3.12; use an aware UTC timestamp
    date = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    enriched_leads = []

    for lead in NEW_LEADS:
        serp_data = enrich_via_serp(lead["company"])
        # SERP values overwrite the lead's empty placeholder fields
        enriched_leads.append({**lead, **serp_data, "enriched_at": date})

    output = {"date": date, "leads_processed": len(NEW_LEADS), "leads": enriched_leads}
    Path(f"enriched_leads_{date}.json").write_text(json.dumps(output, indent=2))

    enriched_count = sum(1 for lead in enriched_leads if lead.get("domain"))
    print(f"Lead enrichment {date}: {enriched_count}/{len(NEW_LEADS)} leads enriched with domain data")
    for lead in enriched_leads:
        # "domain" can be present but None, so fall back with `or`, not a get() default
        print(f"  {lead['company']}: {lead.get('domain') or 'no domain found'}")

if __name__ == "__main__":
    run()
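In the script above, `raise_for_status()` raises on the first failed request, which ends the loop in `run()` before the remaining leads are processed. A possible variation, sketched here with a hypothetical `enrich_all` helper, isolates failures per lead and reports them separately. The enrichment function is passed in as a parameter, so the example runs without network access.

```python
from typing import Callable

def enrich_all(
    leads: list[dict], enrich: Callable[[str], dict]
) -> tuple[list[dict], list[dict]]:
    """Enrich each lead independently; return (enriched, failures)."""
    enriched, failures = [], []
    for lead in leads:
        try:
            enriched.append({**lead, **enrich(lead["company"])})
        except Exception as exc:  # broad on purpose: one bad lead should not stop the batch
            failures.append({"id": lead["id"], "error": str(exc)})
    return enriched, failures

def flaky_enrich(company: str) -> dict:
    # Stand-in for enrich_via_serp that fails for one company.
    if company == "TechStart Inc":
        raise RuntimeError("scavio 500")
    return {"source": "serp", "domain": "example.com"}

ok, failed = enrich_all(
    [
        {"id": "lead_001", "company": "Acme Corp"},
        {"id": "lead_002", "company": "TechStart Inc"},
        {"id": "lead_003", "company": "DataFlow Systems"},
    ],
    flaky_enrich,
)
print(len(ok), failed)
```

Failed leads can then be retried on the next daily run instead of being lost with the rest of the batch.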

JavaScript Implementation

JavaScript

// Requires Node 18+ (global fetch) and an ES module context for top-level await.
const API_KEY = "your_scavio_api_key";
const LEADS = [
  { id: "lead_001", company: "Acme Corp" },
  { id: "lead_002", company: "TechStart Inc" },
];

// Return the hostname of a URL, or null if the URL is missing or malformed.
function hostnameOf(link) {
  try {
    return link ? new URL(link).hostname : null;
  } catch {
    return null;
  }
}

// Search Google through Scavio and extract the top result's domain and snippet.
async function enrichViaSERP(company) {
  const res = await fetch("https://api.scavio.dev/api/v1/search", {
    method: "POST",
    headers: { "x-api-key": API_KEY, "content-type": "application/json" },
    body: JSON.stringify({ platform: "google", query: `${company} company` }),
  });
  if (!res.ok) throw new Error(`scavio ${res.status}`);
  const data = await res.json();
  const top = (data.organic ?? [])[0];
  return {
    domain: hostnameOf(top?.link),
    description: top?.snippet ?? "",
    kgTitle: data.knowledge_graph?.title ?? "",
  };
}

for (const lead of LEADS) {
  const enriched = await enrichViaSERP(lead.company);
  console.log(`${lead.company}: ${enriched.domain ?? "no domain"}`);
}

Platforms Used

Google

Web search with knowledge graph, People Also Ask (PAA), and AI Overviews

