Tutorial

How to Build a B2B Company Discovery Search Layer

Build a search layer that discovers B2B companies matching your ICP using the Scavio API, then enriches each result with its domain and with search snippets that can surface tech stack and funding signals.

Build a B2B company discovery search layer by combining industry-specific queries with structured result parsing to identify companies matching your ideal customer profile. Instead of relying on static databases that go stale within months, this approach uses live search data to discover new companies as they appear online. Scavio's Google search endpoint returns fresh results that you can filter by domain authority, content signals, and tech indicators. This tutorial builds a discovery pipeline that takes ICP criteria and returns enriched company profiles.

Prerequisites

  • Python 3.8+ installed
  • requests library installed
  • A Scavio API key from scavio.dev
  • A defined ICP (ideal customer profile) with industry and size criteria

Walkthrough

Step 1: Define ICP-based search queries

Generate search queries from your ICP criteria that are likely to surface company websites, job posts, and press mentions.

Python
import os, requests

API_KEY = os.environ['SCAVIO_API_KEY']

def generate_discovery_queries(industry: str, signals: list) -> list:
    templates = [
        f'{industry} companies hiring 2026',
        f'{industry} startups series a 2026',
        f'best {industry} tools for enterprise',
        f'{industry} software company reviews',
    ]
    for signal in signals:
        templates.append(f'{industry} {signal}')
    return templates

queries = generate_discovery_queries('martech', ['raised funding', 'product launch', 'api integration'])
print(queries)
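
Signals can overlap with the built-in templates, and every duplicate query costs an extra API call. The following is a minimal sketch of order-preserving de-duplication. `dedupe_queries` is a hypothetical helper introduced here for illustration, not part of the Scavio API.

```python
def dedupe_queries(queries: list) -> list:
    """Drop repeated queries, ignoring case and extra whitespace, keeping first-seen order."""
    seen = set()
    unique = []
    for q in queries:
        # Normalize only for comparison; the original wording is what gets returned.
        key = ' '.join(q.lower().split())
        if key and key not in seen:
            seen.add(key)
            unique.append(q.strip())
    return unique

print(dedupe_queries([
    'martech raised funding',
    'Martech  raised funding',
    'martech product launch',
]))
```

Calling it on the output of `generate_discovery_queries` before searching keeps the query list as short as the ICP allows.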

Step 2: Search and extract company domains

Run each query through Scavio and extract unique company domains from the results.

Python
from urllib.parse import urlparse

def search_companies(query: str) -> list:
    resp = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': API_KEY},
        json={'platform': 'google', 'query': query}, timeout=15)
    resp.raise_for_status()
    results = resp.json().get('organic_results', [])
    domains = set()
    for r in results:
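        domain = urlparse(r.get('link', '')).netloc.lower()
        # Strip only a leading "www." so hosts that merely contain it stay intact
        if domain.startswith('www.'):
            domain = domain[4:]
        if domain and not any(skip in domain for skip in ['google.', 'youtube.', 'linkedin.', 'reddit.', 'wikipedia.']):
            domains.add(domain)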
    return list(domains)

def discover_companies(queries: list) -> list:
    all_domains = set()
    for q in queries:
        all_domains.update(search_companies(q))
    return sorted(all_domains)
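
The substring check in `search_companies` is deliberately loose: `'google.'` also matches an unrelated host such as `notgoogle.com`. If that matters for your ICP, one possible tightening is to compare against whole domains and their subdomains. The sketch below assumes a hypothetical `BLOCKED` set and two hypothetical helpers, none of which come from Scavio.

```python
from urllib.parse import urlparse

# Hypothetical blocklist of whole domains; extend it for your own pipeline.
BLOCKED = {'google.com', 'youtube.com', 'linkedin.com', 'reddit.com', 'wikipedia.org'}

def normalize_domain(url: str) -> str:
    """Return the lowercase hostname without a port or a leading 'www.'."""
    host = (urlparse(url).hostname or '').lower()
    return host[4:] if host.startswith('www.') else host

def is_blocked(domain: str) -> bool:
    """True when the domain is a blocked domain or a subdomain of one."""
    return any(domain == b or domain.endswith('.' + b) for b in BLOCKED)

for link in ['https://www.Example.com:8080/pricing', 'https://en.wikipedia.org/wiki/Martech']:
    d = normalize_domain(link)
    print(d, 'blocked' if is_blocked(d) else 'kept')
```

The trade-off is that exact matching no longer catches country variants such as `google.co.uk` unless you list them explicitly.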

Step 3: Enrich each company with a follow-up search

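For each discovered domain, run a targeted search and join the top three result snippets into a short description. Those snippets can mention what the company does, its funding, and its technology, but the code in this step stores them as raw text rather than parsing out individual fields.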

Python
def enrich_company(domain: str) -> dict:
    resp = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': API_KEY},
        json={'platform': 'google', 'query': f'{domain} company about funding'}, timeout=15)
    resp.raise_for_status()  # surface HTTP errors instead of parsing an error body
    results = resp.json().get('organic_results', [])
    snippets = ' '.join(r.get('snippet', '') for r in results[:3])
    return {
        'domain': domain,
        'description': snippets[:300],
        'result_count': len(results),
    }
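
The description returned by `enrich_company` is free text. If you want structured funding hints, one option is to scan that text with a few regular expressions. The patterns and the `extract_funding_signals` helper below are illustrative assumptions: real snippets vary widely, and a word like "seed" will occasionally match in an unrelated sense.

```python
import re

# Illustrative patterns only; they will miss many phrasings.
ROUND_RE = re.compile(r'\b(pre-seed|seed|series [a-e])\b', re.IGNORECASE)
AMOUNT_RE = re.compile(r'\$\d+(?:\.\d+)?\s?(?:[MBK]\b|million\b|billion\b)', re.IGNORECASE)

def extract_funding_signals(text: str) -> dict:
    """Collect funding-round names and dollar amounts mentioned in a text snippet."""
    rounds = sorted({m.lower() for m in ROUND_RE.findall(text)})
    amounts = AMOUNT_RE.findall(text)
    return {'rounds': rounds, 'amounts': amounts}

sample = 'Acme raised a $12.5M Series A in 2025 after a seed round.'
print(extract_funding_signals(sample))
```

You could merge the returned dict into the company record, for example `company.update(extract_funding_signals(company['description']))`.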

Step 4: Filter and rank by ICP fit

Score each company based on how well its enrichment data matches your ICP criteria, then output a ranked list.

Python
def score_icp_fit(company: dict, keywords: list) -> float:
    text = company.get('description', '').lower()
    matches = sum(1 for kw in keywords if kw.lower() in text)
    return round(matches / max(len(keywords), 1), 2)

def run_discovery(industry: str, signals: list, icp_keywords: list):
    queries = generate_discovery_queries(industry, signals)
    domains = discover_companies(queries)
    companies = [enrich_company(d) for d in domains[:20]]
    for c in companies:
        c['icp_score'] = score_icp_fit(c, icp_keywords)
    ranked = sorted(companies, key=lambda x: x['icp_score'], reverse=True)
    for c in ranked[:10]:
        print(f'{c["domain"]:<30} ICP={c["icp_score"]}')
    return ranked

run_discovery('martech', ['raised funding'], ['api', 'saas', 'enterprise', 'integration'])
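
Because `score_icp_fit` uses substring matching, a short keyword such as `api` also counts as a hit inside words like "rapid". A possible variant, sketched here under the hypothetical name `score_icp_fit_words`, counts a keyword only when it appears as a whole word or phrase. It assumes keywords begin and end with letters or digits, since `\b` does not behave usefully around symbols such as `c++`.

```python
import re

def score_icp_fit_words(company: dict, keywords: list) -> float:
    """Like score_icp_fit, but a keyword counts only as a whole word or phrase."""
    text = company.get('description', '').lower()
    matches = sum(
        1 for kw in keywords
        if re.search(r'\b' + re.escape(kw.lower()) + r'\b', text)
    )
    return round(matches / max(len(keywords), 1), 2)

# 'api' is not counted inside 'rapid'; only 'saas' matches.
print(score_icp_fit_words({'description': 'Rapid SaaS onboarding'}, ['api', 'saas']))  # prints 0.5
```

Swapping it in for `score_icp_fit` inside `run_discovery` requires no other changes, because the signature is the same.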

Python Example

Python
import requests, os
from urllib.parse import urlparse
H = {'x-api-key': os.environ['SCAVIO_API_KEY']}

def discover(industry):
    resp = requests.post('https://api.scavio.dev/api/v1/search', headers=H,
        json={'platform': 'google', 'query': f'{industry} companies hiring 2026'}, timeout=15)
    resp.raise_for_status()
    domains = set()
    for r in resp.json().get('organic_results', []):
        d = urlparse(r.get('link', '')).netloc
        d = d[4:] if d.startswith('www.') else d  # strip only a leading "www."
        if d: domains.add(d)
    return sorted(domains)

for d in discover('martech')[:10]: print(d)

JavaScript Example

JavaScript
const H = {'x-api-key': process.env.SCAVIO_API_KEY, 'Content-Type': 'application/json'};
async function discover(industry) {
  const r = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST', headers: H,
    body: JSON.stringify({platform: 'google', query: `${industry} companies hiring 2026`})
  });
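  if (!r.ok) throw new Error(`Scavio request failed: ${r.status}`);
  const results = (await r.json()).organic_results || [];
  // Skip results without a link, and strip only a leading "www."
  const hosts = results.filter(x => x.link).map(x => new URL(x.link).hostname.replace(/^www\./, ''));
  const domains = [...new Set(hosts)];
  return domains.filter(d => !['google.com', 'linkedin.com'].includes(d));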
}
discover('martech').then(d => d.slice(0, 10).forEach(x => console.log(x)));

Expected Output

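Running the Step 4 script prints up to ten domains with their ICP scores, one per line, and returns the full ranked list of B2B company domains discovered through live search, each scored against your ICP criteria and carrying enrichment data from its follow-up query.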
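
To hand the ranked list to another tool, you can serialize it as JSON. The sketch below uses placeholder records shaped like the dicts that `run_discovery` returns (`domain`, `description`, `result_count`, `icp_score`). The values are made up for illustration and are not real API output, and `to_json` is a hypothetical helper.

```python
import json

# Placeholder records; real values come from run_discovery().
ranked = [
    {'domain': 'example.com', 'description': 'Example description', 'result_count': 10, 'icp_score': 0.75},
    {'domain': 'example.org', 'description': 'Another description', 'result_count': 8, 'icp_score': 0.25},
]

def to_json(companies: list, limit: int = 10) -> str:
    """Serialize the top `limit` companies as pretty-printed JSON."""
    return json.dumps(companies[:limit], indent=2)

print(to_json(ranked))
```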

Frequently Asked Questions

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

Python 3.8+ installed. requests library installed. A Scavio API key from scavio.dev. A defined ICP (ideal customer profile) with industry and size criteria. A Scavio API key gives you 500 free credits per month.

Yes. The free tier includes 500 credits per month, which is more than enough to complete this tutorial and prototype a working solution.

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt to your framework of choice.
