Tutorial

How to Replace ScrapingAnt with a Search API

Swap ScrapingAnt for a structured search API on search-indexed data. Keep ScrapingAnt for non-indexed pages. Step-by-step migration guide.

Replace ScrapingAnt with a search API for any query that targets search-indexed data like rankings, snippets, prices, and reviews. ScrapingAnt excels at rendering JavaScript-heavy pages, but using it to scrape Google results or Amazon listings means paying proxy and render costs for data that a search API returns as structured JSON. Audit your ScrapingAnt usage, identify which calls target search-indexed content, swap those to Scavio, and keep ScrapingAnt only for pages that require full browser rendering. Most teams find 40-60% of ScrapingAnt calls can move to a search API at lower cost with better structure.
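To see what the migration is worth before starting, you can sketch the savings with a back-of-envelope estimate. The per-call prices below are placeholders, not published rates for either service; substitute your actual ScrapingAnt and search-API pricing.

```python
# Back-of-envelope savings estimate. The per-call prices are placeholders --
# plug in your real ScrapingAnt and search-API rates.
def estimate_savings(total_calls: int, migratable_share: float,
                     scraper_cost: float = 0.002,
                     search_api_cost: float = 0.001) -> float:
    """Monthly savings from moving migratable calls to a search API."""
    migrated = total_calls * migratable_share
    return migrated * (scraper_cost - search_api_cost)

# 100k calls/month at the midpoint of the 40-60% migratable range
print(f'${estimate_savings(100_000, 0.5):.2f}/month saved')  # → $50.00/month saved
```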

Prerequisites

  • Python 3.8+ installed
  • requests library installed
  • A Scavio API key from scavio.dev
  • Access to your ScrapingAnt usage logs

Walkthrough

Step 1: Audit ScrapingAnt queries

Categorize your ScrapingAnt calls into search-indexed (can migrate) and rendering-dependent (keep ScrapingAnt) buckets.

Python
import os

API_KEY = os.environ['SCAVIO_API_KEY']

# Categorize your ScrapingAnt usage
SEARCH_INDEXED = [
    'google.com/search',
    'amazon.com/s?',
    'youtube.com/results',
    'walmart.com/search',
    'reddit.com/search',
]

def audit_calls(call_log: list) -> dict:
    migratable, keep = [], []
    for call in call_log:
        url = call.get('url', '')
        if any(pattern in url for pattern in SEARCH_INDEXED):
            migratable.append(call)
        else:
            keep.append(call)
    print(f'Total: {len(call_log)}, Migratable: {len(migratable)}, Keep ScrapingAnt: {len(keep)}')
    return {'migratable': migratable, 'keep': keep}

audit_calls([{'url': 'https://google.com/search?q=test'}, {'url': 'https://app.example.com/dashboard'}])
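In practice the call log comes from a ScrapingAnt usage export rather than an inline list. The snippet below assumes a hypothetical JSONL export with one object per line containing at least a `url` field; adjust the parsing to whatever format your logs actually use.

```python
import json
from pathlib import Path

# Hypothetical export format: one JSON object per line with a 'url' field.
sample = [
    {'url': 'https://google.com/search?q=crm'},
    {'url': 'https://app.example.com/dashboard'},
]
log_path = Path('scrapingant_log.jsonl')
log_path.write_text('\n'.join(json.dumps(c) for c in sample))

SEARCH_INDEXED = ['google.com/search', 'amazon.com/s?', 'youtube.com/results',
                  'walmart.com/search', 'reddit.com/search']
calls = [json.loads(line) for line in log_path.read_text().splitlines()]
migratable = [c for c in calls if any(p in c['url'] for p in SEARCH_INDEXED)]
print(f'{len(migratable)}/{len(calls)} calls can move to a search API')  # → 1/2
```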

Step 2: Identify search-indexed queries

Extract the actual search queries from ScrapingAnt URLs and map them to platform-specific API calls.

Python
from urllib.parse import urlparse, parse_qs
import requests

def extract_search_query(url: str) -> dict:
    parsed = urlparse(url)
    host = parsed.netloc.replace('www.', '')
    params = parse_qs(parsed.query)
    if 'google.com' in host:
        return {'platform': 'google', 'query': params.get('q', [''])[0]}
    elif 'amazon.com' in host:
        return {'platform': 'amazon', 'query': params.get('k', [''])[0]}
    elif 'youtube.com' in host:
        return {'platform': 'youtube', 'query': params.get('search_query', [''])[0]}
    elif 'walmart.com' in host:
        return {'platform': 'walmart', 'query': params.get('q', [''])[0]}
    elif 'reddit.com' in host:
        return {'platform': 'reddit', 'query': params.get('q', [''])[0]}
    # Unknown host: return an empty query so callers can skip it
    return {'platform': 'google', 'query': ''}

print(extract_search_query('https://www.google.com/search?q=best+crm+2026'))
print(extract_search_query('https://www.amazon.com/s?k=wireless+earbuds'))

Step 3: Swap search-indexed calls to API

Replace ScrapingAnt HTTP calls with Scavio API calls for all search-indexed queries.

Python
def scavio_search(platform: str, query: str) -> list:
    resp = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': API_KEY},
        json={'platform': platform, 'query': query}, timeout=10)
    resp.raise_for_status()
    return resp.json().get('organic_results', [])

def migrated_search(url: str) -> list:
    """Drop-in replacement for ScrapingAnt search calls."""
    params = extract_search_query(url)
    if not params['query']:
        return []
    results = scavio_search(params['platform'], params['query'])
    print(f'{params["platform"]}: {len(results)} results for "{params["query"][:40]}"')
    return results

migrated_search('https://www.google.com/search?q=best+crm+2026')
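Network calls to any API fail transiently, so production migrations usually wrap the request in a retry. A minimal sketch of exponential-backoff retries, shown here with a stand-in flaky function rather than a live API call:

```python
import time

def with_retries(fn, attempts: int = 3, base_delay: float = 0.1):
    """Call fn(), retrying on any exception with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)

# Demo: a function that fails twice before succeeding
state = {'calls': 0}
def flaky():
    state['calls'] += 1
    if state['calls'] < 3:
        raise ConnectionError('transient failure')
    return ['result']

print(with_retries(flaky))  # succeeds on the third attempt
```

In the migration itself you would wrap the Scavio call, e.g. `with_retries(lambda: scavio_search(platform, query))`.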

Step 4: Keep ScrapingAnt for rendering-heavy pages

Route non-search traffic through ScrapingAnt while search traffic goes through Scavio. Use a router function.

Python
def smart_fetch(url: str) -> dict:
    """Route search-indexed URLs to Scavio, everything else to ScrapingAnt."""
    parsed = urlparse(url)
    host = parsed.netloc.replace('www.', '')
    search_hosts = ['google.com', 'amazon.com', 'youtube.com', 'walmart.com', 'reddit.com']
    # Compute the path check separately: chaining it with bare `and`/`or`
    # would route any URL containing '/s?' or '/results' to Scavio, even
    # on non-search hosts, because of operator precedence.
    is_search_path = '/search' in parsed.path or '/s?' in url or '/results' in parsed.path
    if any(h in host for h in search_hosts) and is_search_path:
        results = migrated_search(url)
        return {'source': 'scavio', 'results': results}
    else:
        # Keep ScrapingAnt for JS-heavy pages
        # resp = requests.get('https://api.scrapingant.com/v2/general', params={'url': url, ...})
        return {'source': 'scrapingant', 'note': 'JS rendering required'}

print(smart_fetch('https://www.google.com/search?q=test'))
print(smart_fetch('https://app.example.com/dashboard'))
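Before cutting traffic over, it helps to sanity-check the routing rule against known URLs. The predicate below restates the router's decision logic in standalone form so it can be asserted without network calls; the test cases are illustrative, not exhaustive.

```python
from urllib.parse import urlparse

SEARCH_HOSTS = ['google.com', 'amazon.com', 'youtube.com', 'walmart.com', 'reddit.com']

def is_search_url(url: str) -> bool:
    """True if the URL should route to the search API instead of ScrapingAnt."""
    parsed = urlparse(url)
    host = parsed.netloc.replace('www.', '')
    on_search_host = any(h in host for h in SEARCH_HOSTS)
    has_search_path = ('/search' in parsed.path or parsed.path == '/s'
                       or '/results' in parsed.path)
    return on_search_host and has_search_path

cases = {
    'https://www.google.com/search?q=test': True,
    'https://www.amazon.com/s?k=earbuds': True,
    'https://app.example.com/dashboard': False,
    # '/search' on a non-search host must NOT route to the search API
    'https://example.com/search?q=internal': False,
}
for url, expected in cases.items():
    assert is_search_url(url) is expected, url
print('all routing cases pass')
```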

Python Example

Python
import requests, os
H = {'x-api-key': os.environ['SCAVIO_API_KEY']}

def search(platform, query):
    resp = requests.post('https://api.scavio.dev/api/v1/search', headers=H,
        json={'platform': platform, 'query': query}, timeout=10)
    resp.raise_for_status()
    return resp.json().get('organic_results', [])

# Replaces: ScrapingAnt call to google.com/search?q=...
results = search('google', 'best crm 2026')
print(f'{len(results)} structured results')

JavaScript Example

JavaScript
const H = {'x-api-key': process.env.SCAVIO_API_KEY, 'Content-Type': 'application/json'};
async function search(platform, query) {
  const r = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST', headers: H, body: JSON.stringify({platform, query})
  });
  if (!r.ok) throw new Error(`Scavio API error: ${r.status}`);
  return (await r.json()).organic_results || [];
}
// Replaces: ScrapingAnt call to google.com/search?q=...
search('google', 'best crm 2026').then(r => console.log(r.length + ' results'));

Expected Output

A hybrid pipeline where search-indexed queries route to Scavio for structured JSON and rendering-heavy pages stay on ScrapingAnt, reducing cost and improving data quality.

Frequently Asked Questions

How long does this tutorial take?

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

What are the prerequisites?

Python 3.8+ installed. requests library installed. A Scavio API key from scavio.dev. Access to your ScrapingAnt usage logs. A Scavio API key gives you 250 free credits per month.

Can I complete this on the free tier?

Yes. The free tier includes 250 credits per month, which is more than enough to complete this tutorial and prototype a working solution.

Does Scavio integrate with frameworks like LangChain?

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt to your framework of choice.
