How long does this tutorial on migrating n8n scraping nodes to an API take?

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

What do I need before starting?

You need a running n8n instance, a Scavio API key from scavio.dev (signup includes 50 free credits), existing n8n workflows with HTTP scraping nodes, and basic n8n workflow knowledge.

Can I run this tutorial with the free tier?

Yes. The free tier includes 50 credits on signup, which is more than enough to complete this tutorial and prototype a working solution.

What frameworks does this work with?

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt it to your framework of choice.

Migrate n8n Scraping Nodes to API (2026)

n8n HTTP Request nodes that scrape websites break often because of HTML changes, CAPTCHAs, and rate limits. Replacing them with structured API calls returns clean JSON, removes the dependence on page layout, and eliminates proxy costs. This tutorial migrates common n8n scraping patterns to API calls, node by node.

Prerequisites

n8n instance running
A Scavio API key from scavio.dev
Existing n8n workflows with HTTP scraping nodes
Basic n8n workflow knowledge

Walkthrough

Step 1: Identify scraping nodes to replace

Export your n8n workflow and find HTTP Request nodes that scrape websites.

Python

import json, os, requests

API_KEY = os.environ['SCAVIO_API_KEY']
SH = {'x-api-key': API_KEY, 'Content-Type': 'application/json'}

# Analyze an n8n workflow export for scraping nodes
def find_scraping_nodes(workflow_json):
    nodes = workflow_json.get('nodes', [])
    scraping_nodes = []
    for node in nodes:
        if node.get('type') == 'n8n-nodes-base.httpRequest':
            url = node.get('parameters', {}).get('url', '')
            if any(site in url for site in ['google.com', 'amazon.com', 'reddit.com', 'bing.com']):
                scraping_nodes.append({
                    'name': node.get('name', 'unnamed'),
                    'url': url,
                    'type': 'replaceable'
                })
    return scraping_nodes

# Simulated workflow analysis
sample = {'nodes': [
    {'type': 'n8n-nodes-base.httpRequest', 'name': 'Scrape Google', 'parameters': {'url': 'https://google.com/search?q=test'}},
    {'type': 'n8n-nodes-base.httpRequest', 'name': 'Scrape Amazon', 'parameters': {'url': 'https://amazon.com/s?k=test'}},
    {'type': 'n8n-nodes-base.httpRequest', 'name': 'Internal API', 'parameters': {'url': 'https://api.mycompany.com/data'}},
]}

scrapers = find_scraping_nodes(sample)
print(f'Found {len(scrapers)} scraping nodes to replace:')
for s in scrapers:
    print(f'  {s["name"]}: {s["url"][:50]}')
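The substring check above also flags any URL that merely mentions one of the sites, for example in a query parameter. A minimal sketch of a stricter check that compares the parsed hostname is shown here. The names `SCRAPED_HOSTS`, `is_scraping_url`, and `load_workflow`, and the file name in the comment, are hypothetical and not part of the original tutorial.

```python
import json
from urllib.parse import urlparse

# Hypothetical list: the same sites the tutorial's check looks for
SCRAPED_HOSTS = ('google.com', 'amazon.com', 'reddit.com', 'bing.com')

def is_scraping_url(url):
    """Return True when the URL's hostname is a scraped site or one of its subdomains."""
    host = (urlparse(url).hostname or '').lower()
    return any(host == site or host.endswith('.' + site) for site in SCRAPED_HOSTS)

def load_workflow(path):
    """Read a workflow file exported from n8n as JSON."""
    with open(path, encoding='utf-8') as f:
        return json.load(f)

# workflow = load_workflow('my_workflow.json')  # hypothetical file name

urls = [
    'https://www.google.com/search?q=test',           # real scraping target
    'https://api.mycompany.com/data?ref=google.com',  # only mentions google.com
    '={{ $json.url }}',                               # computed at run time, no hostname to inspect
]
for u in urls:
    print(is_scraping_url(u), u)
```

Only the first URL is flagged. URLs that are computed at run time cannot be classified this way and still need a manual review.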

Step 2: Create replacement API node configuration

Generate n8n HTTP Request node configs that use the search API instead.

Python

def generate_replacement_node(scraping_node):
    """Generate n8n node config for API replacement."""
    url = scraping_node['url']
    name = scraping_node['name']
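    # Determine platform from URL. Google is the API default, so no platform is set for it.
    # Note: bing.com URLs flagged in Step 1 have no mapping here and also fall back to the default.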
    platform = None
    if 'google.com' in url: platform = None  # default is Google
    elif 'amazon.com' in url: platform = 'amazon'
    elif 'reddit.com' in url: platform = 'reddit'
    body = {'query': '{{ $json.query }}', 'country_code': 'us'}
    if platform:
        body['platform'] = platform
    replacement = {
        'name': f'{name} (API)',
        'type': 'n8n-nodes-base.httpRequest',
        'parameters': {
            'method': 'POST',
            'url': 'https://api.scavio.dev/api/v1/search',
            'headers': {
                'x-api-key': '{{ $env.SCAVIO_API_KEY }}',
                'Content-Type': 'application/json'
            },
            'body': json.dumps(body),
            'responseFormat': 'json'
        }
    }
    return replacement

print('=== Replacement Nodes ===')
for s in scrapers:
    replacement = generate_replacement_node(s)
    print(f'\n{s["name"]} -> {replacement["name"]}')
    print(f'  URL: {replacement["parameters"]["url"]}')
    print(f'  Method: POST (was GET)')
    print(f'  Response: Clean JSON (was raw HTML)')
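Because the replacement node gets a new name, anything in the exported workflow that refers to the old name has to be updated too. The sketch below assumes the export keeps a `connections` object keyed by source node name, with each link naming its target in a `node` field. Check this against your own export before relying on it. The helper `swap_node` and the sample workflow are hypothetical.

```python
import copy

def swap_node(workflow, old_name, replacement):
    """Return a copy of the workflow with one node swapped and its connections re-pointed."""
    wf = copy.deepcopy(workflow)
    new_name = replacement['name']
    for i, node in enumerate(wf.get('nodes', [])):
        if node.get('name') == old_name:
            merged = dict(node)         # keep fields such as id and position
            merged.update(replacement)  # overwrite name, type, and parameters
            wf['nodes'][i] = merged
    conns = wf.get('connections', {})
    # Outgoing connections are assumed to be keyed by the source node's name
    if old_name in conns:
        conns[new_name] = conns.pop(old_name)
    # Incoming connections are assumed to reference the target node by name
    for outputs in conns.values():
        for branches in outputs.values():
            for branch in branches:
                for link in branch or []:
                    if link.get('node') == old_name:
                        link['node'] = new_name
    return wf

# Illustrative three-node workflow, not taken from a real export
workflow = {
    'nodes': [
        {'name': 'Trigger', 'type': 'n8n-nodes-base.manualTrigger', 'position': [0, 0]},
        {'name': 'Scrape Google', 'type': 'n8n-nodes-base.httpRequest', 'position': [200, 0],
         'parameters': {'url': 'https://google.com/search?q=test'}},
        {'name': 'Save', 'type': 'n8n-nodes-base.set', 'position': [400, 0]},
    ],
    'connections': {
        'Trigger': {'main': [[{'node': 'Scrape Google', 'type': 'main', 'index': 0}]]},
        'Scrape Google': {'main': [[{'node': 'Save', 'type': 'main', 'index': 0}]]},
    },
}
new_node = {'name': 'Scrape Google (API)', 'type': 'n8n-nodes-base.httpRequest',
            'parameters': {'method': 'POST', 'url': 'https://api.scavio.dev/api/v1/search'}}

migrated = swap_node(workflow, 'Scrape Google', new_node)
print(sorted(migrated['connections']))
print(migrated['connections']['Trigger']['main'][0][0]['node'])
```

The function works on a deep copy, so the original export stays untouched and can be re-imported if the migrated version misbehaves.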

Step 3: Test replacement and compare output

Run the replacement API calls and inspect the results to confirm they contain the data your old scraping nodes extracted.

Python

def compare_outputs(query, platform=None):
    """Compare API output quality for the replacement."""
    body = {'query': query, 'country_code': 'us'}
    if platform:
        body['platform'] = platform
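    resp = requests.post('https://api.scavio.dev/api/v1/search',
        headers=SH, json=body, timeout=30)
    resp.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error body
    data = resp.json()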
    results = data.get('organic_results', [])
    print(f'\nQuery: "{query}" (platform: {platform or "google"})')
    print(f'  Results: {len(results)}')
    print(f'  Fields per result: {list(results[0].keys()) if results else "N/A"}')
    if results:
        print(f'  Sample: {results[0].get("title", "")[:50]}')
    print(f'  Format: Structured JSON (no HTML parsing needed)')
    print(f'  Cost: $0.005 per query')
    print(f'  Reliability: No CAPTCHAs, no proxy needed, no HTML changes')

compare_outputs('wireless earbuds review')
compare_outputs('wireless earbuds', platform='amazon')
compare_outputs('wireless earbuds recommendation', platform='reddit')

print(f'\n=== Migration Summary ===')
print(f'  Nodes to replace: {len(scrapers)}')
print(f'  Time to migrate: ~10 minutes per node')
print(f'  Monthly savings: proxy costs + maintenance time')
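Printing the field names is a manual check. A minimal sketch of an automated version is shown here: it counts how many results lack each field your downstream nodes depend on. `REQUIRED_FIELDS` and `missing_fields` are hypothetical names, and the sample results are made up for illustration, using the field names listed in the expected output.

```python
# Hypothetical: the fields your old HTML parser produced and later nodes rely on
REQUIRED_FIELDS = ('title', 'link', 'snippet')

def missing_fields(results, required=REQUIRED_FIELDS):
    """Map each required field to the number of results where it is absent or empty."""
    gaps = {}
    for field in required:
        count = sum(1 for r in results if not r.get(field))
        if count:
            gaps[field] = count
    return gaps

# Made-up results in the shape of organic_results
sample_results = [
    {'title': 'Best wireless earbuds', 'link': 'https://example.com/a',
     'snippet': 'Roundup', 'position': 1},
    {'title': 'Earbuds review', 'link': 'https://example.com/b',
     'snippet': '', 'position': 2},
]
print(missing_fields(sample_results))  # {'snippet': 1}
```

An empty dictionary means every result carries all required fields, which is a reasonable signal that the old node can be switched off.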

Python Example

Python

import os, requests
SH = {'x-api-key': os.environ['SCAVIO_API_KEY'], 'Content-Type': 'application/json'}

# Before: n8n HTTP scraping Google (breaks often)
# After: n8n HTTP Request to API (stable JSON)
resp = requests.post('https://api.scavio.dev/api/v1/search',
    headers=SH, json={'query': 'wireless earbuds', 'country_code': 'us'}, timeout=30)
resp.raise_for_status()  # surface HTTP errors before reading the body
data = resp.json()
print(f'Results: {len(data.get("organic_results", []))}')
print(f'Format: JSON | No HTML parsing | $0.005/query')
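The per-query price printed above can be turned into a rough budget. This sketch assumes each search request consumes one credit, which the tutorial does not state, and that every migrated node makes one call per workflow run. The function names are hypothetical.

```python
FREE_CREDITS = 50        # free-tier credits mentioned in this tutorial
CREDITS_PER_CALL = 1     # assumption: one credit per search request
PRICE_PER_QUERY = 0.005  # USD, the per-query figure printed above

def free_calls_remaining(calls_made):
    """Calls left on the free tier after a number of requests."""
    used = calls_made * CREDITS_PER_CALL
    return max(0, (FREE_CREDITS - used) // CREDITS_PER_CALL)

def monthly_cost(nodes, runs_per_day, days=30):
    """Rough monthly spend in USD when every node makes one call per run."""
    return round(nodes * runs_per_day * days * PRICE_PER_QUERY, 2)

# This tutorial makes 5 calls: 3 in Step 3, plus the Python and JavaScript examples
print(free_calls_remaining(5))
# Two migrated nodes in a workflow that runs hourly
print(monthly_cost(nodes=2, runs_per_day=24))
```

Under these assumptions the tutorial leaves 45 free calls, and two hourly nodes cost about $7.20 per month.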

JavaScript Example

JavaScript

// Run as an ES module (for example a .mjs file) so top-level await works; Node 18+ provides fetch
const SH = { 'x-api-key': process.env.SCAVIO_API_KEY, 'Content-Type': 'application/json' };
const data = await fetch('https://api.scavio.dev/api/v1/search', {
  method: 'POST', headers: SH,
  body: JSON.stringify({ query: 'wireless earbuds', country_code: 'us' })
}).then(r => r.json());
console.log(`Results: ${(data.organic_results || []).length}`);
console.log('Format: JSON | No HTML parsing | $0.005/query');

Expected Output

Text

Found 2 scraping nodes to replace:
  Scrape Google: https://google.com/search?q=test
  Scrape Amazon: https://amazon.com/s?k=test

Query: "wireless earbuds review" (platform: google)
  Results: 10
  Fields per result: ['title', 'link', 'snippet', 'position']
  Format: Structured JSON (no HTML parsing needed)
  Cost: $0.005 per query

=== Migration Summary ===
  Nodes to replace: 2
  Time to migrate: ~10 minutes per node
  Monthly savings: proxy costs + maintenance time

Related Resources

n8n Scraping to API Migration

n8n Search Enrichment Workflow

Best n8n Search API Nodes Comparison (May 2026)

Best Search API for n8n Integration in 2026

Web Scraping in n8n (HTTP Request + HTML Extract) vs Search API in n8n (HTTP Request to search API)

Search APIs (Scavio, Tavily, SerpAPI) vs Headless Browser (Playwright, Puppeteer, Browserbase)
