Tutorial

How to Build an n8n LLM Research Pipeline with Search and Extract

Build an n8n workflow that searches and extracts content for an LLM, all under one Scavio API key. Replaces Tavily + Firecrawl with one vendor.

An r/n8n thread asked for a search API that integrates search plus content extraction for LLM pipelines. This tutorial wires the full n8n flow: Scavio search → Scavio extract → LLM summary → output.

Prerequisites

  • n8n (Cloud or self-hosted)
  • Scavio API key

Walkthrough

Step 1: Webhook Trigger or Cron

Use a Webhook trigger for ad-hoc research, or a Cron (schedule) trigger for a daily digest.

Text
// n8n trigger node — Webhook with topic input

Step 2: HTTP Request — Scavio search

POST /api/v1/search with topic.

Text
Method: POST
URL: https://api.scavio.dev/api/v1/search
Headers: x-api-key = {{ $env.SCAVIO_API_KEY }}
Body: { "query": "{{ $json.topic }}" }

Step 3: Iterate over top 5 results

Split In Batches node, batch size 1, so each search result is processed on its own.

Text
// n8n: Split In Batches → loop over the first five entries of organic_results (indices 0..4)
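
For readers who prefer to see the selection logic spelled out, here is a minimal Python sketch of picking the top five URLs. It assumes the search response carries an `organic_results` list whose items have a `link` field, as the Python example in this tutorial does; the helper name `top_links` and the sample data are hypothetical.

```python
def top_links(search_response, n=5):
    """Return up to n result URLs from a search response dict."""
    results = search_response.get('organic_results', [])
    # Skip entries without a link so the extract step never receives an empty URL.
    return [r['link'] for r in results if r.get('link')][:n]

# Hypothetical sample response with eight results.
sample = {'organic_results': [{'link': f'https://example.com/{i}'} for i in range(8)]}
print(top_links(sample))
```

Slicing after the filter means a result with a missing link does not use up one of the five slots.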

Step 4: HTTP Request — Scavio extract per result

POST /api/v1/extract per URL.

Text
Body: { "url": "{{ $json.link }}", "format": "markdown" }

Step 5: LLM node summarizes

Any chat model node works here: Claude, GPT, or a Groq-hosted model.

Text
Prompt: 'Summarize this content into a 200-word brief: {{ $json.markdown }}'
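
Extracted pages can be long, so it may help to cap the content before it reaches the model. The sketch below fills the same prompt in Python and truncates the markdown first. The 3000-character cap mirrors the slice used in the Python example in this tutorial; the names `PROMPT_TEMPLATE` and `build_prompt` are hypothetical.

```python
PROMPT_TEMPLATE = 'Summarize this content into a 200-word brief: {content}'

def build_prompt(markdown, max_chars=3000):
    """Fill the summary prompt, truncating long pages first."""
    # Truncation is a crude guard against overflowing the model's context window.
    return PROMPT_TEMPLATE.format(content=markdown[:max_chars])

print(build_prompt('# MCP servers\nKeep tools small and well described.'))
```

A character cap is only a rough proxy for tokens; a tokenizer-based limit would be more precise if the chosen model exposes one.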

Step 6: Aggregate + output

Merge all 5 briefs into a single response.

Text
// n8n: Merge node, then Respond to Webhook with the combined brief.
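
As a rough illustration of what the merged response could look like, the sketch below joins per-source briefs into one text block. The dictionary keys `url` and `summary` and the function name `combine_briefs` are assumptions made for this example, not fields that n8n or Scavio define.

```python
def combine_briefs(briefs):
    """Join per-source briefs into one text, each labeled with its URL."""
    sections = []
    for i, b in enumerate(briefs, start=1):
        sections.append(f"Source {i}: {b['url']}\n{b['summary'].strip()}")
    # A blank line between sections keeps the sources visually separate.
    return '\n\n'.join(sections)

# Hypothetical briefs, as the LLM step might produce them.
briefs = [
    {'url': 'https://example.com/a', 'summary': 'First brief.'},
    {'url': 'https://example.com/b', 'summary': 'Second brief.\n'},
]
print(combine_briefs(briefs))
```

Keeping the URL next to each brief lets whoever reads the combined output trace a claim back to its source.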

Python Example

Python
# Equivalent standalone script, outside n8n:
import os
import requests

API_KEY = os.environ['SCAVIO_API_KEY']
HEADERS = {'x-api-key': API_KEY}
BASE = 'https://api.scavio.dev/api/v1'

def research(topic):
    """Search for a topic, then extract the top 5 results as markdown."""
    s = requests.post(f'{BASE}/search', headers=HEADERS, json={'query': topic}, timeout=30)
    s.raise_for_status()  # fail early on HTTP errors
    out = []
    for r in s.json().get('organic_results', [])[:5]:
        e = requests.post(f'{BASE}/extract', headers=HEADERS,
                          json={'url': r['link'], 'format': 'markdown'}, timeout=60)
        e.raise_for_status()
        # Cap each page at 3000 characters to keep the LLM prompt small.
        out.append({'url': r['link'], 'md': e.json().get('markdown', '')[:3000]})
    return out

print(len(research('mcp server best practices 2026')))

JavaScript Example

JavaScript
// n8n is config-driven. See node-by-node steps above.

Expected Output

Text
Per request: 6 Scavio calls (1 search + 5 extracts) ≈ $0.026. Five summarized sources ready for an LLM context window.
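
The call count and cost above can be checked with a few lines of arithmetic. This sketch assumes every call costs the same and derives a per-call figure from the stated total of about $0.026 for 6 calls; actual pricing may differ between search and extract, so treat the result as an estimate only. The function names are hypothetical.

```python
def calls_per_request(n_results=5):
    """One search call plus one extract call per result."""
    return 1 + n_results

# Assumption: uniform per-call price, derived from ~$0.026 for 6 calls.
COST_PER_CALL = 0.026 / 6

def estimated_cost(n_results=5):
    """Estimated cost in USD for one research request."""
    return calls_per_request(n_results) * COST_PER_CALL

print(calls_per_request(), round(estimated_cost(), 3))
```

Under the same assumption, extracting ten results instead of five would take 11 calls and cost a little under twice as much.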

Frequently Asked Questions

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and an n8n instance; a Python environment is only needed for the standalone script.

You need n8n (Cloud or self-hosted) and a Scavio API key. A Scavio API key gives you 500 free credits per month.

Yes. The free tier includes 500 credits per month, which is more than enough to complete this tutorial and prototype a working solution.

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt to your framework of choice.
