n8n LLM flows often need to read article content. Without an extraction step, the flow either skips the content or floods the LLM with raw HTML. This tutorial wires the Scavio /extract endpoint into the flow as a single HTTP Request node.
Prerequisites
- n8n cloud or self-hosted
- Scavio API key
Walkthrough
Step 1: Add HTTP Request node before the LLM node
Use a plain HTTP Request node; no plugin is required.
# URL: https://api.scavio.dev/api/v1/extract
# Method: POST
# Header: x-api-key: $SCAVIO_API_KEY
# Body: {"url": "{{$json.url}}", "format": "markdown"}
Step 2: Pass markdown to the LLM node
The markdown field of the response body becomes the user message.
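A minimal sketch of how that message could be assembled in code, assuming the extract response exposes a `markdown` field as this step references it. The helper name `buildUserMessage`, the `instruction` parameter, and the `role`/`content` message shape are illustrative assumptions, not part of n8n or Scavio.

```javascript
// Hypothetical helper: combine a task instruction with the extracted markdown.
// The { role, content } shape is an assumed chat-message format.
function buildUserMessage(extractJson, instruction) {
  // Fall back to an empty string when the markdown field is missing.
  const markdown = typeof extractJson.markdown === 'string' ? extractJson.markdown : '';
  return { role: 'user', content: `${instruction}\n\n${markdown}` };
}

// Usage with a stand-in response object:
const message = buildUserMessage(
  { markdown: '# Title\n\nBody text.' },
  'Summarize this article.'
);
console.log(message.content);
```

Keeping the instruction ahead of the article text is one possible layout; inside n8n the same result comes from the expression in the LLM node.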
# In LLM node body, reference {{$node['HTTP Request'].json.markdown}}.
Step 3: Strip boilerplate (optional)
If needed, trim leftover boilerplate with a Function node.
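// Function node:
const markdown = $input.first().json.markdown || '';
// Remove "skip to"/"navigation" links and cookie-policy notices.
return [{json: {markdown: markdown.replace(/(\[(skip to|navigation)[^\]]*\]\(.*?\)|\bcookie\b.*?policy)/gi, '')}}];
Step 4: Add a fallback path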
Handle the case where /extract returns empty or near-empty markdown.
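As a sketch of this routing decision, the helper below applies the 200-character threshold from this step's IF node rule. The name `chooseRoute`, the route labels, and the whitespace trimming are assumptions added for illustration.

```javascript
// Hypothetical helper mirroring this step's IF node rule.
const MIN_MARKDOWN_LENGTH = 200; // threshold from the IF node condition

// Returns 'llm' when the markdown looks usable, 'fallback' otherwise.
// Trimming whitespace first is an added assumption, not part of the original rule.
function chooseRoute(json) {
  const markdown = typeof json.markdown === 'string' ? json.markdown.trim() : '';
  return markdown.length < MIN_MARKDOWN_LENGTH ? 'fallback' : 'llm';
}

console.log(chooseRoute({ markdown: '' }));              // prints fallback
console.log(chooseRoute({ markdown: 'x'.repeat(500) })); // prints llm
```

Items labeled `fallback` would go to the alternative branch (Browserbase or a notification); everything else continues to the LLM node.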
# IF node: if markdown.length < 200, route to Browserbase or notify.
Step 5: Test on representative URLs
Run the flow against articles, blog posts, and Reddit threads.
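Part of this check can be automated. The sketch below flags markdown that is suspiciously short or still contains raw HTML tags; the function name `checkMarkdown`, the tag list, and the 200-character floor are assumptions chosen for illustration. Whether the LLM output is grounded still needs a manual read.

```javascript
// Hypothetical sanity check for extracted markdown.
// Returns a list of problem labels; an empty list means no issue was detected.
function checkMarkdown(markdown) {
  const text = typeof markdown === 'string' ? markdown : '';
  const problems = [];
  // Very short output usually means the extraction missed the main content.
  if (text.trim().length < 200) problems.push('too short');
  // Common layout tags surviving in the output suggest unconverted HTML.
  if (/<(div|span|script|style|nav)\b/i.test(text)) problems.push('leftover HTML');
  return problems;
}

// Stand-in samples; in practice these would be real /extract responses.
const samples = {
  article: '# Headline\n\n' + 'Paragraph text. '.repeat(20),
  broken: '<div class="nav">Menu</div>',
};
for (const [label, markdown] of Object.entries(samples)) {
  const problems = checkMarkdown(markdown);
  console.log(label, problems.length ? problems.join(', ') : 'ok');
}
```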
# Confirm markdown is clean and the LLM produces grounded output.
Python Example
# Per URL: 1 credit = $0.0043. Free 500/mo handles ~15 URLs/day at $0.
JavaScript Example
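A minimal sketch of the Step 1 request in plain JavaScript, assuming Node 18+ for the global `fetch`. The URL, the `x-api-key` header, and the body fields come from Step 1, and the `markdown` response field comes from Step 2. The function names and the explicit `Content-Type` header are assumptions added here.

```javascript
// Endpoint taken from Step 1 of this tutorial.
const EXTRACT_URL = 'https://api.scavio.dev/api/v1/extract';

// Hypothetical helper: build the request without sending it, so it can be inspected.
function buildExtractRequest(pageUrl, apiKey) {
  return {
    url: EXTRACT_URL,
    options: {
      method: 'POST',
      // Content-Type is an assumption; n8n normally sets it for JSON bodies.
      headers: { 'x-api-key': apiKey, 'Content-Type': 'application/json' },
      body: JSON.stringify({ url: pageUrl, format: 'markdown' }),
    },
  };
}

// Hypothetical helper: send the request and return the markdown string.
// Returns '' when the response has no markdown field.
async function extractMarkdown(pageUrl, apiKey) {
  const { url, options } = buildExtractRequest(pageUrl, apiKey);
  const res = await fetch(url, options);
  if (!res.ok) throw new Error(`extract failed with HTTP ${res.status}`);
  const data = await res.json();
  return typeof data.markdown === 'string' ? data.markdown : '';
}
```

Calling `extractMarkdown(someUrl, process.env.SCAVIO_API_KEY)` returns a promise for the markdown, which can then be passed to the LLM call.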
// Same architecture in n8n's JS code nodes.
Expected Output
n8n LLM flows now read article content cleanly. Token usage in the LLM node drops sharply versus raw-HTML alternatives.