Overview
A pre-LLM hop that converts the page behind each URL to markdown via Scavio /extract before the LLM sees it. Cuts input tokens roughly 10x for HTML-heavy tasks.
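To make the token claim concrete, here is a rough back-of-the-envelope sketch using the approximate per-page figures quoted in the workflow steps (~30K tokens of raw HTML versus ~3K tokens of markdown). The 128K context window is a hypothetical value chosen for illustration only; substitute your model's limit.

```python
# Approximate per-page token counts taken from the workflow description.
TOKENS_PER_PAGE_HTML = 30_000
TOKENS_PER_PAGE_MARKDOWN = 3_000

# Hypothetical context window, for illustration only.
CONTEXT_WINDOW = 128_000


def pages_that_fit(tokens_per_page: int, window: int = CONTEXT_WINDOW) -> int:
    """Return how many whole pages fit in the window, ignoring prompt overhead."""
    return window // tokens_per_page


reduction = TOKENS_PER_PAGE_HTML / TOKENS_PER_PAGE_MARKDOWN
html_pages = pages_that_fit(TOKENS_PER_PAGE_HTML)
markdown_pages = pages_that_fit(TOKENS_PER_PAGE_MARKDOWN)

print(f"reduction: {reduction:.0f}x")
print(f"pages per context window: {html_pages} as HTML, {markdown_pages} as markdown")
```

Under these assumptions the same window holds roughly ten times as many pages, which is where the savings show up in multi-URL agent loops.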
Trigger
Per-URL processing in any agent loop
Schedule
Per-task
Workflow Steps
Receive URL list
From SERP results or user input.
Scavio /extract per URL
POST with {url, format: 'markdown'}.
Optional cache hit
If the URL's markdown was extracted in the last 24h, return the cached copy instead of calling /extract again.
Pass markdown to LLM
LLM context now ~3K tokens per page instead of ~30K.
LLM produces output
Summary, classification, extraction, or whatever the task is.
Optional second-pass extract
If markdown is too long, re-extract with summary mode or chunk.
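The two optional steps above (the 24h cache and the second-pass chunking) can be sketched as follows. This is a minimal illustration, not part of the Scavio API: `make_cached_extract`, `chunk_markdown`, the in-memory dict cache, and the `max_chars` default are all assumptions made for the example. The `fetch` argument stands in for any function that takes a URL and returns markdown, such as an `extract` helper that calls /extract.

```python
import time
from typing import Callable, Dict, List, Tuple

CACHE_TTL_SECONDS = 24 * 60 * 60  # the 24h window from the cache step


def make_cached_extract(
    fetch: Callable[[str], str],
    ttl: float = CACHE_TTL_SECONDS,
    clock: Callable[[], float] = time.time,
) -> Callable[[str], str]:
    """Wrap a URL -> markdown function with an in-memory TTL cache (hypothetical helper)."""
    cache: Dict[str, Tuple[float, str]] = {}

    def cached(url: str) -> str:
        now = clock()
        hit = cache.get(url)
        if hit is not None and now - hit[0] < ttl:
            return hit[1]  # fresh entry: skip the network call
        markdown = fetch(url)
        cache[url] = (now, markdown)
        return markdown

    return cached


def chunk_markdown(markdown: str, max_chars: int = 12_000) -> List[str]:
    """Split markdown on blank lines into chunks of at most max_chars.

    A single paragraph longer than max_chars is kept whole as its own chunk.
    """
    chunks: List[str] = []
    current = ""
    for para in markdown.split("\n\n"):
        candidate = para if not current else current + "\n\n" + para
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks
```

A caller would wrap its extract function once, for example `cached_extract = make_cached_extract(extract)`, and pass each result through `chunk_markdown` only when the markdown exceeds the budget for a single LLM call. The cache here lives in process memory; a shared store would be needed if several workers should reuse the same extractions.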
Python Implementation
import os
import requests
H = {'x-api-key': os.environ['SCAVIO_API_KEY']}
def extract(url):
    # POST the URL to Scavio /extract; return the markdown, or '' if the field is absent.
    r = requests.post('https://api.scavio.dev/api/v1/extract', headers=H, json={'url': url, 'format': 'markdown'}, timeout=30)
    return r.json().get('markdown', '')
JavaScript Implementation
const H = { 'x-api-key': process.env.SCAVIO_API_KEY, 'Content-Type': 'application/json' };
async function extract(url) {
  // POST the URL to Scavio /extract; return the markdown, or '' if the field is absent.
  const res = await fetch('https://api.scavio.dev/api/v1/extract', { method: 'POST', headers: H, body: JSON.stringify({ url, format: 'markdown' }) });
  const data = await res.json();
  return data.markdown || '';
}
Platforms Used
Web search with knowledge graph, PAA, and AI overviews