Tutorial

How to Build a Scraping Tool with a Local Uncensored LLM and Scavio

Pair a locally hosted uncensored LLM with Scavio for a fully local scraping tool that handles extraction prompts without content filters.

In 2026, r/LocalLLaMA has regular threads on pairing local uncensored models (Dolphin, Wizard, Nous) with a cloud-hosted scraping backend. The split is deliberate: the LLM runs locally for privacy and flexibility, while Scavio handles the hard scraping infrastructure. This tutorial builds that architecture.

Prerequisites

  • Ollama or llama.cpp
  • A local uncensored model (dolphin-mixtral, nous-hermes)
  • A Scavio API key
  • Python 3.10+
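
Before diving in, it helps to confirm the API key is actually exported; a quick check like this (plain Python, nothing Scavio-specific) avoids a confusing KeyError halfway through the tutorial:

```python
import os

def require_env(name):
    """Return the value of an environment variable, or raise a clear error."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Set {name} before running the tutorial scripts.")
    return value

# Example: require_env('SCAVIO_API_KEY')
```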

Walkthrough

Step 1: Run the local LLM

Ollama makes this one command.

Bash
ollama run dolphin-mixtral

Step 2: Call Scavio for the scraping layer

Scavio handles the network/anti-bot side.

Python
import os

import requests

API_KEY = os.environ['SCAVIO_API_KEY']

def fetch(url):
    """Fetch rendered HTML for a URL through Scavio's extract endpoint."""
    r = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': API_KEY},
        json={'query': url, 'platform': 'extract', 'render_js': True})
    r.raise_for_status()  # fail loudly on auth or quota errors
    return r.json().get('html', '')
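
Scraping endpoints occasionally return transient errors. A small retry wrapper (a generic sketch, not part of Scavio's API) keeps the pipeline resilient:

```python
import time

def with_retries(func, attempts=3, delay=1.0):
    """Call func(); on exception, wait and retry up to `attempts` times."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the original error
            time.sleep(delay * (attempt + 1))  # simple linear backoff

# Usage: html = with_retries(lambda: fetch('https://example.com'))
```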

Step 3: Let the local LLM do extraction

No content filters on the extraction prompt.

Python
import requests

def extract_with_local(html, instruction):
    """Send the page HTML plus an instruction to the local Ollama model."""
    r = requests.post('http://localhost:11434/api/generate',
        json={'model': 'dolphin-mixtral',
              'prompt': f'{instruction}\n\nHTML:\n{html[:4000]}',
              'stream': False})  # Ollama streams by default; ask for one JSON object
    r.raise_for_status()
    return r.json()['response']
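
Truncating raw HTML at 4,000 characters wastes context on tags and scripts. A rough cleanup pass before truncation (a sketch using only the standard library; an HTML parser would be more robust) fits more visible text into the same budget:

```python
import re

def strip_html(html, limit=4000):
    """Drop script/style blocks and tags, collapse whitespace, then truncate."""
    text = re.sub(r'(?is)<(script|style)[^>]*>.*?</\1>', ' ', html)  # remove script/style
    text = re.sub(r'(?s)<[^>]+>', ' ', text)                         # remove remaining tags
    text = re.sub(r'\s+', ' ', text).strip()                         # collapse whitespace
    return text[:limit]
```

Passing `strip_html(html)` instead of `html[:4000]` into the prompt gives the model more signal per token.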

Step 4: Wire the full pipeline

Fetch via Scavio, extract via local LLM.

Python
def scrape(url, instruction):
    html = fetch(url)
    return extract_with_local(html, instruction)

print(scrape('https://target.com', 'Extract all product names and prices as JSON.'))
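
To scrape several pages, loop over URLs with a polite delay between requests. This sketch takes the scrape function as a parameter (a design choice that makes it trivial to test without network access):

```python
import time

def scrape_many(urls, instruction, scrape_fn, delay=1.0):
    """Run scrape_fn on each URL, pausing between requests; collect results by URL."""
    results = {}
    for i, url in enumerate(urls):
        if i > 0:
            time.sleep(delay)  # avoid hammering the target or burning credits in a burst
        results[url] = scrape_fn(url, instruction)
    return results
```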

Step 5: Validate output

Light schema check before saving.

Python
import json

def validate(out):
    """Return True if the model output parses as JSON."""
    try:
        json.loads(out)
        return True
    except json.JSONDecodeError:
        return False
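
The light check only confirms the output parses. When you expect a list of objects with specific keys (as in the product example above), a slightly stricter check (still standard-library only) catches malformed extractions before they reach storage:

```python
import json

def validate_records(out, required_keys):
    """Parse out as JSON and confirm it is a list of dicts containing required_keys."""
    try:
        data = json.loads(out)
    except json.JSONDecodeError:
        return False
    return (isinstance(data, list)
            and all(isinstance(row, dict) and required_keys <= row.keys()
                    for row in data))
```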

Python Example

Python
import os

import requests

API_KEY = os.environ['SCAVIO_API_KEY']

def scrape(url, instruction):
    # Fetch rendered HTML through Scavio
    s = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': API_KEY},
        json={'query': url, 'platform': 'extract', 'render_js': True})
    s.raise_for_status()
    html = s.json().get('html', '')
    # Extract locally with Ollama; stream=False returns a single JSON object
    r = requests.post('http://localhost:11434/api/generate',
        json={'model': 'dolphin-mixtral',
              'prompt': f'{instruction}\n\n{html[:4000]}',
              'stream': False})
    return r.json().get('response', '')

print(scrape('https://example.com', 'List headings as JSON'))

JavaScript Example

JavaScript
const API_KEY = process.env.SCAVIO_API_KEY;

export async function scrape(url, instruction) {
  // Fetch rendered HTML through Scavio
  const s = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST',
    headers: { 'x-api-key': API_KEY, 'Content-Type': 'application/json' },
    body: JSON.stringify({ query: url, platform: 'extract', render_js: true })
  });
  const html = (await s.json()).html || '';
  // Extract locally with Ollama; stream: false returns a single JSON object
  const o = await fetch('http://localhost:11434/api/generate', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model: 'dolphin-mixtral',
      prompt: `${instruction}\n\n${html.slice(0, 4000)}`,
      stream: false
    })
  });
  return (await o.json()).response;
}

Expected Output

Fully local extraction logic on top of cloud-hosted scraping infrastructure. Per-page cost: 1 Scavio credit plus local GPU time. No page content leaves your machine; only the URL is sent to Scavio.

Frequently Asked Questions

How long does this tutorial take?
Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (the free tier works) and a working Python or JavaScript environment.

What do I need before starting?
Ollama or llama.cpp, a local uncensored model (dolphin-mixtral or nous-hermes), a Scavio API key, and Python 3.10+. A Scavio API key gives you 500 free credits per month.

Can I complete this on the free tier?
Yes. The free tier includes 500 credits per month, which is more than enough to complete this tutorial and prototype a working solution.

Does Scavio integrate with LLM frameworks?
Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt it to your framework of choice.

Start Building

Run the model locally, point fetch at Scavio, and you have a fully local extraction pipeline with cloud-grade scraping infrastructure underneath.