
Karpathy LLM Wiki Stack in 2026

An r/AI_Agents post asked for tools. The minimum stack: Scavio + Qdrant + LLM. Three vendors, four ingestion surfaces, citation-grounded answers.


An r/AI_Agents post asked specifically about content, web-scraping, web search tools, ingestion libraries, or MCPs suited to a Karpathy-style LLM Wiki. The post is short and the answers in the thread are scattered, but the practical 2026 stack is more concrete than the thread suggests.

What a Karpathy-style LLM Wiki ingests

Four surfaces, in roughly this order of usefulness:

  • Web SERP for top-ranked summaries of any topic
  • Reddit for community discussion and edge cases
  • YouTube for lecture transcripts and demos
  • arXiv for primary papers

The minimum stack handles all four with citation-grade source URLs and embedding-ready markdown.

The 2026 minimum stack

  • Scavio for search + extract across web/Reddit/YouTube
  • Firecrawl for whole-site crawls of paginated docs (e.g., docs.python.org)
  • Qdrant Cloud for vector storage
  • Jina or OpenAI for embeddings
  • Claude/GPT/DeepSeek for citation-grounded answers

Verified pricing for the data layer (2026-04):

  • Scavio: $30/mo for 7K credits ($0.0043 per search or extract)
  • Firecrawl Hobby: $16/mo for 3K credits
  • Qdrant Cloud: free 1GB; from $25/mo
  • Jina embeddings: free tier; paid PAYG

The ingestion loop

Python
import requests, os
H = {'x-api-key': os.environ['SCAVIO_API_KEY']}

def discover(topic):
    """Pull candidate sources from Google + Reddit + YouTube under one key."""
    return {
        'web': requests.post('https://api.scavio.dev/api/v1/search',
            headers=H, json={'query': topic}, timeout=30).json().get('organic_results', []),
        'reddit': requests.post('https://api.scavio.dev/api/v1/reddit/search',
            headers=H, json={'query': topic}, timeout=30).json().get('posts', []),
        'youtube': requests.post('https://api.scavio.dev/api/v1/youtube/search',
            headers=H, json={'query': topic}, timeout=30).json().get('videos', []),
    }

def extract(url):
    """Markdown-ready content for embedding."""
    return requests.post('https://api.scavio.dev/api/v1/extract',
        headers=H, json={'url': url, 'format': 'markdown'},
        timeout=30).json().get('markdown', '')
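Between extract and Qdrant, the markdown still has to be split into embedding-sized pieces. A minimal chunker sketch; the chunk size, overlap, and paragraph-boundary heuristic are assumptions of this post, not anything Scavio or Qdrant prescribes:

```python
def chunk_markdown(text, max_chars=1500, overlap=200):
    """Split extracted markdown into overlapping, embedding-ready chunks.

    Prefers to cut on a paragraph boundary (blank line) when one falls
    inside the window; sizes are illustrative defaults.
    """
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        # Back up to the nearest paragraph break, if any, unless this is
        # already the final chunk.
        cut = text.rfind('\n\n', start, end)
        if cut > start and end < len(text):
            end = cut
        chunks.append(text[start:end].strip())
        if end == len(text):
            break
        start = max(end - overlap, start + 1)  # always make forward progress
    return [c for c in chunks if c]
```

Each chunk then gets embedded (Jina or OpenAI, per the stack above) and upserted into Qdrant with the source URL in the payload so citations survive retrieval.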

Citation-grounded answers

The wiki's value is sourced answers, not just retrieval. A strict citation prompt enforces this:

Python
PROMPT = '''Answer using ONLY the sources below. Every claim must be followed by [N], the number of the source that supports it.
If the sources don't answer the question, say "I don't know based on these sources."

Sources:
{src}

Question: {q}'''

# At render time, [1] becomes a clickable link to its source URL.
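The two halves of that loop can be sketched directly: number the retrieved sources into the prompt's {src} slot, then rewrite the model's [N] markers as markdown links at render time. The sources-as-dicts shape (url + excerpt) is an assumption for illustration, not a Scavio schema:

```python
import re

def format_sources(sources):
    """Render retrieved sources as a numbered block for the {src} slot."""
    return '\n'.join(f"[{i}] {s['url']}\n{s['excerpt']}"
                     for i, s in enumerate(sources, 1))

def render_citations(answer, sources):
    """Turn [N] markers in the model's answer into markdown links."""
    def link(m):
        n = int(m.group(1))
        if 1 <= n <= len(sources):
            return f"[[{n}]]({sources[n - 1]['url']})"
        return m.group(0)  # leave out-of-range citations untouched
    return re.sub(r'\[(\d+)\]', link, answer)
```

Leaving out-of-range markers untouched makes hallucinated citations visible at review time instead of silently linking them somewhere wrong.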

Cost math at typical wiki volumes

100 wiki questions/day, each requiring 5 search + 3 extract + 1 LLM call:

  • Scavio: 100 × (5 + 3) = 800 credits/day, ~24K credits/mo. That sits inside the Bootstrap tier ($100/mo for 28K credits).
  • Qdrant: free tier covers ~1M points
  • LLM tokens: ~$0.05/answer = $5/day = $150/mo at typical Claude Sonnet rates

Total: ~$250/mo for a 100-questions/day wiki with citation-grounded answers across web/Reddit/YouTube.
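The arithmetic above can be double-checked with a quick sketch. Prices and the $0.05/answer figure are this post's numbers; the two-tier selection logic is a simplification of Scavio's pricing, not its actual billing rules:

```python
def monthly_cost(questions_per_day, searches=5, extracts=3,
                 llm_cost_per_answer=0.05, days=30):
    """Back-of-envelope monthly cost for the wiki's data + LLM layers."""
    credits = questions_per_day * (searches + extracts) * days
    # Assumed tier logic: $30 base tier covers 7K credits, else $100 Bootstrap.
    scavio = 100 if credits > 7_000 else 30
    llm = questions_per_day * llm_cost_per_answer * days
    return credits, scavio + llm

credits, total = monthly_cost(100)
print(credits, total)  # 24000 250.0
```

Qdrant stays on the free tier at this volume, so it drops out of the total.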

Honest constraints

Scavio doesn't replace the vector store, the embedding model, or the LLM. It replaces the search + extract layer, which typically takes 3-5 vendors before consolidation. The wiki still needs Qdrant or Weaviate for semantic recall and a model for embeddings.

Whole-site recursive crawls (e.g., ingesting all of docs.python.org) belong in Firecrawl's crawl mode. Scavio is a per-URL extractor, not a site walker.

Why this stack vs alternatives

The Tavily + Reddit scraper + YouTube Data API + Firecrawl + Qdrant + LLM stack is six vendors. The Scavio + Qdrant + LLM stack is three. Vendor sprawl eats more engineering time than per-call savings recover; consolidation is the real win.