Tutorial

How to Build a Karpathy-Style LLM Wiki RAG Agent

An r/AI_Agents post asked specifically about tools for building a Karpathy-style LLM Wiki: search, scraping, MCP servers, ingestion. This tutorial walks through the minimum stack: Scavio for discovery and extraction, Qdrant for vector storage, and an LLM for cited answers, with approximate costs noted along the way.

Prerequisites

  • Python 3.10+
  • Scavio API key
  • Qdrant Cloud free tier or self-hosted Qdrant
  • An LLM API (Claude/OpenAI/DeepSeek)

Walkthrough

Step 1: Discover sources via Scavio search

For a given topic, pull the top web SERP results, Reddit threads, and YouTube videos in one pass.

Python
import requests, os
H = {'x-api-key': os.environ['SCAVIO_API_KEY']}

def discover(topic):
    return {
        'web': requests.post('https://api.scavio.dev/api/v1/search', headers=H, json={'query': topic}).json(),
        'reddit': requests.post('https://api.scavio.dev/api/v1/reddit/search', headers=H, json={'query': topic}).json(),
        'youtube': requests.post('https://api.scavio.dev/api/v1/youtube/search', headers=H, json={'query': topic}).json(),
    }

Step 2: Extract clean markdown for top sources

For each discovered URL, /extract returns clean markdown ready for chunking and embedding.

Python
def extract(url):
    return requests.post('https://api.scavio.dev/api/v1/extract',
        headers=H, json={'url': url, 'format': 'markdown'}).json()

Step 3: Embed and store in Qdrant

Chunk the markdown, embed each chunk, and upsert into Qdrant with the source URL in the payload so answers can cite it.

Python
import os
from qdrant_client import QdrantClient
from qdrant_client.models import PointStruct, VectorParams, Distance

client = QdrantClient(url='https://your-qdrant.cloud',
                      api_key=os.environ.get('QDRANT_API_KEY'))  # omit for a local, unauthenticated instance

# Create the collection once; size must match your embedding model's dimension (1536 is just an example).
client.create_collection(collection_name='wiki',
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE))

# embed_fn = your embedding function (OpenAI/Cohere/Jina)
# chunks = list of markdown chunk strings; source_url = the page they came from
points = [PointStruct(id=i, vector=embed_fn(chunk),
                      payload={'text': chunk, 'url': source_url})
          for i, chunk in enumerate(chunks)]
client.upsert(collection_name='wiki', points=points)
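The snippet above assumes a `chunks` list already exists. A minimal sketch of a fixed-size character chunker with overlap (the 1000/200 sizes are arbitrary starting points, not tuned values):

```python
def chunk_text(markdown: str, size: int = 1000, overlap: int = 200) -> list[str]:
    """Split markdown into fixed-size character chunks; overlap preserves
    context across chunk boundaries."""
    chunks = []
    step = size - overlap
    for start in range(0, len(markdown), step):
        chunk = markdown[start:start + size]
        if chunk.strip():  # skip whitespace-only tails
            chunks.append(chunk)
    return chunks
```

Sentence- or heading-aware splitters give better retrieval, but this is enough to get the pipeline running.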

Step 4: Query with citation prompt

Retrieve the top-k chunks, then prompt the LLM to emit [N] markers tied to each chunk's source URL.

Python
def answer(question, k=5):
    hits = client.search(collection_name='wiki', query_vector=embed_fn(question), limit=k)
    sources = [{'i': n + 1, 'text': h.payload['text'], 'url': h.payload['url']}
               for n, h in enumerate(hits)]
    prompt = f'Question: {question}\nSources:\n' + '\n'.join(
        f'[{s["i"]}] {s["url"]}: {s["text"][:300]}' for s in sources)
    prompt += '\nAnswer with [N] citations referencing the sources above.'
    # llm.complete() stands in for your provider's completion call (Claude/OpenAI/DeepSeek).
    return llm.complete(prompt), sources

Step 5: Render with clickable citations

[1] becomes a link to the source URL.

Python
def render(answer, sources):
    # Replace each bare [N] marker with a markdown link to its source URL.
    for s in sources:
        answer = answer.replace(f'[{s["i"]}]', f'[[{s["i"]}]]({s["url"]})')
    return answer

Python Example

Python
# Cost per question: ~5 search credits + ~3 extract credits + 1 LLM call = ~$0.04-0.10
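The five steps compose into one pipeline. A sketch with the network and model calls injected as parameters so the wiring itself is clear (`build_answer` and its `store` interface are illustrative names, not part of any library):

```python
def build_answer(question, search_fn, extract_fn, store, llm_fn, k=5):
    """Wire steps 1-5: discover -> extract -> embed/store -> retrieve -> answer."""
    for url in search_fn(question):          # Step 1: discover source URLs
        for chunk in extract_fn(url):        # Step 2: extract markdown, chunked
            store.add(chunk, url)            # Step 3: embed + upsert (store wraps Qdrant)
    hits = store.query(question, k)          # Step 4: retrieve top-k (text, url) pairs
    sources = [{'i': n + 1, 'text': t, 'url': u} for n, (t, u) in enumerate(hits)]
    body = '\n'.join(f'[{s["i"]}] {s["url"]}: {s["text"][:300]}' for s in sources)
    answer = llm_fn(f'Question: {question}\nSources:\n{body}\nAnswer with [N] citations.')
    return answer, sources
```

In production you would plug in `discover`/`extract` from steps 1-2, the Qdrant upsert/search from steps 3-4, and your LLM client.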

JavaScript Example

JavaScript
// Same flow in TS using qdrant-js + Scavio fetch calls.
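A minimal TypeScript sketch of the same flow, under the same assumptions as the Python steps (same Scavio endpoints and header; `@qdrant/js-client-rest` would cover steps 3-4 and is omitted here). `fetch` is built into Node 18+.

```typescript
// Replace the placeholder with your real key (or read it from the environment).
const H = { 'x-api-key': '<SCAVIO_API_KEY>', 'Content-Type': 'application/json' };

// Step 1: discover sources across web, Reddit, and YouTube.
async function discover(topic: string) {
  const post = (path: string) =>
    fetch(`https://api.scavio.dev/api/v1/${path}`, {
      method: 'POST', headers: H, body: JSON.stringify({ query: topic }),
    }).then(r => r.json());
  return {
    web: await post('search'),
    reddit: await post('reddit/search'),
    youtube: await post('youtube/search'),
  };
}

// Step 5: citation rendering is pure string work in any language.
function render(answer: string, sources: { i: number; url: string }[]): string {
  for (const s of sources) {
    answer = answer.split(`[${s.i}]`).join(`[[${s.i}]](${s.url})`);
  }
  return answer;
}
```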

Expected Output

An LLM Wiki agent that pulls from Google, Reddit, and YouTube under one Scavio key, embeds everything into Qdrant, and answers with clickable citations. Approximate stack cost: Scavio $30 + Qdrant Cloud ~$25 + LLM tokens.

Frequently Asked Questions

How long does this tutorial take?

Most developers complete it in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

What are the prerequisites?

Python 3.10+, a Scavio API key, Qdrant Cloud free tier or self-hosted Qdrant, and an LLM API (Claude/OpenAI/DeepSeek). A Scavio API key gives you 500 free credits per month.

Can I complete this on the free tier?

Yes. The free tier includes 500 credits per month, which is more than enough to complete this tutorial and prototype a working solution.

Does Scavio integrate with agent frameworks?

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt it to your framework of choice.
