Tutorial

How to Build a Deep Research Agent for Book Projects

Authors and analysts: build an agent that digs in like a junior researcher across SERP, Reddit, and YouTube, with full citation tracking.

Off-the-shelf 'deep research' modes fall short for book projects. The right deep-research agent runs across SERP, Reddit, and YouTube, follows citations, weighs sources, and keeps a running citation list. This tutorial wires Scavio plus Claude into a multi-step research loop with DuckDB citations.

Prerequisites

  • Python 3.10+
  • Scavio API key
  • Anthropic API key
  • DuckDB (pip install duckdb)

Walkthrough

Step 1: Spawn the research plan

Claude breaks the topic into 5 sub-questions.

Python
import anthropic
client = anthropic.Anthropic()

def plan(topic):
    msg = client.messages.create(
        model='claude-sonnet-4-6', max_tokens=1024,
        messages=[{'role': 'user', 'content': f'Break into 5 sub-questions: {topic}'}])
    # Drop blank lines so downstream loops never search an empty string
    return [line for line in msg.content[0].text.split('\n') if line.strip()]
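Claude's reply usually arrives as a numbered list, sometimes with blank lines or stray formatting. A small parsing helper (hypothetical, not part of any SDK) strips the numbering so each item is a clean sub-question:

```python
import re

def parse_subquestions(text, limit=5):
    """Pull clean sub-questions out of a numbered free-text reply."""
    items = []
    for line in text.splitlines():
        # Strip leading "1. " or "2) " style numbering
        line = re.sub(r'^\s*\d+[\.\)]\s*', '', line).strip()
        if line:
            items.append(line)
    return items[:limit]
```

Swap this in for the bare split if you want plan() to tolerate formatting drift in the model's output.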

Step 2: Search per sub-question

Run a Scavio search on each surface for every sub-question.

Python
import requests, os
API_KEY = os.environ['SCAVIO_API_KEY']

def multi_search(q):
    headers = {'x-api-key': API_KEY}
    serp = requests.post('https://api.scavio.dev/api/v1/google',
        headers=headers, json={'query': q}, timeout=30).json()
    rdt = requests.post('https://api.scavio.dev/api/v1/reddit/search',
        headers=headers, json={'query': q}, timeout=30).json()
    yt = requests.post('https://api.scavio.dev/api/v1/youtube/search',
        headers=headers, json={'query': q}, timeout=30).json()
    # Keep only the top hits per surface to bound prompt and storage size
    return {
        'serp': serp.get('organic_results', [])[:5],
        'reddit': rdt.get('posts', [])[:5],
        'youtube': yt.get('videos', [])[:3],
    }
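Three network calls per sub-question multiplies the odds of a transient failure. A minimal retry sketch (an assumption layered on top of the tutorial, not part of any Scavio client) keeps one flaky request from killing the whole loop:

```python
import time

def with_retries(fn, attempts=3, backoff=1.0):
    """Call fn(); on failure, sleep with exponential backoff and try again."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts; surface the original error
            time.sleep(backoff * 2 ** attempt)
```

Wrap whole calls rather than individual requests, e.g. with_retries(lambda: multi_search(sub)).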

Step 3: Fetch top sources

Use Scavio's extract endpoint on the highest-scored URLs.

Python
def fetch(url):
    r = requests.post('https://api.scavio.dev/api/v1/extract',
        headers={'x-api-key': API_KEY}, json={'url': url}, timeout=30)
    # Cap extracted content so prompts stay within the model's context budget
    return r.json().get('content', '')[:5000]
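The step above assumes you already know which URLs score highest. One simple way to pick them, with surface weights that are purely illustrative (SERP above Reddit above YouTube, discounted by rank position):

```python
SURFACE_WEIGHTS = {'serp': 1.0, 'reddit': 0.8, 'youtube': 0.6}

def rank_sources(results, top_n=5):
    """Score each hit by surface weight discounted by its rank position."""
    scored = []
    for surface, hits in results.items():
        for pos, hit in enumerate(hits):
            url = hit.get('link') or hit.get('url', '')
            if not url:
                continue
            scored.append((SURFACE_WEIGHTS.get(surface, 0.5) / (pos + 1), url, surface))
    scored.sort(reverse=True)
    return scored[:top_n]
```

Feed the winning URLs into fetch() so you only pay extraction credits for sources worth reading.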

Step 4: Store citations

DuckDB table tracks every source the agent cited.

Python
import duckdb
db = duckdb.connect('research.duckdb')
db.execute('CREATE TABLE IF NOT EXISTS cites(topic TEXT, sub TEXT, url TEXT, surface TEXT, snippet TEXT)')

def cite(topic, sub, url, surface, snippet):
    db.execute('INSERT INTO cites VALUES (?, ?, ?, ?, ?)', (topic, sub, url, surface, snippet))
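Multi-surface searches over overlapping sub-questions will resurface the same URLs. A hypothetical dedup wrapper around the writer keeps the cites table clean:

```python
_seen = set()

def cite_once(topic, sub, url, surface, snippet, insert=None):
    """Record a citation only the first time a (topic, url) pair appears.

    `insert` is the real writer, e.g. the cite() function above.
    """
    key = (topic, url)
    if not url or key in _seen:
        return False
    _seen.add(key)
    if insert is not None:
        insert(topic, sub, url, surface, snippet)
    return True
```

Call it as cite_once(topic, sub, url, surface, snippet, insert=cite) wherever the loop currently calls cite().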

Step 5: Compose the brief

Claude writes a structured brief with footnotes.

Python
def brief(topic, plan_items):
    sources = db.execute('SELECT * FROM cites WHERE topic=?', (topic,)).fetchall()
    msg = client.messages.create(
        model='claude-sonnet-4-6', max_tokens=2048,
        messages=[{'role':'user','content':f'Write a brief on {topic} with footnotes from sources: {sources[:30]}'}])
    return msg.content[0].text
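Leaving footnote formatting entirely to the model invites drift between runs. A deterministic helper (an assumption, not part of the pipeline above) renders the DuckDB rows as a numbered source list you can append to the brief:

```python
def footnotes(sources):
    """Render (topic, sub, url, surface, snippet) rows as numbered footnotes."""
    return '\n'.join(
        f'[{i}] {url} ({surface})'
        for i, (_topic, _sub, url, surface, _snippet) in enumerate(sources, start=1)
    )
```

Pass the same fetchall() rows you give to Claude, so footnote numbers stay stable across regenerations.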

Python Example

Python
# Combine all steps into one entry function
def deep_research(topic):
    items = plan(topic)
    for sub in items:
        results = multi_search(sub)
        # Record every hit under its actual surface: serp, reddit, or youtube
        for surface, hits in results.items():
            for r in hits:
                cite(topic, sub, r.get('link') or r.get('url', ''), surface,
                     r.get('snippet', '')[:200])
    return brief(topic, items)

print(deep_research('history of llm agent architectures 2022-2026'))

JavaScript Example

JavaScript
const API_KEY = process.env.SCAVIO_API_KEY;
export async function multiSearch(q) {
  const headers = { 'x-api-key': API_KEY, 'Content-Type': 'application/json' };
  const [serp, rdt, yt] = await Promise.all([
    fetch('https://api.scavio.dev/api/v1/google', { method:'POST', headers, body: JSON.stringify({ query: q }) }).then(r => r.json()),
    fetch('https://api.scavio.dev/api/v1/reddit/search', { method:'POST', headers, body: JSON.stringify({ query: q }) }).then(r => r.json()),
    fetch('https://api.scavio.dev/api/v1/youtube/search', { method:'POST', headers, body: JSON.stringify({ query: q }) }).then(r => r.json())
  ]);
  // Trim each surface to the same top-N counts as the Python version
  return {
    serp: (serp.organic_results || []).slice(0, 5),
    reddit: (rdt.posts || []).slice(0, 5),
    youtube: (yt.videos || []).slice(0, 3)
  };
}

Expected Output

A structured research brief with 30+ citations across SERP, Reddit threads, and YouTube videos. DuckDB stores every source for follow-up queries.

Frequently Asked Questions

How long does this tutorial take?

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

What are the prerequisites?

Python 3.10+, a Scavio API key, an Anthropic API key, and DuckDB (pip install duckdb). A Scavio API key gives you 500 free credits per month.

Can I complete this on the free tier?

Yes. The free tier includes 500 credits per month, which is more than enough to complete this tutorial and prototype a working solution.

Does Scavio integrate with agent frameworks?

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt it to your framework of choice.

Start Building

Grab a Scavio API key, drop in your Anthropic key, and run deep_research on your own book topic.