Deep Research Daily Pipeline

Daily deep research agent pipeline using search, extraction, and structured analysis. Replace multi-tool stacks with one API.

Overview

This workflow runs a daily deep research pipeline that searches across Google, Reddit, and YouTube for target topics, extracts key content from top results, and compiles structured research briefs. It replaces a multi-tool stack (Serper + Jina + E2B) with a single API for search and extraction.

Trigger

Cron schedule (daily at 5:00 AM UTC)

Schedule

Runs daily at 5:00 AM UTC
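
On a host whose clock is set to UTC, this schedule corresponds to the five-field cron expression 0 5 * * *. Where no cron daemon is available, a long-running process can compute the wait itself. The sketch below shows one way to do that; the helper name next_run_utc is illustrative and not part of the workflow.

```python
from datetime import datetime, time, timedelta, timezone

RUN_AT = time(hour=5, minute=0)  # 5:00 AM UTC, per the schedule above

def next_run_utc(now: datetime) -> datetime:
    """Return the next 05:00 UTC strictly after `now` (hypothetical helper)."""
    now = now.astimezone(timezone.utc)
    candidate = datetime.combine(now.date(), RUN_AT, tzinfo=timezone.utc)
    # If today's slot has already passed (or is exactly now), use tomorrow's.
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate

if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    wait = next_run_utc(now) - now
    print(f"next run in {wait.total_seconds():.0f} s")
```

A scheduler loop would sleep for that many seconds and then call the pipeline's entry point.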

Workflow Steps

1. Load research topics

Read the daily research topic list from configuration. Topics can be static keywords or generated dynamically from the previous day's signals.

2. Multi-platform search

Search each topic on Google, Reddit, and YouTube to gather diverse perspectives and source types.

3. Extract top result content

Use the Scavio extract endpoint to pull the full content of the top 3 Google results for each topic.

4. Compile research brief

Combine the search results and the extracted content into one structured research brief per topic.

5. Archive and notify

Save the research briefs to the archive and send a summary notification via webhook or email.
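
Step 1 could be sketched as follows. The workflow does not specify a configuration format, so the JSON layout {"topics": [...]}, the idea of passing dynamic topics as a second list, and the helper name load_topics are all assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Optional

def load_topics(config_path: Path, dynamic: Optional[list] = None) -> list:
    """Merge static topics from a JSON config with optional dynamic ones.

    Assumed (hypothetical) config layout: {"topics": ["...", "..."]}.
    Duplicates are dropped case-insensitively; first-seen order is kept.
    """
    config = json.loads(config_path.read_text(encoding="utf-8"))
    static = config.get("topics", [])
    seen = set()
    merged = []
    for topic in [*static, *(dynamic or [])]:
        cleaned = topic.strip()
        key = cleaned.lower()
        if cleaned and key not in seen:
            seen.add(key)
            merged.append(cleaned)
    return merged
```

Deduplicating before the search step matters because every topic costs several API calls per run.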

Python Implementation

Python
import json
from datetime import datetime, timezone
from pathlib import Path

import requests

API_KEY = "your_scavio_api_key"
BASE = "https://api.scavio.dev/api/v1"

TOPICS = ["AI agent search tools 2026", "SERP API pricing changes", "MCP server adoption"]

def search_platform(query: str, platform: str) -> list[dict]:
    """Run one search on the given platform and return its organic results."""
    res = requests.post(
        f"{BASE}/search",
        headers={"x-api-key": API_KEY},
        json={"platform": platform, "query": query},
        timeout=15,
    )
    res.raise_for_status()
    return res.json().get("organic", [])

def extract_content(url: str) -> dict:
    """Fetch the extracted content of a single page."""
    res = requests.post(
        f"{BASE}/extract",
        headers={"x-api-key": API_KEY},
        json={"url": url},
        timeout=30,
    )
    res.raise_for_status()
    return res.json()

def research_topic(topic: str) -> dict:
    """Search all three platforms for a topic and extract the top Google pages."""
    google_results = search_platform(topic, "google")
    reddit_results = search_platform(topic, "reddit")
    youtube_results = search_platform(topic, "youtube")

    # Extract the top 3 Google results. A failed extraction is logged and
    # skipped so that one bad page does not abort the whole topic.
    extracted = []
    for result in google_results[:3]:
        url = result.get("link", "")
        if not url:
            continue
        try:
            content = extract_content(url)
        except (requests.RequestException, ValueError) as exc:
            print(f"  extract failed for {url}: {exc}")
            continue
        extracted.append({
            "url": url,
            "title": result.get("title", ""),
            # "text" may be missing or null, so fall back to an empty string.
            "content_preview": (content.get("text") or "")[:500],
        })

    return {
        "topic": topic,
        "google_count": len(google_results),
        "reddit_count": len(reddit_results),
        "youtube_count": len(youtube_results),
        "extracted_pages": len(extracted),
        "top_reddit": [{"title": r.get("title", ""), "score": r.get("score", 0)} for r in reddit_results[:5]],
        "top_youtube": [{"title": r.get("title", ""), "views": r.get("views", 0)} for r in youtube_results[:5]],
        "extracted": extracted,
    }

def run():
    # Use a timezone-aware UTC timestamp; datetime.utcnow() is deprecated since Python 3.12.
    date = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    briefs = [research_topic(t) for t in TOPICS]
    # Estimate: 3 searches plus one credit per successful extraction, per topic.
    total_credits = sum(3 + b["extracted_pages"] for b in briefs)
    report = {"date": date, "topics": len(TOPICS), "credits_used": total_credits, "briefs": briefs}
    Path(f"research_{date}.json").write_text(json.dumps(report, indent=2))
    print(f"Research complete: {len(TOPICS)} topics, {total_credits} credits")
    for brief in briefs:
        print(f"  {brief['topic']}: {brief['google_count']}G {brief['reddit_count']}R {brief['youtube_count']}Y {brief['extracted_pages']}E")

if __name__ == "__main__":
    run()
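
The script above covers the archive half of step 5 but not the notification. A minimal sketch of a webhook notifier follows. It assumes the receiver accepts a JSON body with a text field, which is a common convention but not something this workflow defines. WEBHOOK_URL, build_summary, and notify are hypothetical names, and the report argument is the dictionary that run() writes to disk. Only the standard library is used, so the sketch has no extra dependencies.

```python
import json
import urllib.request

WEBHOOK_URL = "https://example.com/hooks/research"  # placeholder; replace with your endpoint

def build_summary(report: dict) -> str:
    """Render a short plain-text summary of a report produced by run()."""
    lines = [f"Research {report['date']}: {report['topics']} topics, {report['credits_used']} credits"]
    for brief in report["briefs"]:
        lines.append(
            f"- {brief['topic']}: {brief['google_count']} Google, "
            f"{brief['reddit_count']} Reddit, {brief['youtube_count']} YouTube, "
            f"{brief['extracted_pages']} pages extracted"
        )
    return "\n".join(lines)

def notify(report: dict, url: str = WEBHOOK_URL) -> None:
    """POST the summary as JSON. The {"text": ...} payload shape is an assumption."""
    body = json.dumps({"text": build_summary(report)}).encode("utf-8")
    req = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(req, timeout=10) as res:
        res.read()
```

Calling notify(report) at the end of run(), after the file is written, would ensure the archive exists even if the webhook is unreachable.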

JavaScript Implementation

JavaScript
const API_KEY = "your_scavio_api_key";
const BASE = "https://api.scavio.dev/api/v1";
const TOPICS = ["AI agent search tools 2026", "SERP API pricing changes", "MCP server adoption"];

async function search(query, platform) {
  const res = await fetch(`${BASE}/search`, {
    method: "POST",
    headers: { "x-api-key": API_KEY, "content-type": "application/json" },
    body: JSON.stringify({ platform, query }),
  });
  if (!res.ok) throw new Error(`scavio ${res.status}`);
  return (await res.json()).organic ?? [];
}

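async function extract(url) {
  // Return null on any failure (HTTP error, network error, invalid JSON)
  // so that one bad page does not abort the whole run.
  try {
    const res = await fetch(`${BASE}/extract`, {
      method: "POST",
      headers: { "x-api-key": API_KEY, "content-type": "application/json" },
      body: JSON.stringify({ url }),
    });
    if (!res.ok) return null;
    return await res.json();
  } catch {
    return null;
  }
}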

async function run() {
  const fs = await import("fs/promises");
  const briefs = [];
  for (const topic of TOPICS) {
    const [google, reddit, youtube] = await Promise.all([
      search(topic, "google"), search(topic, "reddit"), search(topic, "youtube"),
    ]);
    const extracted = [];
    for (const r of google.slice(0, 3)) {
      if (!r.link) continue;
      const c = await extract(r.link);
      if (c) extracted.push({ url: r.link, title: r.title ?? "", preview: (c.text ?? "").slice(0, 500) });
    }
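    // Store the previews alongside the counts so the archived file contains the extracted content.
    briefs.push({ topic, google: google.length, reddit: reddit.length, youtube: youtube.length, extracted: extracted.length, pages: extracted });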
  }
  const date = new Date().toISOString().slice(0, 10);
  await fs.writeFile(`research_${date}.json`, JSON.stringify(briefs, null, 2));
  for (const b of briefs) console.log(`  ${b.topic}: ${b.google}G ${b.reddit}R ${b.youtube}Y ${b.extracted}E`);
}

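// Report failures and exit non-zero so a scheduler can detect a failed run.
run().catch((err) => {
  console.error(err);
  process.exitCode = 1;
});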

Platforms Used

Google

Web search with knowledge graph, People Also Ask (PAA), and AI overviews

YouTube

Video search with transcripts and metadata

Reddit

Community, posts & threaded comments from any subreddit

Frequently Asked Questions

What does this workflow do?

It runs a daily research pipeline that searches Google, Reddit, and YouTube for each target topic, extracts the content of the top results, and compiles a structured research brief per topic. It replaces a multi-tool stack (Serper + Jina + E2B) with a single API for search and extraction.

When does it run?

On a cron schedule, daily at 5:00 AM UTC.

Which platforms does it use?

Google, YouTube, and Reddit. Each platform is called through the same unified API endpoint.

Is there a free tier?

Yes. Scavio's free tier includes 250 credits per month with no credit card required. That is enough to test and validate this workflow before scaling it.
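
To check the free-tier claim against a specific topic list, the sketch below estimates credit use. It assumes one credit per search or extract call, which is the accounting behind the Python script's credits_used figure; actual pricing may differ. The function names are illustrative.

```python
FREE_TIER_CREDITS = 250      # monthly free-tier allowance stated above
SEARCHES_PER_TOPIC = 3       # Google, Reddit, YouTube
MAX_EXTRACTS_PER_TOPIC = 3   # top 3 Google results

def max_credits_per_run(topic_count: int) -> int:
    """Upper bound on credits for one run, assuming 1 credit per API call."""
    if topic_count <= 0:
        raise ValueError("topic_count must be positive")
    return topic_count * (SEARCHES_PER_TOPIC + MAX_EXTRACTS_PER_TOPIC)

def full_runs_covered(topic_count: int, budget: int = FREE_TIER_CREDITS) -> int:
    """Number of complete worst-case runs that fit in the budget."""
    return budget // max_credits_per_run(topic_count)

if __name__ == "__main__":
    print(max_credits_per_run(3), full_runs_covered(3))
```

With the three sample topics, a run uses at most 18 credits under this assumption, so the free tier covers 13 full runs: enough for validation, but short of a full month of daily runs.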
