How to Build a Multi-Source News Aggregator with APIs

Build a news aggregator that pulls articles from Google News, Reddit, and YouTube using the Scavio API. Deduplicate and rank by relevance across sources.

A news aggregator that combines Google News articles, Reddit discussions, and YouTube videos for any topic gives a more complete picture than any single source. Google News provides editorial coverage, Reddit surfaces community reactions, and YouTube captures video commentary. This tutorial builds a multi-source aggregator using the Scavio API that queries all three platforms, normalizes the results into a common format, deduplicates by URL, and ranks by a combined relevance score.

Prerequisites

  • Python 3.10 or higher
  • requests library installed
  • A Scavio API key
  • Topics or keywords to aggregate news for

Walkthrough

Step 1: Query all three sources

Fetch Google News results, Reddit posts, and YouTube videos for the same topic using the Scavio API.

Python
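import os
import requests
from concurrent.futures import ThreadPoolExecutor  # used in Step 4

# Same configuration as the full Python example later in this tutorial.
API_KEY = os.environ.get("SCAVIO_API_KEY", "your_scavio_api_key")
ENDPOINT = "https://api.scavio.dev/api/v1/search"
HEADERS = {"x-api-key": API_KEY}

def fetch_google_news(topic: str) -> list[dict]:
    r = requests.post(ENDPOINT, headers=HEADERS, timeout=30,
                      json={"query": f"{topic} news", "country_code": "us"})
    r.raise_for_status()
    data = r.json()  # parse the response body once
    # Prefer news results; fall back to organic results when none are returned.
    return data.get("news_results") or data.get("organic_results", [])

def fetch_reddit(topic: str) -> list[dict]:
    r = requests.post(ENDPOINT, headers=HEADERS, timeout=30,
                      json={"platform": "reddit", "query": topic})
    r.raise_for_status()
    return r.json().get("data", {}).get("posts", [])

def fetch_youtube(topic: str) -> list[dict]:
    r = requests.post(ENDPOINT, headers=HEADERS, timeout=30,
                      json={"platform": "youtube", "query": topic})
    r.raise_for_status()
    return r.json().get("videos", [])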
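
Each fetcher above calls raise_for_status(), so a single failing platform raises an exception. One possible way to keep the other sources usable is to collect results per source and treat a failure as an empty list. The sketch below is an illustration, not part of the Scavio API: fetch_all and the two stand-in fetchers are hypothetical names, and the stand-ins replace real network calls so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

Fetcher = Callable[[str], list[dict]]

def fetch_all(topic: str, fetchers: dict[str, Fetcher]) -> dict[str, list[dict]]:
    """Run every fetcher concurrently; a failing source yields an empty list."""
    results: dict[str, list[dict]] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(fetchers))) as ex:
        futures = {name: ex.submit(fn, topic) for name, fn in fetchers.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception as exc:  # e.g. requests.HTTPError from raise_for_status()
                print(f"{name} failed: {exc}")
                results[name] = []
    return results

# Hypothetical stand-in fetchers so the sketch runs without network access.
def fake_news(topic: str) -> list[dict]:
    return [{"title": f"{topic} headline", "link": "https://example.com/a"}]

def fake_reddit(topic: str) -> list[dict]:
    raise RuntimeError("simulated HTTP 500")

demo = fetch_all("ai agents", {"google": fake_news, "reddit": fake_reddit})
```

In a real run you would pass fetch_google_news, fetch_reddit, and fetch_youtube in place of the stand-ins. Whether to swallow errors or surface them is a design choice; logging the failure, as here, at least keeps it visible.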

Step 2: Normalize results into a common format

Transform results from each platform into a uniform structure with title, url, source, snippet, and date.

Python
def normalize_google(item: dict) -> dict:
    return {"title": item.get("title"), "url": item.get("link"), "source": "google",
            "snippet": item.get("snippet", ""), "date": item.get("date")}

def normalize_reddit(post: dict) -> dict:
    # No snippet is read from Reddit posts here, so the subreddit name stands in.
    return {"title": post.get("title"), "url": post.get("url"), "source": "reddit",
            "snippet": f"r/{post.get('subreddit', '')}", "date": post.get("timestamp")}

def normalize_youtube(video: dict) -> dict:
    # "or" also covers an explicit null description; keep the first 100 characters.
    return {"title": video.get("title"), "url": video.get("url"), "source": "youtube",
            "snippet": (video.get("description") or "")[:100], "date": video.get("published_at")}

Step 3: Deduplicate by URL

Remove duplicate entries that appear across sources using URL as the deduplication key.

Python
def deduplicate(items: list[dict]) -> list[dict]:
    seen = {}
    for item in items:
        url = item.get("url", "")
        if url and url not in seen:
            seen[url] = item
    return list(seen.values())
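
Exact string matching misses duplicates whose URLs differ only in trivial ways, for example a tracking parameter or a trailing slash on a link shared to Reddit. A possible refinement is to compare a canonical form of each URL instead. The sketch below is an assumption-laden illustration: canonical_url and deduplicate_canonical are hypothetical names, and the list of parameters treated as tracking noise is a guess you would tune for your sources.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these query parameters never change which page is shown.
TRACKING_PREFIXES = ("utm_",)
TRACKING_KEYS = {"fbclid", "gclid"}

def canonical_url(url: str) -> str:
    """Lowercase scheme and host, drop 'www.', tracking params, fragment, trailing slash."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k not in TRACKING_KEYS and not k.startswith(TRACKING_PREFIXES)]
    path = parts.path.rstrip("/")
    return urlunsplit((parts.scheme.lower(), host, path, urlencode(query), ""))

def deduplicate_canonical(items: list[dict]) -> list[dict]:
    """Keep the first item seen for each canonical URL; skip items without a URL."""
    seen: dict[str, dict] = {}
    for item in items:
        url = item.get("url") or ""
        if not url:
            continue
        seen.setdefault(canonical_url(url), item)
    return list(seen.values())
```

Like the version above, this keeps the first occurrence, so the order in which sources are appended decides which copy survives.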

Step 4: Output the aggregated feed

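Run the three fetchers concurrently, normalize the top five results from each source, and return the combined, deduplicated feed. Items are appended source by source, so the returned list is already grouped by source.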

Python
def aggregate(topic: str) -> list[dict]:
    with ThreadPoolExecutor(max_workers=3) as ex:
        g = ex.submit(fetch_google_news, topic)
        r = ex.submit(fetch_reddit, topic)
        y = ex.submit(fetch_youtube, topic)
    items = [normalize_google(i) for i in g.result()[:5]]
    items += [normalize_reddit(i) for i in r.result()[:5]]
    items += [normalize_youtube(i) for i in y.result()[:5]]
    return deduplicate(items)
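
The feed returned above is ordered by source, not by relevance. The tutorial does not define a relevance score, so the following is only one hypothetical way to build one: combine each item's position within its own source (reciprocal rank, scaled by a per-source weight) with the share of topic words that appear in its title. The weights, the formula, and the names rank and SOURCE_WEIGHTS are all assumptions made for illustration.

```python
import re

# Hypothetical per-source weights; adjust to taste.
SOURCE_WEIGHTS = {"google": 1.0, "reddit": 0.8, "youtube": 0.8}

def _terms(text: str) -> set[str]:
    """Lowercased alphanumeric words in text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def rank(items: list[dict], topic: str) -> list[dict]:
    """Return copies of items with a 'score' key, highest score first."""
    topic_terms = _terms(topic)
    position: dict[str, int] = {}  # running 1-based position within each source
    scored = []
    for item in items:
        source = item.get("source", "")
        position[source] = position.get(source, 0) + 1
        reciprocal = 1.0 / position[source]
        title_terms = _terms(item.get("title") or "")
        overlap = len(topic_terms & title_terms) / len(topic_terms) if topic_terms else 0.0
        score = SOURCE_WEIGHTS.get(source, 0.5) * reciprocal + overlap
        scored.append({**item, "score": round(score, 3)})
    return sorted(scored, key=lambda i: i["score"], reverse=True)

feed = [
    {"title": "AI agents launch", "source": "google"},
    {"title": "Market update", "source": "google"},
    {"title": "Deploying agents", "source": "reddit"},
]
ranked = rank(feed, "AI agents")
```

Calling rank(aggregate(topic), topic) would reorder the feed. It relies on each source's items arriving in the order the API returned them, which holds for the normalized lists built in this step.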

Python Example

Python
import os
import requests
from concurrent.futures import ThreadPoolExecutor

API_KEY = os.environ.get("SCAVIO_API_KEY", "your_scavio_api_key")
ENDPOINT = "https://api.scavio.dev/api/v1/search"

def fetch(body: dict) -> dict:
    # A timeout keeps one slow platform from blocking the whole aggregation.
    r = requests.post(ENDPOINT, headers={"x-api-key": API_KEY}, json=body, timeout=30)
    r.raise_for_status()
    return r.json()

def aggregate(topic: str) -> list[dict]:
    with ThreadPoolExecutor(max_workers=3) as ex:
        g = ex.submit(fetch, {"query": f"{topic} news", "country_code": "us"})
        r = ex.submit(fetch, {"platform": "reddit", "query": topic})
        y = ex.submit(fetch, {"platform": "youtube", "query": topic})
    items = []
    for i in (g.result().get("news_results") or g.result().get("organic_results", []))[:5]:
        items.append({"src": "google", "title": i.get("title"), "url": i.get("link")})
    for p in r.result().get("data", {}).get("posts", [])[:5]:
        items.append({"src": "reddit", "title": p.get("title"), "url": p.get("url")})
    for v in y.result().get("videos", [])[:5]:
        items.append({"src": "youtube", "title": v.get("title"), "url": v.get("url")})
    return items

if __name__ == "__main__":
    for item in aggregate("AI agents 2026"):
        # title comes from .get() and may be None, so fall back to an empty string.
        print(f"[{item['src']:>7}] {(item['title'] or '')[:60]}")

JavaScript Example

JavaScript
const API_KEY = process.env.SCAVIO_API_KEY || "your_scavio_api_key";
const ENDPOINT = "https://api.scavio.dev/api/v1/search";

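async function call(body) {
  const res = await fetch(ENDPOINT, {
    method: "POST",
    headers: { "x-api-key": API_KEY, "Content-Type": "application/json" },
    body: JSON.stringify(body)
  });
  // fetch() resolves even on HTTP error statuses, so check explicitly.
  if (!res.ok) throw new Error(`Scavio request failed with status ${res.status}`);
  return res.json();
}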

async function aggregate(topic) {
  const [g, r, y] = await Promise.all([
    call({ query: `${topic} news`, country_code: "us" }),
    call({ platform: "reddit", query: topic }),
    call({ platform: "youtube", query: topic })
  ]);
  const items = [];
  (g.news_results || g.organic_results || []).slice(0, 5).forEach(i => items.push({ src: "google", title: i.title }));
  (r.data?.posts || []).slice(0, 5).forEach(p => items.push({ src: "reddit", title: p.title }));
  (y.videos || []).slice(0, 5).forEach(v => items.push({ src: "youtube", title: v.title }));
  return items;
}

aggregate("AI agents 2026").then(items => items.forEach(i => console.log(`[${i.src}] ${i.title}`))).catch(console.error);

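Expected Output

The Python example prints one line per item, tagged with its source. Titles depend on live results, so the lines below are illustrative.

Text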
[ google] OpenAI Launches Agent Building Platform for Enterprise
[ google] Anthropic Expands Claude Agent Capabilities
[ reddit] Has anyone deployed AI agents in production yet?
[ reddit] Best frameworks for building AI agents in 2026
[youtube] I Built an AI Agent That Runs My Business
[youtube] AI Agents Explained - Complete 2026 Guide

Frequently Asked Questions

How long does this tutorial take?
Most developers complete it in 15 to 30 minutes.

What do I need to get started?
Python 3.10 or higher with the requests library installed (or a working JavaScript environment), a Scavio API key, and the topics or keywords you want to aggregate news for.

Can I follow along on the free tier?
Yes. The free tier includes 250 credits per month, which is more than enough to complete this tutorial and prototype a working solution.

Does Scavio work with my framework?
Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt it to your framework of choice.

© 2026 Scavio. All rights reserved.
