Tutorial

How to Scrape Google News with Python and Scavio

Two methods to scrape Google News in 2026 without proxies or a headless browser. Typed JSON results in seconds via Scavio's news endpoint.

Scraping Google News in 2026 is a minefield of broken libraries, captcha walls, and rotating proxy bills. The faster path is a SERP API with `search_type: news` that returns the same articles as typed JSON. This tutorial shows two methods: a direct API call for daily news pulls, and a scheduled aggregator for a brand or topic monitor.

Prerequisites

  • Python 3.10+
  • A Scavio API key

Walkthrough

Step 1: Pull Google News for a query

One Scavio call returns title, snippet, source, and date.

Python
import requests, os
API_KEY = os.environ['SCAVIO_API_KEY']

def news(query):
    r = requests.post('https://api.scavio.dev/api/v1/google',
        headers={'x-api-key': API_KEY},
        json={'query': query, 'search_type': 'news', 'num_results': 20})
    r.raise_for_status()  # surface auth/quota errors instead of silently returning []
    return r.json().get('news_results', [])

Step 2: Filter by date range

Pass `time_range` to scope to recent days.

Python
def fresh_news(query, days=1):
    r = requests.post('https://api.scavio.dev/api/v1/google',
        headers={'x-api-key': API_KEY},
        json={'query': query, 'search_type': 'news', 'time_range': f'd{days}'})
    return r.json().get('news_results', [])

Step 3: Build a brand monitor

Schedule a daily run and dedupe by URL.

Python
import json, pathlib
seen = pathlib.Path('seen.json')
# Tolerate a missing file on the first run
known = json.loads(seen.read_text()) if seen.exists() else []

def monitor(brand):
    for item in fresh_news(brand, days=1):
        if item['link'] not in known:
            known.append(item['link'])
            print(f"NEW: {item['title']} - {item['source']}")
    seen.write_text(json.dumps(known))
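The dedupe step above can be factored into a small pure helper, which is easy to test in isolation. This is a sketch: `new_items` and the set-based `known` are names introduced here, not part of Scavio.

```python
def new_items(items, known):
    """Return items whose 'link' has not been seen before.

    `known` is a set of URLs, updated in place so repeated calls
    across runs only ever report an article once.
    """
    fresh = []
    for item in items:
        if item['link'] not in known:
            known.add(item['link'])
            fresh.append(item)
    return fresh
```

With this, `monitor` reduces to printing the result of `new_items(...)` and persisting `known` between runs.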

Step 4: Add Reddit cross-check

Validate news with a Reddit thread search.

Python
def cross_check(brand, headline):
    r = requests.post('https://api.scavio.dev/api/v1/reddit/search',
        headers={'x-api-key': API_KEY},
        json={'query': f'{brand} {headline[:50]}'})
    return r.json().get('posts', [])[:5]

Step 5: Email the digest

Use any SMTP library to send a markdown digest.

Python
# Email body composed from `monitor` output;
# omitted SMTP wiring for brevity.
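A minimal stdlib-only sketch of the digest step, splitting the markdown formatting from the SMTP send so the formatter can be tested without a mail server. The host, sender, and recipient below are placeholder assumptions, not values from Scavio.

```python
import smtplib
from email.message import EmailMessage

def format_digest(articles):
    # One markdown bullet per article: title, source, link
    return '\n'.join(
        f"- **{a['title']}** ({a['source']})\n  {a['link']}" for a in articles
    )

def send_digest(articles, host='localhost', sender='monitor@example.com',
                recipient='you@example.com'):
    msg = EmailMessage()
    msg['Subject'] = f"News digest: {len(articles)} new article(s)"
    msg['From'] = sender
    msg['To'] = recipient
    msg.set_content(format_digest(articles))
    with smtplib.SMTP(host) as smtp:  # real providers usually need SMTP_SSL + login
        smtp.send_message(msg)
```

Feed it the new articles collected by the monitor once per day.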

Python Example

Python
import os, requests
API_KEY = os.environ['SCAVIO_API_KEY']

def google_news(query):
    r = requests.post('https://api.scavio.dev/api/v1/google',
        headers={'x-api-key': API_KEY},
        json={'query': query, 'search_type': 'news', 'num_results': 20})
    return r.json().get('news_results', [])

for a in google_news('openai gpt-5.5 launch'):
    print(a['title'], '-', a['source'])

JavaScript Example

JavaScript
const API_KEY = process.env.SCAVIO_API_KEY;
export async function googleNews(query) {
  const r = await fetch('https://api.scavio.dev/api/v1/google', {
    method: 'POST',
    headers: { 'x-api-key': API_KEY, 'Content-Type': 'application/json' },
    body: JSON.stringify({ query, search_type: 'news', num_results: 20 })
  });
  return (await r.json()).news_results || [];
}

Expected Output

A typed JSON list of news articles with title, snippet, source, date, and URL. The daily monitor emits only new articles since the last run.
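For reference, one `news_results` item has roughly this shape. The field names come from this tutorial; the values are placeholders, not real API output.

```python
article = {
    'title': 'Example headline',
    'snippet': 'One-sentence summary of the article.',
    'source': 'Example News',
    'date': '2026-01-15',
    'link': 'https://example.com/article',
}
print(article['title'], '-', article['source'])
```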

Frequently Asked Questions

How long does this tutorial take?

Most developers complete it in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

What are the prerequisites?

Python 3.10+ and a Scavio API key, which includes 500 free credits per month.

Can I complete this tutorial on the free tier?

Yes. The free tier includes 500 credits per month, which is more than enough to complete this tutorial and prototype a working solution.

Does Scavio integrate with frameworks like LangChain?

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt it to your framework of choice.

Start Building

Grab a free Scavio API key and pull your first typed Google News results in minutes.