Tutorial

How to Build a Reddit Market Research Scanner

Scan Reddit for product mentions, pain points, and feature requests across subreddits. Python pipeline at $0.005/query for market research.

Reddit threads contain unfiltered product feedback, feature requests, and competitive complaints that surveys rarely capture. This scanner searches Reddit for market signals, classifies them by type (pain point, feature request, comparison, purchase intent, switching intent, churn), and outputs a research report ranked by how often each competitor and pain point is mentioned. Each search costs $0.005 via the Scavio Reddit endpoint.

Prerequisites

  • Python 3.8+
  • requests library
  • A Scavio API key from scavio.dev
  • Target product category or market to research

Walkthrough

Step 1: Configure the market research scanner

Set up search queries targeting different market signal types.

Python
import os, requests
from collections import defaultdict

API_KEY = os.environ['SCAVIO_API_KEY']
SH = {'x-api-key': API_KEY, 'Content-Type': 'application/json'}

def market_queries(product):
    return [
        f'{product} alternative to',
        f'{product} looking for recommendation',
        f'{product} vs',
        f'{product} problem with',
        f'{product} wish feature',
        f'{product} switched from',
    ]

PRODUCT = 'serp api'
queries = market_queries(PRODUCT)
print(f'Market research for "{PRODUCT}": {len(queries)} signal queries')
print(f'Estimated cost: ${len(queries) * 0.005:.3f}')
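
The same cost arithmetic extends to several products at once. The following is a minimal sketch, assuming the $0.005 per-query price quoted above and the six queries that market_queries() returns per product. estimate_cost is a hypothetical helper written for illustration, not part of the Scavio API.

```python
from decimal import Decimal

PRICE_PER_QUERY = Decimal('0.005')  # price quoted in this tutorial; confirm against your plan
QUERIES_PER_PRODUCT = 6             # length of the list returned by market_queries()

def estimate_cost(products, runs=1):
    """Estimated spend in dollars for scanning every product `runs` times."""
    total_queries = len(products) * QUERIES_PER_PRODUCT * runs
    return total_queries * PRICE_PER_QUERY

budget = estimate_cost(['serp api', 'web scraping api', 'search api'])
print(f'Estimated cost for 3 products: ${budget}')
```

Decimal keeps the sub-cent amounts exact. Note that market_report() in Step 4 calls scan_signals() again, so running Step 2 and Step 4 in the same script issues each query twice; pass runs=2 to budget for that.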

Step 2: Scan Reddit for market signals

Execute searches and extract structured market signals from results.

Python
SIGNAL_TYPES = {
    'alternative to': 'switching_intent',
    'looking for': 'purchase_intent',
    'recommendation': 'purchase_intent',
    'vs': 'comparison',
    'problem with': 'pain_point',
    'wish feature': 'feature_request',
    'switched from': 'churn_signal'
}

def scan_signals(product):
    signals = defaultdict(list)
    for query in market_queries(product):
        resp = requests.post('https://api.scavio.dev/api/v1/search',
            headers=SH, json={'query': query, 'platform': 'reddit', 'country_code': 'us'},
            timeout=30)
        resp.raise_for_status()  # fail loudly on auth, quota, or server errors
        data = resp.json()
        # Classify on the query suffix only, so a product name that happens
        # to contain 'vs' (e.g. 'devs tool') cannot mislabel the query
        suffix = query[len(product):]
        signal_type = next((v for k, v in SIGNAL_TYPES.items() if k in suffix), 'other')
        for r in data.get('organic_results', [])[:5]:
            signals[signal_type].append({
                'title': r.get('title', '')[:80],
                'snippet': r.get('snippet', '')[:150],
                'link': r.get('link', '')
            })
    return dict(signals)

signals = scan_signals(PRODUCT)
for stype, items in signals.items():
    print(f'\n{stype}: {len(items)} signals')
    for item in items[:2]:
        print(f'  - {item["title"][:60]}')
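
Because the six queries overlap, the same Reddit thread can surface under more than one signal type and inflate the counts. The following is a minimal sketch of de-duplicating by link, assuming the dictionary shape that scan_signals() returns. dedupe_signals is a hypothetical helper and the sample threads are made up for illustration.

```python
from collections import defaultdict

def dedupe_signals(signals):
    """Keep only the first occurrence of each link across all signal types."""
    seen = set()
    unique = defaultdict(list)
    for signal_type, items in signals.items():
        for item in items:
            link = item.get('link', '')
            if link and link in seen:
                continue  # thread already recorded under an earlier signal type
            seen.add(link)
            unique[signal_type].append(item)
    return dict(unique)

# Made-up sample data in the same shape as scan_signals() output
sample = {
    'switching_intent': [{'title': 'Thread A', 'snippet': '', 'link': 'https://www.reddit.com/r/example/comments/a1/'}],
    'comparison': [
        {'title': 'Thread A', 'snippet': '', 'link': 'https://www.reddit.com/r/example/comments/a1/'},
        {'title': 'Thread B', 'snippet': '', 'link': 'https://www.reddit.com/r/example/comments/b2/'},
    ],
}
deduped = dedupe_signals(sample)
for signal_type, items in deduped.items():
    print(signal_type, [item['title'] for item in items])
```

A thread is credited to the first signal type it appears under, which follows the query order in market_queries(). Items without a link are always kept, since there is nothing to compare them on.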

Step 3: Extract competitors and pain points

Parse signals to identify mentioned competitors and recurring pain points.

Python
import re

KNOWN_COMPETITORS = ['serpapi', 'dataforseo', 'serper', 'scrapingbee', 'brightdata', 'apify', 'tavily', 'exa']
PAIN_KEYWORDS = ['slow', 'expensive', 'unreliable', 'broken', 'complex', 'limited',
                 'missing', 'annoying', 'frustrating', 'confusing']

def extract_competitors(signals):
    competitors = defaultdict(int)
    for items in signals.values():
        for item in items:
            text = f"{item['title']} {item['snippet']}".lower()
            for comp in KNOWN_COMPETITORS:
                # Whole-word match so 'exa' does not count 'example' or 'exact'
                if re.search(rf'\b{re.escape(comp)}\b', text):
                    competitors[comp] += 1
    # Most-mentioned competitors first
    return dict(sorted(competitors.items(), key=lambda x: -x[1]))

def extract_pain_points(signals):
    pains = defaultdict(int)
    # Only pain-point and churn threads are scanned for pain keywords
    for item in signals.get('pain_point', []) + signals.get('churn_signal', []):
        text = f"{item['title']} {item['snippet']}".lower()
        for kw in PAIN_KEYWORDS:
            if kw in text:
                pains[kw] += 1
    return dict(sorted(pains.items(), key=lambda x: -x[1]))

comps = extract_competitors(signals)
pains = extract_pain_points(signals)
print(f'\nCompetitors mentioned: {comps}')
print(f'Pain points: {pains}')
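
It can also help to see which subreddits the signals come from. The following is a minimal sketch, assuming each result's link field holds a standard Reddit thread URL of the form https://www.reddit.com/r/<subreddit>/comments/.... count_subreddits is a hypothetical helper and the sample links are made up.

```python
import re
from collections import Counter

# Captures the subreddit name from a standard Reddit thread URL
SUBREDDIT_RE = re.compile(r'reddit\.com/r/([A-Za-z0-9_]+)', re.IGNORECASE)

def count_subreddits(signals):
    """Return (subreddit, count) pairs, most frequent first."""
    counts = Counter()
    for items in signals.values():
        for item in items:
            match = SUBREDDIT_RE.search(item.get('link', ''))
            if match:
                counts[match.group(1).lower()] += 1
    return counts.most_common()

# Made-up sample data in the same shape as scan_signals() output
sample = {
    'pain_point': [
        {'link': 'https://www.reddit.com/r/SEO/comments/a1/'},
        {'link': 'https://www.reddit.com/r/webscraping/comments/b2/'},
    ],
    'comparison': [
        {'link': 'https://www.reddit.com/r/seo/comments/c3/'},
        {'link': 'https://example.com/not-reddit'},
    ],
}
subreddit_counts = count_subreddits(sample)
print(subreddit_counts)
```

Links that do not match the pattern are skipped rather than counted under a placeholder, so the totals can be lower than the number of signals.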

Step 4: Generate market research report

Combine all signals into a structured market research report.

Python
def market_report(product):
    signals = scan_signals(product)
    competitors = extract_competitors(signals)
    pains = extract_pain_points(signals)
    cost = len(market_queries(product)) * 0.005
    print(f'\n=== Market Research Report: {product} ===')
    print(f'\nSignal summary:')
    for stype, items in signals.items():
        print(f'  {stype:20}: {len(items)} signals')
    print(f'\nTop competitors mentioned:')
    for comp, count in list(competitors.items())[:5]:
        print(f'  {comp:20}: {count} mentions')
    print(f'\nTop pain points:')
    for pain, count in list(pains.items())[:5]:
        print(f'  {pain:20}: {count} mentions')
    # High-intent signals
    purchase = signals.get('purchase_intent', [])
    print(f'\nHigh-intent threads ({len(purchase)}):')
    for p in purchase[:3]:
        print(f'  - {p["title"][:60]}')
    print(f'\nCost: ${cost:.3f}')

market_report('serp api')
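
The report above lists signals by type but does not rank individual threads. One way to prioritize them is to weight each signal type. The following is a minimal sketch: the weights are hypothetical values chosen for illustration, not something the tutorial or the Scavio API prescribes, and prioritize is a made-up helper name.

```python
# Hypothetical weights: higher means closer to a buying or switching decision
SIGNAL_WEIGHTS = {
    'purchase_intent': 5,
    'switching_intent': 4,
    'churn_signal': 4,
    'pain_point': 3,
    'feature_request': 2,
    'comparison': 1,
}

def prioritize(signals, top_n=10):
    """Return up to top_n (weight, signal_type, title) tuples, highest weight first."""
    ranked = []
    for signal_type, items in signals.items():
        weight = SIGNAL_WEIGHTS.get(signal_type, 0)  # unknown types rank last
        for item in items:
            ranked.append((weight, signal_type, item['title']))
    # sort() is stable, so threads with equal weight keep their original order
    ranked.sort(key=lambda row: -row[0])
    return ranked[:top_n]

# Made-up sample data in the same shape as scan_signals() output
sample = {
    'comparison': [{'title': 'Tool A vs Tool B'}],
    'purchase_intent': [{'title': 'Looking for a SERP API recommendation'}],
    'other': [{'title': 'Unclassified thread'}],
}
for weight, signal_type, title in prioritize(sample):
    print(f'{weight}  {signal_type:18} {title}')
```

Tune the weights to the research question: a team studying retention might rank churn_signal above purchase_intent.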

Python Example

Python
import os, requests
from collections import defaultdict
SH = {'x-api-key': os.environ['SCAVIO_API_KEY'], 'Content-Type': 'application/json'}

def scan(product):
    signals = defaultdict(list)
    for q in [f'{product} alternative', f'{product} vs', f'{product} problem']:
        data = requests.post('https://api.scavio.dev/api/v1/search',
            headers=SH, json={'query': q, 'platform': 'reddit', 'country_code': 'us'}).json()
        for r in data.get('organic_results', [])[:3]:
            signals[q.split()[-1]].append(r.get('title', '')[:60])
    for stype, items in signals.items():
        print(f'{stype}: {len(items)} signals')
        for i in items[:2]: print(f'  - {i}')

scan('serp api')

JavaScript Example

JavaScript
const SH = { 'x-api-key': process.env.SCAVIO_API_KEY, 'Content-Type': 'application/json' };
async function scan(product) {
  for (const suffix of ['alternative', 'vs', 'problem']) {
    const data = await fetch('https://api.scavio.dev/api/v1/search', {
      method: 'POST', headers: SH,
      body: JSON.stringify({ query: `${product} ${suffix}`, platform: 'reddit', country_code: 'us' })
    }).then(r => r.json());
    console.log(`${suffix}: ${(data.organic_results || []).length} results`);
    // Guard against results without a title so slice() never throws
    (data.organic_results || []).slice(0, 2).forEach(r => console.log(`  - ${(r.title || '').slice(0, 60)}`));
  }
}
scan('serp api').catch(console.error);

Expected Output

Text
Market research for "serp api": 6 signal queries
Estimated cost: $0.030

switching_intent: 8 signals
  - Looking for SerpAPI alternative, too expensive for startup
  - Switched from SerpAPI to something cheaper
purchase_intent: 6 signals
comparison: 7 signals
pain_point: 5 signals

Top competitors mentioned:
  serpapi             : 8 mentions
  dataforseo          : 5 mentions
  serper              : 3 mentions

Cost: $0.030

Frequently Asked Questions

How long does this tutorial take?

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

What do I need before starting?

Python 3.8+, the requests library, a Scavio API key from scavio.dev, and a target product category or market to research. A Scavio API key gives you 250 free credits per month.

Can I complete this tutorial on the free tier?

Yes. The free tier includes 250 credits per month, which is more than enough to complete this tutorial and prototype a working solution.

Does this work with my framework?

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt it to your framework of choice.
