Tutorial

How to Classify Reddit Post Intent for Leads

Classify Reddit posts by purchase intent (research, comparison, pricing, complaint, and ready-to-switch) with a SERP-based pipeline for lead scoring.

Not all Reddit mentions are equal. A post asking 'what is the best X' signals research intent, while 'looking for alternative to Y' signals ready-to-switch intent. This classifier scores Reddit posts by purchase intent using keyword patterns matched against each post's title and search snippet, helping you prioritize which discussions to engage with.

Prerequisites

  • Python 3.8+
  • requests library
  • A Scavio API key from scavio.dev
  • Target product category or niche

Walkthrough

Step 1: Fetch Reddit posts via SERP

Search Reddit for posts in your product category.

Python
import os
from collections import Counter  # used in Step 2 for the intent distribution

import requests

API_KEY = os.environ['SCAVIO_API_KEY']
SH = {'x-api-key': API_KEY, 'Content-Type': 'application/json'}

CATEGORY = 'search api'
# Each query targets a different intent signal; site:reddit.com limits results to Reddit.
QUERIES = [
    f'{CATEGORY} recommendation site:reddit.com',
    f'best {CATEGORY} site:reddit.com',
    f'alternative to {CATEGORY} site:reddit.com',
    f'{CATEGORY} pricing site:reddit.com',
    f'{CATEGORY} not working site:reddit.com',
]

def fetch_reddit_posts(query):
    """Run one SERP query and return title, snippet, and link for each organic result."""
    resp = requests.post('https://api.scavio.dev/api/v1/search',
        headers=SH, json={'query': query, 'country_code': 'us'}, timeout=30)
    resp.raise_for_status()  # stop on HTTP errors instead of parsing an error body
    data = resp.json()
    return [{'title': r.get('title') or '', 'snippet': r.get('snippet') or '',
             'link': r.get('link') or ''} for r in data.get('organic_results', [])]

all_posts = []
for q in QUERIES:
    all_posts.extend(fetch_reddit_posts(q))
# Cost estimate: one request per query at $0.005 each
print(f'Fetched {len(all_posts)} Reddit posts. Cost: ${len(QUERIES) * 0.005:.3f}')
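
Because the five queries overlap, the same thread can come back more than once, and a site: filter does not strictly guarantee that every result is a Reddit URL. A minimal sketch of a cleanup pass, assuming each post is a dict with a 'link' key as built above; dedupe_reddit_posts is a hypothetical helper, not part of the Scavio API, and the sample links are made up for illustration:

```python
from urllib.parse import urlparse

def dedupe_reddit_posts(posts):
    """Keep the first occurrence of each link and drop non-Reddit hosts."""
    seen = set()
    unique = []
    for post in posts:
        link = post.get('link', '')
        host = (urlparse(link).hostname or '').lower()
        # Accept reddit.com and its subdomains (www, old, and so on)
        if host != 'reddit.com' and not host.endswith('.reddit.com'):
            continue
        key = link.rstrip('/')  # treat trailing-slash variants as the same thread
        if key in seen:
            continue
        seen.add(key)
        unique.append(post)
    return unique

# Hypothetical sample data for illustration only
sample = [
    {'title': 'Best search API?', 'snippet': '', 'link': 'https://www.reddit.com/r/example/comments/abc/'},
    {'title': 'Best search API?', 'snippet': '', 'link': 'https://www.reddit.com/r/example/comments/abc'},
    {'title': 'Off-site result', 'snippet': '', 'link': 'https://example.com/post'},
]
print(len(dedupe_reddit_posts(sample)))
```

Running all_posts through a pass like this before classification keeps the intent counts from being inflated by repeated threads.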

Step 2: Classify posts by purchase intent

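Score each post against five intent categories: ready-to-switch, complaint, comparison, pricing, and research. When a post matches several categories, the one with the highest score wins.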

Python
# Each intent has trigger keywords, a lead score (higher = closer to buying), and a display label.
INTENT_PATTERNS = {
    'ready_to_switch': {
        'keywords': ['alternative to', 'switching from', 'replacing', 'migrate from', 'looking for replacement'],
        'score': 90,
        'label': 'READY TO SWITCH'
    },
    'comparison': {
        'keywords': ['vs', 'versus', 'compared to', 'better than', 'which is better', 'comparison'],
        'score': 70,
        'label': 'COMPARING'
    },
    'research': {
        'keywords': ['best', 'recommend', 'suggestion', 'what do you use', 'looking for', 'need a'],
        'score': 50,
        'label': 'RESEARCHING'
    },
    'complaint': {
        'keywords': ['not working', 'broken', 'issues with', 'problems', 'frustrated', 'terrible'],
        'score': 80,
        'label': 'FRUSTRATED (competitor)'
    },
    'pricing': {
        'keywords': ['pricing', 'cost', 'expensive', 'cheap', 'free tier', 'budget'],
        'score': 60,
        'label': 'PRICE SENSITIVE'
    }
}

def classify_intent(post):
    """Return (intent, score) for the highest-scoring intent with a keyword in the post."""
    text = f'{post["title"]} {post["snippet"]}'.lower()
    best_intent = None
    best_score = 0
    for intent, config in INTENT_PATTERNS.items():
        # Plain substring match: short keywords such as 'vs' can also hit inside longer words.
        matched = any(kw in text for kw in config['keywords'])
        if matched and config['score'] > best_score:
            best_intent = intent
            best_score = config['score']
    # Posts with no keyword hit fall back to 'general' with a score of 0.
    return best_intent or 'general', best_score

# Annotate every post in place with its intent and score.
for post in all_posts:
    intent, score = classify_intent(post)
    post['intent'] = intent
    post['intent_score'] = score

intents = Counter(p['intent'] for p in all_posts)
print('\nIntent Distribution:')
for intent, count in intents.most_common():
    label = INTENT_PATTERNS.get(intent, {}).get('label', 'GENERAL')
    print(f'  {label:25} | {count:3} posts')
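
The kw in text check in classify_intent is a plain substring test, so a short keyword can fire inside an unrelated word: 'vs' is a substring of 'devs'. One possible tightening, sketched below, anchors each keyword at the start of a word with a regular expression, so 'recommend' still matches 'recommendations'. matches_keyword is a hypothetical helper and the sample strings are made up; this is an optional variation, not part of the tutorial's pipeline.

```python
import re

def matches_keyword(text, keyword):
    """True when keyword starts at a word boundary in text (case-insensitive)."""
    pattern = r'\b' + re.escape(keyword)
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

# Substring test: fires because 'devs' contains 'vs'
print('vs' in 'our devs love it')
# Word-start test: 'vs' inside 'devs' is not at a word boundary
print(matches_keyword('our devs love it', 'vs'))
# A standalone 'vs' still matches
print(matches_keyword('Scavio vs Tavily', 'vs'))
# Prefix matches are kept, so plural and derived forms still count
print(matches_keyword('Any recommendations?', 'recommend'))
```

To try it in Step 2, the any(...) or sum(...) expression over config['keywords'] could call matches_keyword(text, kw) instead of kw in text. The trade-off is that a keyword appearing only in the middle of a word no longer counts.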

Step 3: Prioritize leads by intent score

Rank posts by lead quality and output a prioritized engagement list.

Python
def lead_priority_report(posts):
    """Print posts grouped into three tiers by intent score, highest score first."""
    def label_for(post):
        return INTENT_PATTERNS.get(post['intent'], {}).get('label', 'UNKNOWN')

    sorted_posts = sorted(posts, key=lambda p: p.get('intent_score', 0), reverse=True)
    print('\n=== Lead Priority Report ===')
    print(f'  Total posts: {len(sorted_posts)}')
    # Tier 1: High intent (score >= 80), shown with links for immediate follow-up
    tier1 = [p for p in sorted_posts if p.get('intent_score', 0) >= 80]
    print(f'\n  TIER 1 - High Intent ({len(tier1)} posts, engage immediately):')
    for p in tier1[:5]:
        print(f'    [{label_for(p):20}] {p["title"][:50]}')
        print(f'      {p["link"][:60]}')
    # Tier 2: Medium intent (50-79)
    tier2 = [p for p in sorted_posts if 50 <= p.get('intent_score', 0) < 80]
    print(f'\n  TIER 2 - Medium Intent ({len(tier2)} posts, monitor):')
    for p in tier2[:5]:
        print(f'    [{label_for(p):20}] {p["title"][:50]}')
    # Tier 3: Low intent (below 50, which in practice means unmatched 'general' posts)
    tier3 = [p for p in sorted_posts if p.get('intent_score', 0) < 50]
    print(f'\n  TIER 3 - Low Intent ({len(tier3)} posts, skip or batch)')
    print(f'\n  Cost: ${len(QUERIES) * 0.005:.3f}')

lead_priority_report(all_posts)
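
The report prints to the console, which is awkward to hand to whoever does the outreach. A minimal sketch of exporting the same ranking as CSV, assuming each post carries the 'intent' and 'intent_score' keys added in Step 2; leads_to_csv and its min_score parameter are hypothetical (the default of 50 mirrors the Tier 2 lower bound), and the sample posts are made up for illustration:

```python
import csv
import io

def leads_to_csv(posts, min_score=50):
    """Return CSV text for posts at or above min_score, highest score first."""
    rows = sorted(
        (p for p in posts if p.get('intent_score', 0) >= min_score),
        key=lambda p: p['intent_score'],
        reverse=True,
    )
    buf = io.StringIO()
    # extrasaction='ignore' skips keys such as 'snippet' that are not exported
    writer = csv.DictWriter(
        buf, fieldnames=['intent_score', 'intent', 'title', 'link'], extrasaction='ignore'
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

# Hypothetical sample data for illustration only
sample_posts = [
    {'title': 'Best search API?', 'snippet': '', 'link': 'https://www.reddit.com/r/example/1',
     'intent': 'research', 'intent_score': 50},
    {'title': 'Alternative to X', 'snippet': '', 'link': 'https://www.reddit.com/r/example/2',
     'intent': 'ready_to_switch', 'intent_score': 90},
    {'title': 'Random thread', 'snippet': '', 'link': 'https://www.reddit.com/r/example/3',
     'intent': 'general', 'intent_score': 0},
]
print(leads_to_csv(sample_posts))
```

Calling leads_to_csv(all_posts) after Step 2 and writing the returned string to a file gives a list that can be opened in a spreadsheet; raising min_score to 80 would limit it to Tier 1.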

Python Example

Python
import os
import requests

SH = {'x-api-key': os.environ['SCAVIO_API_KEY'], 'Content-Type': 'application/json'}

def classify_reddit(query):
    """Fetch the top Reddit results for a query and print a rough intent tag for each title."""
    resp = requests.post('https://api.scavio.dev/api/v1/search',
        headers=SH, json={'query': f'{query} site:reddit.com', 'country_code': 'us'}, timeout=30)
    resp.raise_for_status()  # stop on HTTP errors instead of parsing an error body
    data = resp.json()
    for r in data.get('organic_results', [])[:3]:
        title = r.get('title', '').lower()
        # Two-keyword shortcut of the full classifier: 'alternative' outranks 'best'
        intent = 'SWITCH' if 'alternative' in title else 'RESEARCH' if 'best' in title else 'GENERAL'
        print(f'  [{intent}] {r.get("title", "")[:50]}')

classify_reddit('search api recommendation')
print('Cost: $0.005')

JavaScript Example

JavaScript
// Top-level await needs an ES module (for example a .mjs file) on a Node.js version with built-in fetch.
const SH = { 'x-api-key': process.env.SCAVIO_API_KEY, 'Content-Type': 'application/json' };
const res = await fetch('https://api.scavio.dev/api/v1/search', {
  method: 'POST', headers: SH,
  body: JSON.stringify({ query: 'search api recommendation site:reddit.com', country_code: 'us' })
});
if (!res.ok) throw new Error(`Search request failed: ${res.status}`);
const data = await res.json();
(data.organic_results || []).slice(0, 3).forEach(r => {
  const title = r.title || '';  // guard against results without a title
  const t = title.toLowerCase();
  const intent = t.includes('alternative') ? 'SWITCH' : t.includes('best') ? 'RESEARCH' : 'GENERAL';
  console.log(`[${intent}] ${title.slice(0, 50)}`);
});

Expected Output

Text
Fetched 35 Reddit posts. Cost: $0.025

Intent Distribution:
  RESEARCHING               |  12 posts
  READY TO SWITCH           |   8 posts
  COMPARING                 |   7 posts
  PRICE SENSITIVE           |   4 posts
  FRUSTRATED (competitor)   |   2 posts
  GENERAL                   |   2 posts

=== Lead Priority Report ===
  Total posts: 35

  TIER 1 - High Intent (10 posts, engage immediately):
    [READY TO SWITCH     ] Looking for alternative to SerpAPI, any sugges...
    [FRUSTRATED (competitor)] Tavily not working after Nebius acquisition...

  TIER 2 - Medium Intent (23 posts, monitor):
    [COMPARING           ] Scavio vs Tavily for AI agent search...

Frequently Asked Questions

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

You need Python 3.8+, the requests library, a Scavio API key from scavio.dev, and a target product category or niche. A Scavio API key gives you 250 free credits per month.

The free tier includes 250 credits per month, which is more than enough to complete this tutorial and prototype a working solution.

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt to your framework of choice.
