
YouTube Comment Sentiment Without Scraping

YouTube blocks scrapers aggressively. An API approach returns comments as structured JSON instead. A Python sentiment analysis pipeline is included.

May 17, 2026

8 min

YouTube aggressively blocks comment scrapers in 2026, and the official Data API caps default usage at roughly 10,000 quota units per day, shared across every call you make. A search API that returns YouTube comments as structured JSON sidesteps both problems: there is no scraping infrastructure to maintain, no Google Cloud credentials to set up, and no quota to manage. Here is how to build a sentiment pipeline on top of it.

Why scraping YouTube comments fails at scale

YouTube renders comments via JavaScript behind anti-bot protections. Headless browsers get rate-limited after a few hundred requests, and residential proxies add $50-200/month in cost. The official API requires a Google Cloud project and credentials, enforces daily quotas, and returns replies through nested pagination that is tedious to handle. TubeMine proved you can build an extractor in 2.5 hours, but maintaining it against YouTube changes is the real cost.

The API approach: structured comments as JSON

Python

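import os

import requests

def get_video_comments(video_url: str, num_results: int = 50) -> list:
    """Fetch YouTube comments as structured JSON via search API."""
    resp = requests.post(
        "https://api.scavio.dev/api/v1/search",
        # The API key is read from the environment, never hard-coded
        headers={"x-api-key": os.environ["SCAVIO_API_KEY"]},
        json={
            "query": video_url,
            "platform": "youtube",
            "num_results": num_results,
        },
        timeout=15,
    )
    # Fail loudly on 4xx/5xx instead of parsing an error body as data
    resp.raise_for_status()
    data = resp.json()
    # Fall back to an empty list when the response has no "comments" key
    return data.get("comments", [])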

# Each comment object: {author, text, likes, published_at, reply_count}
comments = get_video_comments("https://youtube.com/watch?v=example123")
print(f"Fetched {len(comments)} comments")
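
The comment above lists the fields each comment object carries. As a minimal sketch, those dicts could be normalized into typed records before analysis. `Comment` and `parse_comment` are hypothetical names, and the code assumes `published_at` arrives as an ISO 8601 string and that the counts are integers, neither of which the field list confirms.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Comment:
    author: str
    text: str
    likes: int
    reply_count: int
    published_at: Optional[datetime]


def parse_comment(raw: dict) -> Comment:
    """Convert one raw comment dict into a typed record, tolerating missing fields."""
    published = None
    stamp = raw.get("published_at")
    if isinstance(stamp, str):
        try:
            # Assumption: ISO 8601 timestamps. A trailing "Z" is mapped to
            # "+00:00" so parsing also works on Python versions before 3.11.
            published = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
        except ValueError:
            published = None  # unparseable timestamp: keep the comment, drop the date
    return Comment(
        author=str(raw.get("author", "")),
        text=str(raw.get("text", "")).strip(),
        likes=int(raw.get("likes") or 0),
        reply_count=int(raw.get("reply_count") or 0),
        published_at=published,
    )
```

With this in place, `[parse_comment(c) for c in comments]` gives the rest of the pipeline one consistent shape to work with, even when a field is missing from a response.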

Building the sentiment pipeline

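Once you have comments as structured data, sentiment analysis is straightforward. Use a lightweight classifier (TextBlob's lexicon-based scorer for speed, or a small transformer for accuracy) to score each comment, then aggregate by video, channel, or time period. TextBlob reports polarity on a scale from -1.0 to 1.0, and the thresholds of 0.1 and -0.1 used here leave a narrow neutral band around zero.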

Python

from textblob import TextBlob

def analyze_sentiment(comments: list) -> dict:
    """Classify comments into positive/negative/neutral with scores."""
    results = {"positive": [], "negative": [], "neutral": []}
    scores = []

    for comment in comments:
        text = comment.get("text", "")
        if not text:
            continue

        blob = TextBlob(text)
        polarity = blob.sentiment.polarity
        scores.append(polarity)

        entry = {
            "text": text,
            "author": comment.get("author", ""),
            "likes": comment.get("likes", 0),
            "polarity": polarity,
        }

        if polarity > 0.1:
            results["positive"].append(entry)
        elif polarity < -0.1:
            results["negative"].append(entry)
        else:
            results["neutral"].append(entry)

    total = len(scores)
    return {
        "total_comments": total,
        "positive_pct": len(results["positive"]) / total * 100 if total else 0,
        "negative_pct": len(results["negative"]) / total * 100 if total else 0,
        "neutral_pct": len(results["neutral"]) / total * 100 if total else 0,
        "avg_polarity": sum(scores) / total if total else 0,
        "most_negative": sorted(results["negative"], key=lambda x: x["polarity"])[:5],
        "most_positive": sorted(results["positive"], key=lambda x: x["polarity"], reverse=True)[:5],
    }

comments = get_video_comments("https://youtube.com/watch?v=example123")
sentiment = analyze_sentiment(comments)
print(f"Sentiment: {sentiment['positive_pct']:.0f}% positive, "
      f"{sentiment['negative_pct']:.0f}% negative")
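
The text mentions aggregating by time period, which `analyze_sentiment` does not do because it discards `published_at`. A rough sketch of weekly bucketing follows. `polarity_by_week` is a hypothetical helper, the scorer is passed in as a callable so any classifier can be used, and the ISO 8601 timestamp format is an assumption about the API response.

```python
from collections import defaultdict
from datetime import datetime
from typing import Callable


def polarity_by_week(comments: list, score: Callable[[str], float]) -> dict:
    """Average polarity per ISO week, keyed by labels such as '2026-W20'."""
    buckets = defaultdict(list)
    for comment in comments:
        text = comment.get("text", "")
        stamp = comment.get("published_at")
        if not text or not isinstance(stamp, str):
            continue
        try:
            # Assumption: ISO 8601 timestamps, possibly with a trailing "Z"
            when = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
        except ValueError:
            continue  # skip timestamps that are not ISO 8601
        year, week, _ = when.isocalendar()
        buckets[f"{year}-W{week:02d}"].append(score(text))
    # Sort by week label so the result reads chronologically
    return {label: sum(vals) / len(vals) for label, vals in sorted(buckets.items())}
```

To use it with the TextBlob scorer from the pipeline above, pass `lambda t: TextBlob(t).sentiment.polarity` as `score`.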

Scaling to channel-level monitoring

Python

def channel_sentiment_monitor(channel_videos: list[str]) -> dict:
    """Monitor sentiment across an entire channel's recent videos."""
    channel_data = {}

    for video_url in channel_videos:
        comments = get_video_comments(video_url, num_results=30)
        sentiment = analyze_sentiment(comments)
        channel_data[video_url] = sentiment

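    # Aggregate channel-level metrics, skipping videos that returned no comments
    scored = {url: s for url, s in channel_data.items() if s["total_comments"]}
    if not scored:
        # Nothing to average: avoid dividing by zero and calling min()/max() on empty data
        return {"videos_analyzed": len(channel_videos), "channel_avg_polarity": 0,
                "most_controversial": None, "most_loved": None}
    all_polarities = [s["avg_polarity"] for s in scored.values()]
    return {
        "videos_analyzed": len(channel_videos),
        "channel_avg_polarity": sum(all_polarities) / len(all_polarities),
        # Lowest average polarity: this measures negativity, not disagreement
        "most_controversial": min(scored.items(), key=lambda x: x[1]["avg_polarity"]),
        "most_loved": max(scored.items(), key=lambda x: x[1]["avg_polarity"]),
    }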

# Monitor 20 videos: 20 queries * $0.005 = $0.10
channel_report = channel_sentiment_monitor([
    "https://youtube.com/watch?v=video1",
    "https://youtube.com/watch?v=video2",
    # ... 20 videos
])

Use cases for YouTube comment sentiment

Brand monitoring: track sentiment on videos mentioning your product
Content strategy: identify which video topics generate positive engagement
Competitor analysis: compare audience sentiment across competing channels
Crisis detection: alert when negative sentiment spikes on your channel
Product feedback: extract feature requests and complaints from tutorial comments

Cost breakdown

At $0.005 per API request, monitoring 50 videos daily (one request per video, 30 comments each) costs $0.25/day, or $7.50 over a 30-day month. Compare this with maintaining scraping infrastructure with proxies ($50-200/month) or paying for a YouTube analytics tool ($30-100/month). The API approach is cheaper, and when YouTube changes its frontend, keeping up is the API provider's job rather than yours.
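
The arithmetic above can be written as a small estimator. This is only a sketch: `monthly_cost` is a hypothetical helper, and it assumes one request per video per day at the quoted $0.005, regardless of how many comments each request returns.

```python
def monthly_cost(videos_per_day: int, price_per_request: float = 0.005, days: int = 30) -> float:
    """Estimated spend in dollars, assuming one API request per video per day."""
    return round(videos_per_day * price_per_request * days, 2)


print(monthly_cost(50))          # 50 videos daily for 30 days
print(monthly_cost(20, days=1))  # a single 20-video channel sweep
```

Polling more often than once a day, or needing several requests per video, scales the estimate linearly.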

The pattern is simple: fetch structured comment data via API, run lightweight NLP for sentiment classification, store results in a time-series database, and alert on anomalies. No scraping, no proxies, no quota management.
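
The last two steps of that pattern, storing results and alerting on anomalies, might look like the sketch below. It uses SQLite from the standard library as a stand-in for a time-series database, and a z-score over each video's own history as one possible definition of "anomaly". The function name, table layout, seven-reading minimum, and threshold of 2.0 are all illustrative assumptions.

```python
import sqlite3
from statistics import mean, stdev


def record_and_check(conn, video_url, day, negative_pct, min_history=7, threshold=2.0):
    """Store one daily reading; return True if it spikes above the video's history."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sentiment "
        "(video_url TEXT, day TEXT, negative_pct REAL, PRIMARY KEY (video_url, day))"
    )
    # Earlier readings for this video; ISO dates (YYYY-MM-DD) sort correctly as text
    history = [
        row[0]
        for row in conn.execute(
            "SELECT negative_pct FROM sentiment WHERE video_url = ? AND day < ? ORDER BY day",
            (video_url, day),
        )
    ]
    conn.execute(
        "INSERT OR REPLACE INTO sentiment VALUES (?, ?, ?)",
        (video_url, day, negative_pct),
    )
    conn.commit()
    if len(history) < min_history:
        return False  # not enough data to judge
    spread = stdev(history)
    if spread == 0:
        return negative_pct > mean(history)  # flat history: any increase stands out
    # Alert when today's value sits more than `threshold` standard deviations above the mean
    return (negative_pct - mean(history)) / spread > threshold
```

A daily job could pass `sentiment["negative_pct"]` from `analyze_sentiment` for each video and send a notification whenever the function returns True.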
