YouTube Comment Sentiment Without Scraping
YouTube blocks scrapers aggressively. A search API returns comments as structured JSON instead, and this guide includes a Python sentiment analysis pipeline built on it.
YouTube aggressively blocks comment scrapers in 2026, and the official Data API caps default usage at 10,000 quota units per day (about 100 search.list calls at 100 units each). A search API that returns YouTube comments as structured JSON sidesteps both problems: there is no scraping infrastructure to maintain and no Google API credentials or quota to manage. Here is how to build a sentiment pipeline on top of it.
Why scraping YouTube comments fails at scale
YouTube renders comments via JavaScript with anti-bot protections. Headless browsers get rate-limited after a few hundred requests. Residential proxies add $50-200/month in cost. The official API requires a Google Cloud project and API credentials, enforces daily quotas, and returns paginated, nested reply threads that take extra code to flatten. TubeMine proved you can build an extractor in 2.5 hours, but maintaining it against YouTube changes is the real cost.
The API approach: structured comments as JSON
import os

import requests


def get_video_comments(video_url: str, num_results: int = 50) -> list:
    """Fetch YouTube comments as structured JSON via search API."""
    resp = requests.post(
        "https://api.scavio.dev/api/v1/search",
        headers={"x-api-key": os.environ["SCAVIO_API_KEY"]},
        json={
            "query": video_url,
            "platform": "youtube",
            "num_results": num_results,
        },
        timeout=15,
    )
    # Fail loudly on HTTP errors instead of parsing an error body
    resp.raise_for_status()
    data = resp.json()
    return data.get("comments", [])


# Each comment object: {author, text, likes, published_at, reply_count}
comments = get_video_comments("https://youtube.com/watch?v=example123")
print(f"Fetched {len(comments)} comments")

Building the sentiment pipeline
Once you have comments as structured data, sentiment analysis is straightforward. Use a lightweight model (TextBlob for speed, or a small transformer for accuracy) to classify each comment. Aggregate by video, channel, or time period.
from textblob import TextBlob


def analyze_sentiment(comments: list) -> dict:
    """Classify comments into positive/negative/neutral with scores."""
    results = {"positive": [], "negative": [], "neutral": []}
    scores = []
    for comment in comments:
        text = comment.get("text", "")
        if not text:
            continue
        blob = TextBlob(text)
        polarity = blob.sentiment.polarity  # float in [-1.0, 1.0]
        scores.append(polarity)
        entry = {
            "text": text,
            "author": comment.get("author", ""),
            "likes": comment.get("likes", 0),
            "polarity": polarity,
        }
        # Polarity between -0.1 and 0.1 counts as neutral
        if polarity > 0.1:
            results["positive"].append(entry)
        elif polarity < -0.1:
            results["negative"].append(entry)
        else:
            results["neutral"].append(entry)
    total = len(scores)
    return {
        "total_comments": total,
        "positive_pct": len(results["positive"]) / total * 100 if total else 0,
        "negative_pct": len(results["negative"]) / total * 100 if total else 0,
        "neutral_pct": len(results["neutral"]) / total * 100 if total else 0,
        "avg_polarity": sum(scores) / total if total else 0,
        "most_negative": sorted(results["negative"], key=lambda x: x["polarity"])[:5],
        "most_positive": sorted(results["positive"], key=lambda x: x["polarity"], reverse=True)[:5],
    }


comments = get_video_comments("https://youtube.com/watch?v=example123")
sentiment = analyze_sentiment(comments)
print(f"Sentiment: {sentiment['positive_pct']:.0f}% positive, "
      f"{sentiment['negative_pct']:.0f}% negative")

Scaling to channel-level monitoring
def channel_sentiment_monitor(channel_videos: list[str]) -> dict:
    """Monitor sentiment across an entire channel's recent videos."""
    channel_data = {}
    for video_url in channel_videos:
        comments = get_video_comments(video_url, num_results=30)
        sentiment = analyze_sentiment(comments)
        channel_data[video_url] = sentiment
    if not channel_data:
        # Nothing to aggregate; avoids dividing by zero and min()/max() on empty input
        return {"videos_analyzed": 0}
    # Aggregate channel-level metrics
    all_polarities = [v["avg_polarity"] for v in channel_data.values()]
    return {
        "videos_analyzed": len(channel_videos),
        "channel_avg_polarity": sum(all_polarities) / len(all_polarities),
        # Lowest average polarity, used here as a rough proxy for controversy
        "most_controversial": min(channel_data.items(), key=lambda x: x[1]["avg_polarity"]),
        "most_loved": max(channel_data.items(), key=lambda x: x[1]["avg_polarity"]),
    }


# Monitor 20 videos: 20 queries * $0.005 = $0.10
channel_report = channel_sentiment_monitor([
    "https://youtube.com/watch?v=video1",
    "https://youtube.com/watch?v=video2",
    # ... 20 videos
])

Use cases for YouTube comment sentiment
- Brand monitoring: track sentiment on videos mentioning your product
- Content strategy: identify which video topics generate positive engagement
- Competitor analysis: compare audience sentiment across competing channels
- Crisis detection: alert when negative sentiment spikes on your channel
- Product feedback: extract feature requests and complaints from tutorial comments
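Several of these use cases, crisis detection in particular, depend on tracking sentiment over time rather than as a single snapshot. The sketch below is one possible way to do that, not part of the pipeline above: it groups scored comments by calendar week and averages their polarity. The helper name weekly_polarity is hypothetical, and it assumes each scored entry also carries the comment's published_at field as an ISO 8601 string (the timestamp format is an assumption about the API response).

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def weekly_polarity(entries: list[dict]) -> dict[str, float]:
    """Average polarity per week, keyed by the Monday that starts the week.

    Assumes each entry has a "polarity" float and a "published_at"
    ISO 8601 timestamp (the timestamp format is an assumption).
    """
    buckets = defaultdict(list)
    for entry in entries:
        raw = entry.get("published_at")
        if not raw:
            continue  # skip comments without a timestamp
        # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11
        ts = datetime.fromisoformat(raw.replace("Z", "+00:00"))
        # Step back to Monday so every comment in a week shares one key
        week_start = (ts - timedelta(days=ts.weekday())).date().isoformat()
        buckets[week_start].append(entry["polarity"])
    return {week: mean(values) for week, values in sorted(buckets.items())}
```

A falling weekly average is a simple signal to review the most negative comments for that period.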
Cost breakdown
At $0.005 per API request, monitoring 50 videos daily with 30 comments each costs $0.25/day or $7.50/month, assuming one request per video and a 30-day month. Compare this to maintaining scraping infrastructure with proxies ($50-200/month) or paying for a YouTube analytics tool ($30-100/month). The API approach is cheaper and requires zero maintenance when YouTube changes its frontend.
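To adapt these numbers to a different workload, the arithmetic can be wrapped in a small helper. This is a minimal sketch: the function name monthly_cost and its parameters are illustrative, and it assumes one request per video, the $0.005 price quoted above, and a 30-day month.

```python
def monthly_cost(
    videos_per_day: int,
    price_per_request: float = 0.005,  # price quoted in this article
    requests_per_video: int = 1,       # assumption: one request fetches all comments needed
    days: int = 30,
) -> float:
    """Estimate API spend in dollars over the given number of days."""
    total = videos_per_day * requests_per_video * price_per_request * days
    return round(total, 2)


print(monthly_cost(50))  # 7.5, matching the $7.50/month figure
```

If a video needs several requests to cover its comments, raise requests_per_video and the estimate scales linearly.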
The pattern is simple: fetch structured comment data via API, run lightweight NLP for sentiment classification, store results in a time-series database, and alert on anomalies. No scraping, no proxies, no quota management.
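The storage and alerting steps can be sketched as follows. This is only an illustration under stated assumptions: SQLite stands in for a time-series database, the table layout and the function names open_store, record_score, and is_anomalous are hypothetical, and the alert rule (latest score more than two standard deviations below the historical mean) is one simple choice among many.

```python
import sqlite3
from statistics import mean, stdev


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (hypothetical) sentiment table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sentiment ("
        "video_url TEXT, day TEXT, avg_polarity REAL, "
        "PRIMARY KEY (video_url, day))"
    )
    return conn


def record_score(conn: sqlite3.Connection, video_url: str, day: str,
                 avg_polarity: float) -> None:
    """Store one day's average polarity; re-running a day overwrites it."""
    conn.execute(
        "INSERT OR REPLACE INTO sentiment VALUES (?, ?, ?)",
        (video_url, day, avg_polarity),
    )
    conn.commit()


def is_anomalous(conn: sqlite3.Connection, video_url: str,
                 threshold: float = 2.0) -> bool:
    """Flag the latest day if it falls far below the video's own history."""
    rows = conn.execute(
        "SELECT avg_polarity FROM sentiment WHERE video_url = ? ORDER BY day",
        (video_url,),
    ).fetchall()
    scores = [row[0] for row in rows]
    if len(scores) < 4:
        return False  # too little history to judge
    *history, latest = scores
    spread = stdev(history)
    if spread == 0:
        return latest < mean(history)
    return latest < mean(history) - threshold * spread
```

Days are stored as ISO date strings such as 2026-01-05 so that ORDER BY day sorts them chronologically.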