Amazon product reviews are a rich source of customer sentiment data, product improvement signals, and competitive intelligence. Scraping Amazon reviews directly is unreliable due to aggressive bot detection and CAPTCHA challenges. The Scavio API provides a reviews endpoint that returns structured review data for any ASIN, including star rating, review title, review body, reviewer name, date, and verified purchase status. This tutorial demonstrates how to fetch reviews, filter by rating, and prepare data for sentiment analysis.
Prerequisites
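- Python 3.9 or higher (the examples use built-in generic type hints such as list[dict], which fail at import time on 3.8)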
- requests library installed
- A Scavio API key
- An Amazon ASIN to fetch reviews for
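One way to make the API key prerequisite explicit is to read the key from the environment and stop early when it is missing, rather than sending a placeholder value. The sketch below assumes the key is stored in an environment variable named SCAVIO_API_KEY, the name used by the full examples in this tutorial; load_api_key is a hypothetical helper name, not part of any Scavio SDK.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the key stored in SCAVIO_API_KEY, or raise if it is unset.

    load_api_key is a hypothetical helper written for this tutorial.
    """
    # Default to the real process environment; a dict can be passed for testing.
    source = os.environ if env is None else env
    key = source.get("SCAVIO_API_KEY", "").strip()
    if not key:
        raise RuntimeError("Set the SCAVIO_API_KEY environment variable first")
    return key
```

Calling load_api_key() once at startup surfaces a missing key as a clear local error instead of a rejected request.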
Walkthrough
Step 1: Fetch reviews for a product ASIN
Send a POST request to the Scavio search endpoint with platform set to amazon and the product ASIN as the query. The response includes a reviews array.
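import os

import requests

# Read the key from the environment rather than hard-coding it.
API_KEY = os.environ.get("SCAVIO_API_KEY", "your_scavio_api_key")

def get_reviews(asin: str) -> list[dict]:
    response = requests.post(
        "https://api.scavio.dev/api/v1/search",
        headers={"x-api-key": API_KEY},
        json={"platform": "amazon", "query": asin, "marketplace": "US"},
        timeout=30,
    )
    # Raise on 4xx/5xx responses instead of parsing an error body.
    response.raise_for_status()
    return response.json().get("reviews", [])

reviews = get_reviews("B09G9FPHY6")

Step 2: Filter by star rating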
Segment reviews by rating to analyze positive and negative feedback separately.
def filter_by_stars(reviews: list[dict], stars: int) -> list[dict]:
    return [r for r in reviews if r.get("rating") == stars]

negative = filter_by_stars(reviews, 1) + filter_by_stars(reviews, 2)
positive = filter_by_stars(reviews, 4) + filter_by_stars(reviews, 5)

Step 3: Extract review text for NLP
Build a list of review bodies for input to a sentiment analysis model or topic extraction pipeline.
texts = [r["body"] for r in reviews if r.get("body")]
print(f"Collected {len(texts)} review texts for NLP")
# Preview the first review only if any text came back.
if texts:
    print(texts[0][:200])

Step 4: Compute rating distribution
Count reviews by star rating to understand the overall sentiment distribution for the product.
from collections import Counter

# Skip reviews with no rating so sorting never compares None with int.
distribution = Counter(r["rating"] for r in reviews if r.get("rating") is not None)
for stars in sorted(distribution):
    print(f"{stars} star: {distribution[stars]} reviews")

Python Example
import os
from collections import Counter

import requests

API_KEY = os.environ.get("SCAVIO_API_KEY", "your_scavio_api_key")
ENDPOINT = "https://api.scavio.dev/api/v1/search"

def get_reviews(asin: str) -> list[dict]:
    r = requests.post(
        ENDPOINT,
        headers={"x-api-key": API_KEY},
        json={"platform": "amazon", "query": asin, "marketplace": "US"},
        timeout=30,
    )
    r.raise_for_status()
    return r.json().get("reviews", [])

def summarize(reviews: list[dict]) -> None:
    # Ignore reviews without a rating so the sort never mixes None and int.
    dist = Counter(r["rating"] for r in reviews if r.get("rating") is not None)
    for stars in sorted(dist, reverse=True):
        print(f"{stars}*: {dist[stars]} reviews")
    texts = [r["body"] for r in reviews if r.get("body")]
    print(f"\n{len(texts)} reviews with text available for NLP")

if __name__ == "__main__":
    reviews = get_reviews("B09G9FPHY6")
    summarize(reviews)

JavaScript Example
const API_KEY = process.env.SCAVIO_API_KEY || "your_scavio_api_key";
const ENDPOINT = "https://api.scavio.dev/api/v1/search";

async function getReviews(asin) {
  const res = await fetch(ENDPOINT, {
    method: "POST",
    headers: { "x-api-key": API_KEY, "Content-Type": "application/json" },
    body: JSON.stringify({ platform: "amazon", query: asin, marketplace: "US" })
  });
  // Fail loudly on HTTP errors, mirroring raise_for_status() in the Python version.
  if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
  const data = await res.json();
  return data.reviews || [];
}

async function main() {
  const reviews = await getReviews("B09G9FPHY6");
  const dist = reviews.reduce((acc, r) => {
    acc[r.rating] = (acc[r.rating] || 0) + 1;
    return acc;
  }, {});
  // Sort numerically by star rating, highest first.
  Object.entries(dist)
    .sort(([a], [b]) => Number(b) - Number(a))
    .forEach(([s, c]) => console.log(`${s}*: ${c}`));
}

main().catch(console.error);

Expected Output
The API response is JSON containing a reviews array, for example:
{
  "reviews": [
    {
      "title": "Great sound quality",
      "body": "The bass response is excellent and the ANC works well...",
      "rating": 5,
      "reviewer": "John D.",
      "date": "2026-02-14",
      "verified_purchase": true
    },
    {
      "title": "Good but battery life could be better",
      "body": "I love the comfort and sound but 20 hours isn't enough...",
      "rating": 3,
      "reviewer": "Sarah M.",
      "date": "2026-01-28",
      "verified_purchase": true
    }
  ]
}
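To connect this response shape to the sentiment-analysis goal, here is a minimal sketch that turns review dictionaries into labeled text rows. It assumes the field names shown in the sample response (title, body, rating, verified_purchase) and integer ratings. The label thresholds (1-2 negative, 3 neutral, 4-5 positive), the verified-only filter, and the helper names to_label and prepare_rows are illustrative choices, not part of the Scavio API.

```python
def to_label(rating: int) -> str:
    """Map a star rating to a coarse sentiment label (assumed thresholds)."""
    if rating <= 2:
        return "negative"
    if rating == 3:
        return "neutral"
    return "positive"


def prepare_rows(reviews: list[dict], verified_only: bool = True) -> list[dict]:
    """Build {"text", "label"} rows from reviews that have a body and a rating."""
    rows = []
    for review in reviews:
        body = (review.get("body") or "").strip()
        rating = review.get("rating")
        if not body or rating is None:
            continue  # no text to analyze, or no rating to derive a label from
        if verified_only and not review.get("verified_purchase"):
            continue  # optionally drop unverified purchases
        title = (review.get("title") or "").strip()
        # Keep the title together with the body so short summaries are not lost.
        text = f"{title}. {body}" if title else body
        rows.append({"text": text, "label": to_label(rating)})
    return rows


if __name__ == "__main__":
    # Hand-written sample data in the same shape as the response above.
    sample = [
        {"title": "Great sound quality", "body": "The bass response is excellent.",
         "rating": 5, "verified_purchase": True},
        {"title": "Good but battery life could be better", "body": "20 hours isn't enough.",
         "rating": 3, "verified_purchase": True},
        {"title": "", "body": "Stopped working.", "rating": 1, "verified_purchase": False},
    ]
    for row in prepare_rows(sample):
        print(row["label"], "|", row["text"])
```

The rating-derived labels are only a rough proxy for sentiment; they can serve as a baseline to compare against whatever model the review text is eventually fed into.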