Ecommerce Data API Quality Testing Guide
Test ecommerce API quality across five dimensions: freshness, coverage, accuracy, completeness, and latency. Automated test suite included.
Testing ecommerce data API quality means checking five dimensions: freshness (are prices current), coverage (does it return all major retailers), accuracy (do prices match the actual product page), completeness (are ratings, review counts, and availability included), and latency under load. Teams that skip this testing tend to discover data quality issues in production, when customers complain.
The five quality dimensions
- Freshness: prices should be no more than 24 hours old
- Coverage: results from at least Amazon, Walmart, Target, and Best Buy
- Accuracy: price in API matches price on product page
- Completeness: title, price, rating, reviews, availability, image URL
- Latency: 95th percentile under 2 seconds
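A single timed request cannot tell you a 95th percentile, so the latency target needs a set of samples. The following is a minimal sketch, assuming you have already collected latencies (in seconds) from repeated requests; the helper names `percentile` and `latency_passes` are illustrative and not part of any API.

```python
import math


def percentile(samples: list, pct: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # 1-based rank; multiply before dividing to keep the arithmetic exact
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def latency_passes(latencies: list, limit_seconds: float = 2.0) -> bool:
    """True when the 95th percentile latency is under the limit."""
    return percentile(latencies, 95) < limit_seconds
```

With 20 samples, one slow request still passes (it sits above the 95th percentile), while two slow requests fail. A larger sample gives a more stable estimate.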
Automated quality test suite
import os
import statistics
import time

import requests

SCAVIO_KEY = os.environ["SCAVIO_API_KEY"]
HEADERS = {"x-api-key": SCAVIO_KEY}


def test_shopping_quality(query: str) -> dict:
    """Run coverage, completeness, price-sanity, and latency checks on one query."""
    start = time.time()
    resp = requests.post(
        "https://api.scavio.dev/api/v1/search",
        headers=HEADERS,
        json={"query": query, "search_type": "shopping", "num_results": 20},
        timeout=30,  # avoid hanging forever on a stalled connection
    )
    latency = time.time() - start  # latency of this single request
    data = resp.json()
    results = data.get("shopping_results", [])

    # Coverage: count unique retailers, ignoring results with no source
    retailers = set()
    for r in results:
        domain = r.get("source", "").lower()
        if domain:
            retailers.add(domain)

    # Completeness: check required fields. This is a minimum set; extend it
    # with rating, review count, availability, and image fields using
    # whatever names your API returns.
    required_fields = ["title", "price", "link", "source"]
    complete = sum(
        1 for r in results if all(r.get(f) for f in required_fields)
    )
    completeness_rate = complete / len(results) if results else 0

    # Price sanity: flag outliers (< $1 or > 10x median)
    prices = []
    for r in results:
        price_str = str(r.get("price", "")).replace("$", "").replace(",", "")
        try:
            prices.append(float(price_str))
        except ValueError:
            pass  # skip prices that are missing or not plain dollar amounts
    median_price = statistics.median(prices) if prices else 0
    outliers = [
        p for p in prices
        if p < 1 or (median_price and p > 10 * median_price)
    ]

    return {
        "query": query,
        "result_count": len(results),
        "unique_retailers": len(retailers),
        "retailers": sorted(retailers),
        "completeness_rate": completeness_rate,
        "price_outliers": len(outliers),
        "latency_seconds": round(latency, 3),
        "pass": (
            len(results) >= 5
            and len(retailers) >= 3
            and completeness_rate >= 0.8
            and latency < 2.0
        ),
    }


# Run against test queries
test_queries = [
    "wireless earbuds",
    "running shoes men",
    "mechanical keyboard",
    "protein powder",
]

for q in test_queries:
    result = test_shopping_quality(q)
    status = "PASS" if result["pass"] else "FAIL"
    print(f"[{status}] {q}: {result['result_count']} results, "
          f"{result['unique_retailers']} retailers, "
          f"{result['latency_seconds']}s")

Price accuracy verification
import os

import requests


def verify_price_accuracy(api_result: dict, sample_size: int = 3) -> dict:
    """Spot-check API prices against actual product pages.

    api_result is the raw API response containing "shopping_results".
    """
    results = api_result.get("shopping_results", [])[:sample_size]
    checks = []
    for r in results:
        title = r.get("title", "")
        source = r.get("source", "")
        # Use search to find the current price on the retailer site.
        # The site: operator expects a domain, so this assumes "source"
        # holds one (e.g. "walmart.com") rather than a display name.
        verification = requests.post(
            "https://api.scavio.dev/api/v1/search",
            headers={"x-api-key": os.environ["SCAVIO_API_KEY"]},
            json={"query": f"{title} price site:{source}",
                  "num_results": 1},
            timeout=30,
        ).json()
        # organic_results may be missing or empty; fall back to an empty dict
        organic = verification.get("organic_results") or [{}]
        checks.append({
            "product": title[:50],
            "api_price": r.get("price"),
            "source": source,
            "verification_snippet": organic[0].get("snippet", ""),
        })
    return {"checks": checks, "sample_size": sample_size}

Continuous monitoring setup
Run the quality tests daily via cron and track the metrics over time in a simple JSON log. Alert when the completeness rate drops below 80%, latency exceeds 3 seconds, or the result count drops below 5 for any test query. Budget about 20 API calls per day (4 test queries x 5 quality checks), which comes to roughly $0.10/day with Scavio at $0.005 per credit, assuming one credit per call.
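The logging and alerting described above can be sketched as follows. This is one possible shape, not a prescribed setup: it assumes each result is a dict with `result_count`, `completeness_rate`, and `latency_seconds` keys (as returned by the test suite), and the function names, the `quality_log.jsonl` file name, and the JSON Lines layout are illustrative choices.

```python
import json
import time
from pathlib import Path

# Alert thresholds taken from the monitoring rules above
MIN_COMPLETENESS = 0.80
MAX_LATENCY_SECONDS = 3.0
MIN_RESULT_COUNT = 5


def check_alerts(result: dict) -> list:
    """Return a list of alert messages for one quality-test result."""
    alerts = []
    if result["completeness_rate"] < MIN_COMPLETENESS:
        alerts.append(
            f"completeness rate {result['completeness_rate']:.0%} is below 80%"
        )
    if result["latency_seconds"] > MAX_LATENCY_SECONDS:
        alerts.append(f"latency {result['latency_seconds']}s exceeds 3s")
    if result["result_count"] < MIN_RESULT_COUNT:
        alerts.append(f"only {result['result_count']} results returned")
    return alerts


def append_to_log(result: dict, log_path: str = "quality_log.jsonl") -> None:
    """Append one timestamped result to the log as a single JSON line."""
    record = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        **result,
    }
    with Path(log_path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```

A daily cron entry such as `0 6 * * * python /path/to/run_quality_checks.py` (a hypothetical script that calls the test suite, then `append_to_log` and `check_alerts` for each query) would be enough to drive it. One JSON object per line keeps the log appendable and easy to load for trend analysis later.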
Comparing ecommerce API providers
- Keepa (Amazon only): $19/mo for 100K tokens, deep price history
- DataForSEO Shopping: $0.002/query live, multi-retailer
- Scavio Shopping: $0.005/credit, Google Shopping results
- SerpAPI Shopping: $0.015/search, Google Shopping only
- Helium 10 (Amazon FBA): $39/mo Starter, proprietary metrics
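To put the per-query prices on a common footing, the arithmetic can be sketched as below. It uses only the per-query list prices quoted above and assumes one billable unit per call (for Scavio, one credit per search); Keepa and Helium 10 are left out because they are flat subscriptions. The `monthly_cost` helper is illustrative.

```python
# Per-query list prices quoted in the comparison above (USD)
PRICE_PER_QUERY = {
    "DataForSEO Shopping": 0.002,
    "Scavio Shopping": 0.005,
    "SerpAPI Shopping": 0.015,
}


def monthly_cost(calls_per_day: int, days: int = 30) -> dict:
    """Estimated cost per provider, assuming one billable unit per call."""
    return {
        name: round(price * calls_per_day * days, 2)
        for name, price in PRICE_PER_QUERY.items()
    }


# 20 monitoring calls per day over a 30-day month
print(monthly_cost(20))
```

At monitoring volumes the differences are a few dollars a month, so data quality is likely to matter more than unit price for this workload.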
Key takeaway
Test your ecommerce data API before building features on it. A short automated quality test suite catches data issues before they reach production. Run it daily, track trends, and switch providers when quality drops below your threshold.