AI content without live data is slop. Articles about 'best APIs' with fabricated pricing, and product comparisons with imaginary features, fail because the LLM invented the details. This tutorial builds a content pipeline that fetches real data before generating text: current prices from search results, user opinions from Reddit, and product details from Amazon. The output is far easier to fact-check because its facts come from live sources, and the final step checks every price in the article against the fetched data.
Prerequisites
- Python 3.10+
- requests library installed
- A Scavio API key from scavio.dev
- An OpenAI API key for content generation
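Both keys are read from environment variables in the steps that follow, so a missing one surfaces as a bare `KeyError`. A minimal sketch of an up-front check, assuming a hypothetical helper named `missing_keys` (it is not part of either API):

```python
import os

# Environment variable names used throughout this tutorial.
REQUIRED_KEYS = ("SCAVIO_API_KEY", "OPENAI_API_KEY")

def missing_keys(env=None, required=REQUIRED_KEYS):
    """Return the required variable names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]

if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Set these environment variables first: " + ", ".join(missing))
    else:
        print("Both API keys are set.")
```

Running this before the walkthrough gives a readable message instead of a traceback when a key has not been exported.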
Walkthrough
Step 1: Build the multi-source data fetcher
Pull data from Google, Reddit, and Amazon via Scavio.
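All three sources go through the same search endpoint in this tutorial; only the request body changes. A minimal sketch that isolates that difference, using a hypothetical helper `build_payload`. The field names are copied from this tutorial's own requests and have not been checked against Scavio's documentation.

```python
def build_payload(query, platform="google"):
    """Return the JSON body for one search request (hypothetical helper)."""
    if platform == "google":
        # Plain web search: no platform field, only a country code.
        return {"query": query, "country_code": "us"}
    if platform == "reddit":
        return {"query": query, "platform": "reddit", "country_code": "us"}
    if platform == "amazon":
        # The Amazon request sends a marketplace instead of a country code.
        return {"query": query, "platform": "amazon", "marketplace": "US"}
    raise ValueError(f"unsupported platform: {platform}")

print(build_payload("noise canceling headphones", "amazon"))
```

Keeping the bodies in one place makes it harder for the Reddit and Amazon requests to drift apart as parameters are added.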
import os
import requests

# API keys come from the environment so they never live in source code.
SK = os.environ['SCAVIO_API_KEY']
OK = os.environ['OPENAI_API_KEY']
SH = {'x-api-key': SK, 'Content-Type': 'application/json'}
SCAVIO_URL = 'https://api.scavio.dev/api/v1/search'

def scavio_search(payload):
    # Shared POST helper: fail fast on timeouts and HTTP errors.
    resp = requests.post(SCAVIO_URL, headers=SH, json=payload, timeout=30)
    resp.raise_for_status()
    return resp.json()

def fetch_google(q):
    # Top five organic web results.
    data = scavio_search({'query': q, 'country_code': 'us'})
    return [{'title': r.get('title', ''), 'snippet': r.get('snippet', ''), 'url': r.get('link', '')}
            for r in data.get('organic_results', [])[:5]]

def fetch_reddit(q):
    # Top five Reddit results for the same query.
    data = scavio_search({'query': q, 'platform': 'reddit', 'country_code': 'us'})
    return [{'title': r.get('title', ''), 'snippet': r.get('snippet', '')}
            for r in data.get('organic_results', [])[:5]]

def fetch_products(q):
    # Top five Amazon listings with price and rating.
    data = scavio_search({'query': q, 'platform': 'amazon', 'marketplace': 'US'})
    return [{'title': p.get('title', ''), 'price': p.get('price', 'N/A'), 'rating': p.get('rating', '')}
            for p in data.get('products', [])[:5]]

Step 2: Assemble a research brief
Compile data from all sources into a structured brief for the LLM.
def research(topic, product_query=None):
    # Build one plain-text brief with a labeled section per source.
    brief = f'Topic: {topic}\n\n=== Google ===\n'
    google_results = fetch_google(topic)
    brief += '\n'.join(f"- {r['title']}: {r['snippet']}" for r in google_results)
    brief += '\n\n=== Reddit ===\n'
    reddit_results = fetch_reddit(topic)
    brief += '\n'.join(f"- {d['title']}: {d['snippet']}" for d in reddit_results)
    credits = 2  # one request each for Google and Reddit
    if product_query:
        products = fetch_products(product_query)
        brief += '\n\n=== Amazon Products ===\n'
        brief += '\n'.join(f"- {x['title']}: {x['price']} ({x['rating']})" for x in products)
        credits += 1  # the Amazon lookup is a third request
    # Cost estimate at $0.005 per request, the rate this tutorial uses.
    print(f'Research cost: ${credits * 0.005:.3f}')
    return brief

Step 3: Generate grounded content
Pass the brief to the LLM with strict instructions to only use provided data.
def generate(topic, brief):
    # The system prompt restricts the model to facts present in the brief.
    resp = requests.post('https://api.openai.com/v1/chat/completions',
        headers={'Authorization': f'Bearer {OK}', 'Content-Type': 'application/json'},
        json={'model': 'gpt-4o', 'temperature': 0.3, 'messages': [
            {'role': 'system', 'content': 'Write based ONLY on the research brief. No fabricated stats. '
                'If data is missing, say so. Cite Reddit as "users report". Start with a direct answer.'},
            {'role': 'user', 'content': f'Write 600 words about: {topic}\n\n{brief}'}]},
        timeout=120)
    # Surface auth or rate-limit errors here instead of a KeyError on the next line.
    resp.raise_for_status()
    return resp.json()['choices'][0]['message']['content']

brief = research('best noise canceling headphones 2026', 'noise canceling headphones')
article = generate('best noise canceling headphones 2026', brief)
print(article[:300])

Step 4: Validate prices in generated content
Check that dollar amounts in the article appear in source data.
import re

def validate(content, brief):
    # Flag any dollar amount in the article that does not appear verbatim in the brief.
    source = brief.lower()
    # Match "$298" or "$1,299.99" without swallowing trailing punctuation.
    prices = re.findall(r'\$\d+(?:,\d{3})*(?:\.\d+)?', content)
    issues = [p for p in prices if p not in source]
    if issues:
        print(f'WARNING: {len(issues)} unverified prices: {issues}')
    else:
        print('All prices verified against source data.')
    return issues

validate(article, brief)

Python Example
import os
import requests

SK = os.environ['SCAVIO_API_KEY']
SH = {'x-api-key': SK, 'Content-Type': 'application/json'}

def research(topic):
    # Two requests: one Google search and one Reddit search, top three results each.
    g = requests.post('https://api.scavio.dev/api/v1/search',
        headers=SH, json={'query': topic, 'country_code': 'us'},
        timeout=30).json().get('organic_results', [])[:3]
    r = requests.post('https://api.scavio.dev/api/v1/search',
        headers=SH, json={'query': topic, 'platform': 'reddit', 'country_code': 'us'},
        timeout=30).json().get('organic_results', [])[:3]
    print(f'{len(g)} Google + {len(r)} Reddit results. Cost: $0.010')
    return g, r

research('best serp api 2026')

JavaScript Example
const SK = process.env.SCAVIO_API_KEY;
const SH = { 'x-api-key': SK, 'Content-Type': 'application/json' };

async function research(topic) {
  const [g, r] = await Promise.all([
    fetch('https://api.scavio.dev/api/v1/search', {
      method: 'POST', headers: SH,
      body: JSON.stringify({ query: topic, country_code: 'us' })
    }).then(res => res.json()),
    fetch('https://api.scavio.dev/api/v1/search', {
      method: 'POST', headers: SH,
      body: JSON.stringify({ query: topic, platform: 'reddit', country_code: 'us' })
    }).then(res => res.json()),
  ]);
  console.log(`${(g.organic_results||[]).length}G + ${(r.organic_results||[]).length}R. Cost: $0.010`);
}

research('best serp api 2026').catch(console.error);

Expected Output
Research cost: $0.015
The Sony WH-1000XM5 remains the top noise canceling headphone
in 2026, priced at $298 on Amazon with a 4.6 rating. Users on
Reddit report the XM5 noise cancellation outperforms Bose QC
Ultra in airplane environments...
All prices verified against source data.
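
The Step 4 check is a substring test, so `$29` in an article counts as verified when the brief only contains `$298`, and `$298.00` fails against `$298`. A minimal sketch of a stricter comparison by numeric value, using hypothetical helpers `extract_prices` and `unverified_prices` and assuming every price in the brief carries a leading `$`. The sample strings are made up for illustration.

```python
import re
from decimal import Decimal

# Matches "$298", "$1,299", and "$1,299.99" without trailing punctuation.
PRICE_RE = re.compile(r"\$(\d+(?:,\d{3})*(?:\.\d+)?)")

def extract_prices(text):
    """Return the set of dollar amounts found in text as Decimal values."""
    return {Decimal(m.replace(",", "")) for m in PRICE_RE.findall(text)}

def unverified_prices(content, brief):
    """Return article prices whose numeric value never appears in the brief."""
    return sorted(extract_prices(content) - extract_prices(brief))

# Made-up sample data, not real listings.
sample_brief = "- Example Headphones: $298.00 (4.6)"
sample_article = "They cost $298 and ship with a $29 case."
print(unverified_prices(sample_article, sample_brief))  # prints [Decimal('29')]
```

Comparing `Decimal` values rather than strings treats `$298` and `$298.00` as the same price while still flagging `$29`, which the substring test would let through.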