Improve the research accuracy of AI agents by implementing multi-query expansion, cross-referencing of results across sources, confidence scoring based on source agreement, and injecting only high-confidence data into the agent's context. Agents that search once and use the first result frequently produce inaccurate or incomplete research. A multi-query approach explores the same topic from different angles, compares the results, and surfaces only the information that appears consistently across sources. This reduces hallucinations that stem from relying on a single source.
Prerequisites
- Python 3.8+ installed
- The requests library installed
- A Scavio API key from scavio.dev
- An AI agent that performs research tasks
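A quick way to confirm the first three prerequisites before starting is sketched below. It is only a convenience check, not part of the pipeline: the helper name `missing_prerequisites` is hypothetical, and the environment variable name `SCAVIO_API_KEY` is the one the walkthrough code reads.

```python
import importlib.util
import os
import sys


def missing_prerequisites(env=None) -> list:
    """Return human-readable problems; an empty list means ready to go.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    problems = []
    if sys.version_info < (3, 8):
        problems.append('Python 3.8+ is required')
    # find_spec returns None when the package cannot be imported
    if importlib.util.find_spec('requests') is None:
        problems.append('the requests library is not installed')
    if not env.get('SCAVIO_API_KEY'):
        problems.append('the SCAVIO_API_KEY environment variable is not set')
    return problems


for problem in missing_prerequisites():
    print('Missing:', problem)
```

If nothing is printed, the environment should be ready for the steps that follow.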
Walkthrough

Step 1: Expand the query into multiple angles
Generate several search queries that approach the same research topic from different angles.
import os
import time

import requests

API_KEY = os.environ['SCAVIO_API_KEY']

def expand_queries(topic: str) -> list:
    """Generate multiple search angles for a research topic."""
    expansions = [
        topic,
        f'{topic} overview',
        f'{topic} comparison 2026',
        f'{topic} pros cons',
        f'{topic} alternatives',
    ]
    return expansions

def search(query: str) -> list:
    """Run one search and return its organic results (empty list if none)."""
    resp = requests.post('https://api.scavio.dev/api/v1/search',
                         headers={'x-api-key': API_KEY},
                         json={'platform': 'google', 'query': query}, timeout=15)
    resp.raise_for_status()  # surface HTTP errors instead of parsing an error body
    return resp.json().get('organic_results', [])

queries = expand_queries('vector database')
print(f'Expanded to {len(queries)} queries:')
for q in queries:
    print(f' - {q}')

Step 2: Search all angles
Run all the expanded queries and collect the results.
def multi_search(topic: str) -> dict:
    """Search every expanded query and keep the top 5 results for each."""
    queries = expand_queries(topic)
    all_results = {}
    for query in queries:
        results = search(query)
        all_results[query] = results[:5]
        time.sleep(0.3)  # short pause between requests
    total = sum(len(r) for r in all_results.values())
    print(f'Searched {len(queries)} queries, got {total} total results')
    return all_results

results = multi_search('vector database')
for query, items in results.items():
    print(f' {query[:40]}: {len(items)} results')

Step 3: Cross-reference sources
Find information that appears consistently across multiple search results.
from collections import Counter
from urllib.parse import urlparse

def cross_reference(all_results: dict) -> list:
    """Return sources whose domain shows up at least twice across all results."""
    domain_counts = Counter()
    domain_snippets = {}
    for query, results in all_results.items():
        for r in results:
            domain = urlparse(r.get('link', '')).netloc
            if domain:
                domain_counts[domain] += 1
                # Keep the first title/snippet/URL seen for each domain
                if domain not in domain_snippets:
                    domain_snippets[domain] = {
                        'title': r.get('title', ''),
                        'snippet': r.get('snippet', ''),
                        'url': r.get('link', ''),
                    }
    # Sources appearing in multiple queries are more reliable
    reliable = []
    for domain, count in domain_counts.most_common():
        if count >= 2:  # Appeared at least twice across all query results
            info = domain_snippets[domain]
            info['cross_ref_count'] = count
            info['domain'] = domain
            reliable.append(info)
    return reliable

reliable = cross_reference(results)
print(f'Cross-referenced sources: {len(reliable)}')
for r in reliable[:3]:
    print(f' [{r["cross_ref_count"]}x] {r["domain"]}: {r["title"][:50]}')

Step 4: Score confidence
Assign confidence scores to the research results based on source agreement and source quality.
def score_confidence(reliable_sources: list, all_results: dict) -> list:
    """Score each source from 0 to 100 and sort by descending confidence.

    all_results is kept in the signature but is not used by this heuristic.
    """
    scored = []
    for source in reliable_sources:
        score = 0
        # Cross-reference score (max 40)
        score += min(source['cross_ref_count'] * 10, 40)
        # Snippet quality (max 20), judged by length only
        if len(source.get('snippet', '')) > 100:
            score += 20
        elif len(source.get('snippet', '')) > 50:
            score += 10
        # Domain authority heuristic (max 20), a simple substring match
        trusted = ['wikipedia', 'github', 'stackoverflow', 'docs.']
        if any(t in source.get('domain', '').lower() for t in trusted):
            score += 20
        # Title present (max 20); this does not measure relevance
        if source.get('title'):
            score += 20
        source['confidence'] = min(score, 100)
        scored.append(source)
    scored.sort(key=lambda x: x['confidence'], reverse=True)
    return scored

scored = score_confidence(reliable, results)
for s in scored[:5]:
    print(f' [{s["confidence"]}%] {s["title"][:50]}')

Step 5: Build a grounded research context
Format the high-confidence results into a context block for the agent.
def research_context(topic: str, min_confidence: int = 40) -> str:
    """Run the full pipeline and format the top sources as agent context."""
    all_results = multi_search(topic)
    reliable = cross_reference(all_results)
    scored = score_confidence(reliable, all_results)
    high_conf = [s for s in scored if s['confidence'] >= min_confidence]
    if not high_conf:
        return f'No high-confidence sources found for "{topic}".'
    parts = [
        f'RESEARCH CONTEXT: {topic}',
        f'Sources: {len(high_conf)} (confidence >= {min_confidence}%)',
        '',
    ]
    for s in high_conf[:5]:
        parts.append(f'[{s["confidence"]}% confidence] {s["title"]}')
        parts.append(f' Source: {s["domain"]}')
        if s.get('snippet'):
            parts.append(f' Summary: {s["snippet"][:200]}')
        parts.append('')
    return '\n'.join(parts)

context = research_context('vector database')
print(context[:500])

Python Example
import os

import requests

H = {'x-api-key': os.environ['SCAVIO_API_KEY']}

def research(topic):
    """Collect the first 100 characters of the top 3 snippets per query."""
    queries = [topic, f'{topic} comparison', f'{topic} overview']
    all_snippets = []
    for q in queries:
        data = requests.post('https://api.scavio.dev/api/v1/search', headers=H,
                             json={'platform': 'google', 'query': q},
                             timeout=15).json()
        for r in data.get('organic_results', [])[:3]:
            all_snippets.append(r.get('snippet', '')[:100])
    return all_snippets

print(research('vector database'))

JavaScript Example
const H = {'x-api-key': process.env.SCAVIO_API_KEY, 'Content-Type': 'application/json'};

async function research(topic) {
  const queries = [topic, `${topic} comparison`, `${topic} overview`];
  const snippets = [];
  for (const q of queries) {
    const res = await fetch('https://api.scavio.dev/api/v1/search', {
      method: 'POST', headers: H,
      body: JSON.stringify({platform: 'google', query: q})
    });
    const results = (await res.json()).organic_results || [];
    // Keep the first 100 characters of the top 3 snippets per query
    results.slice(0, 3).forEach(item => snippets.push((item.snippet || '').slice(0, 100)));
  }
  return snippets;
}

research('vector database').then(console.log);

Expected Output
A multi-query research pipeline that expands topics into multiple search angles, cross-references sources, scores confidence, and outputs only high-confidence grounded research.
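
To see the cross-referencing and confidence-scoring logic at work without calling the API, one option is to run it on hand-written sample data. The sketch below is self-contained and makes two assumptions worth stating: the entries in `SAMPLE_RESULTS` are invented placeholders shaped like the `organic_results` items used in the walkthrough (they are not real search results), and `cross_reference_per_query` is a hypothetical variant that counts a domain at most once per query, whereas the walkthrough counts every hit.

```python
from collections import Counter
from urllib.parse import urlparse

# Invented placeholder data, NOT real search results.
SAMPLE_RESULTS = {
    'vector database': [
        {'link': 'https://docs.example.com/vector-db', 'title': 'Vector DB guide',
         'snippet': 'x' * 120},
        {'link': 'https://blog.example.org/post', 'title': 'A blog post',
         'snippet': 'short'},
    ],
    'vector database overview': [
        {'link': 'https://docs.example.com/overview', 'title': 'Overview',
         'snippet': 'y' * 60},
        {'link': 'https://docs.example.com/faq', 'title': 'FAQ', 'snippet': ''},
    ],
    'vector database alternatives': [
        {'link': 'https://www.example.net/list', 'title': 'Alternatives',
         'snippet': 'z' * 80},
    ],
}


def cross_reference_per_query(all_results: dict, min_queries: int = 2) -> list:
    """Keep domains that appear in at least `min_queries` distinct queries."""
    counts = Counter()
    first_hit = {}
    for results in all_results.values():
        seen = set()  # domains already counted for this query
        for r in results:
            domain = urlparse(r.get('link', '')).netloc
            if not domain:
                continue
            first_hit.setdefault(domain, r)  # remember the first record per domain
            if domain not in seen:
                seen.add(domain)
                counts[domain] += 1
    return [
        {**first_hit[d], 'domain': d, 'cross_ref_count': n}
        for d, n in counts.most_common() if n >= min_queries
    ]


def confidence(source: dict) -> int:
    """Additive score mirroring the walkthrough: 40 + 20 + 20 + 20 at most."""
    score = min(source['cross_ref_count'] * 10, 40)
    snippet_len = len(source.get('snippet', ''))
    score += 20 if snippet_len > 100 else 10 if snippet_len > 50 else 0
    trusted = ('wikipedia', 'github', 'stackoverflow', 'docs.')
    if any(t in source['domain'].lower() for t in trusted):
        score += 20
    if source.get('title'):
        score += 20
    return min(score, 100)


reliable = cross_reference_per_query(SAMPLE_RESULTS)
for src in reliable:
    print(src['domain'], src['cross_ref_count'], confidence(src))
# prints: docs.example.com 2 80
```

With the counting used in the walkthrough, the same domain would get a count of 3 (and 90 points), because two of its three hits come from a single query. Which behavior is preferable depends on whether repeated hits within one query should count as independent agreement.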