Overview
YaCy provides free P2P search via yacy_expert with llama.cpp, but its results become inconsistent at volume and often miss recent content. This workflow uses YaCy for broad discovery, then validates and enriches results through Scavio's structured API. The LLM is then grounded in verified, fresh data regardless of which provider supplied it.
Trigger
Every LLM grounding request that needs web context.
Schedule
On-demand
Workflow Steps
Query YaCy P2P Index
Send the query to the local YaCy instance. Collect results with URLs, titles, and snippets.
Score YaCy Results
Check the result count and freshness. If YaCy returns fewer than 3 results, or its results are older than 30 days, flag the query for enrichment.
Enrich via Scavio
For flagged queries, call the Scavio search API to get fresh, structured results, including AI Overview and Knowledge Graph data.
Merge and Deduplicate
Combine YaCy and Scavio results, deduplicate by URL, rank by freshness and relevance.
Format for LLM Context
Format the merged results as a grounding context block for the LLM prompt.
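The scoring and merge steps above can be sketched as follows. This is a minimal sketch under stated assumptions, not the workflow's actual implementation. It assumes each result dict carries an optional `pub_date` string in RFC 2822 format; that key is hypothetical, since the implementations in this document do not extract a date from either provider. It also assumes a query counts as stale only when no result is newer than the threshold, and that undated results count as stale. The names `parse_date`, `needs_enrichment`, `normalize_url`, and `merge_results` are illustrative. Ranking here uses freshness only, with the original result order as the tie-break in place of a real relevance score.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from urllib.parse import urlsplit, urlunsplit


def parse_date(value):
    """Parse an RFC 2822 date string; return None if missing or unparseable."""
    if not value:
        return None
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    # Treat naive timestamps as UTC so comparisons never mix naive and aware values.
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def needs_enrichment(results, min_results=3, max_age_days=30, now=None):
    """Flag a query when there are too few results or none of them is recent."""
    if len(results) < min_results:
        return True
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    dates = [parse_date(r.get("pub_date")) for r in results]
    # Assumption: undated results count as stale, since freshness cannot be confirmed.
    return not any(d is not None and d >= cutoff for d in dates)


def normalize_url(url):
    """Lowercase scheme and host, drop the fragment and any trailing slash."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/")
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def merge_results(primary, secondary):
    """Deduplicate by normalized URL, then sort newest first (undated last)."""
    seen, merged = set(), []
    for r in primary + secondary:
        key = normalize_url(r["url"])
        if key not in seen:
            seen.add(key)
            merged.append(r)
    oldest = datetime.min.replace(tzinfo=timezone.utc)
    # sorted() is stable, so results with equal dates keep their incoming order.
    return sorted(merged, key=lambda r: parse_date(r.get("pub_date")) or oldest, reverse=True)
```

If result dicts were extended with a date, `needs_enrichment(yacy_results)` could stand in for the plain `len(yacy_results) < 3` check in `grounding_pipeline`, and `merge_results(yacy_results, scavio_results)` for the exact-match URL deduplication.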
Python Implementation
import os

import requests

API_KEY = os.environ["SCAVIO_API_KEY"]
H = {"x-api-key": API_KEY, "Content-Type": "application/json"}
YACY_URL = os.environ.get("YACY_URL", "http://localhost:8090")


def yacy_search(query: str) -> list:
    """Search local YaCy P2P index."""
    try:
        resp = requests.get(
            f"{YACY_URL}/yacysearch.json",
            params={"query": query, "maximumRecords": 10},
            timeout=5,
        )
        resp.raise_for_status()
        channels = resp.json().get("channels", [{}])
        return [{"title": r.get("title", ""), "url": r.get("link", ""), "snippet": r.get("description", "")}
                for r in channels[0].get("items", [])]
    except Exception:
        # An unreachable or malformed YaCy response counts as "no results",
        # so the pipeline falls through to Scavio.
        return []


def scavio_search(query: str) -> list:
    """Fetch structured results from the Scavio search API."""
    resp = requests.post(
        "https://api.scavio.dev/api/v1/search",
        headers=H,
        json={"query": query, "country_code": "us"},
        timeout=10,
    )
    # Surface HTTP errors (bad key, quota) instead of silently returning nothing.
    resp.raise_for_status()
    data = resp.json()
    return [{"title": r.get("title", ""), "url": r.get("link", ""), "snippet": r.get("snippet", "")}
            for r in data.get("organic_results", [])]


def grounding_pipeline(query: str) -> str:
    """Query YaCy, enrich via Scavio when results are thin, and format as LLM context."""
    yacy_results = yacy_search(query)
    if len(yacy_results) < 3:
        scavio_results = scavio_search(query)
        all_results = yacy_results + scavio_results
    else:
        all_results = yacy_results
    # Deduplicate by URL, keeping the first occurrence
    seen = set()
    unique = []
    for r in all_results:
        if r["url"] not in seen:
            seen.add(r["url"])
            unique.append(r)
    # Format as LLM context
    context = "\n\n".join(f"[{r['title']}]({r['url']}): {r['snippet']}" for r in unique[:8])
    return context


if __name__ == "__main__":
    context = grounding_pipeline("transformer architecture attention mechanism")
    print(f"Grounding context ({len(context)} chars):\n{context[:500]}")

JavaScript Implementation
const H = {'x-api-key': process.env.SCAVIO_API_KEY, 'Content-Type': 'application/json'};
const YACY_URL = process.env.YACY_URL || 'http://localhost:8090';

async function yacySearch(query) {
  try {
    const resp = await fetch(YACY_URL + '/yacysearch.json?query=' + encodeURIComponent(query) + '&maximumRecords=10');
    const channels = (await resp.json()).channels || [{}];
    return (channels[0].items || []).map(r => ({title: r.title || '', url: r.link || '', snippet: r.description || ''}));
  } catch {
    return [];
  }
}

async function scavioSearch(query) {
  const resp = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST',
    headers: H,
    body: JSON.stringify({query, country_code: 'us'}),
  });
  return ((await resp.json()).organic_results || []).map(r => ({title: r.title || '', url: r.link || '', snippet: r.snippet || ''}));
}

async function groundingPipeline(query) {
  let results = await yacySearch(query);
  if (results.length < 3) results = results.concat(await scavioSearch(query));
  const seen = new Set();
  const unique = results.filter(r => {
    if (seen.has(r.url)) return false;
    seen.add(r.url);
    return true;
  });
  return unique.slice(0, 8).map(r => '[' + r.title + '](' + r.url + '): ' + r.snippet).join('\n\n');
}

const ctx = await groundingPipeline('transformer architecture attention mechanism');
console.log('Grounding context (' + ctx.length + ' chars):\n' + ctx.slice(0, 500));

Platforms Used
Web search with knowledge graph, PAA, and AI overviews