privacy, self-hosted, llm

SearXNG + Hermes + Qwen: Private Search Stack Tradeoffs

Self-hosted search with SearXNG, Hermes 3, and Qwen 3. Full data sovereignty but noisier results. When private search is worth the tradeoff.


SearXNG for metasearch, Hermes 3 for reasoning, and Qwen 3 for code generation: together they give you a private search assistant that never sends queries to a third-party search API or a hosted LLM. The tradeoff is result quality. Self-hosted metasearch returns noisier data than commercial search APIs, so you need aggressive filtering.

The Private Search Stack

SearXNG is a self-hosted metasearch engine that aggregates results from 70+ search engines without tracking you. Run it on your own server and it queries Google, Bing, DuckDuckGo, and others on your behalf, so those engines see your server's IP address rather than your browser. Hermes 3 (from Nous Research, served through Ollama) handles reasoning and answer synthesis. Qwen 3 (from Alibaba, the 32B-parameter variant here) handles code generation tasks.
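The division of labor between the two models can be sketched as a small payload builder for Ollama's `/api/generate` endpoint. The `hermes3` tag is the one used in the search pipeline in this post; the `qwen3:32b` tag, the task labels, and the function names are assumptions for illustration, so check `ollama list` for the tags you actually pulled. The sketch uses only the standard library to stay dependency-free.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"

# Assumed Ollama model tags; adjust to whatever `ollama list` shows.
MODELS = {
    "reasoning": "hermes3",   # answer synthesis
    "code": "qwen3:32b",      # code generation
}

def build_payload(task, prompt):
    """Return a non-streaming /api/generate payload for the given task type."""
    if task not in MODELS:
        raise ValueError(f"unknown task type: {task!r}")
    return {"model": MODELS[task], "prompt": prompt, "stream": False}

def generate(task, prompt, timeout=120):
    """POST the payload to a local Ollama server and return the response text."""
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=json.dumps(build_payload(task, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read()).get("response", "")
```

With a local Ollama server running, `generate("code", "Write a Python function that parses an ISO 8601 date")` would go to Qwen 3, while `generate("reasoning", ...)` would go to Hermes 3.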

Total cost: a single VPS with 32GB of RAM that runs SearXNG and Ollama with both models. There are no per-query API fees, and you keep full data sovereignty.
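A back-of-envelope check on the 32GB figure, assuming 4-bit quantized weights, the 8B size of Hermes 3, and a rough 20% allowance for runtime overhead. All three are assumptions for illustration, not measurements, and the real footprint depends on the quantization format and context length.

```python
def weights_gb(params_billion, bits_per_weight):
    """Approximate weight size in GB: parameters x bits, divided by 8 bits per byte."""
    return params_billion * bits_per_weight / 8

def stack_estimate_gb(models, overhead=1.2):
    """Sum weight sizes for (params_billion, bits) pairs, scaled by an assumed overhead factor."""
    return sum(weights_gb(p, b) for p, b in models) * overhead

# Assumed sizes: Qwen 3 at 32B and Hermes 3 at 8B, both 4-bit quantized.
estimate = stack_estimate_gb([(32, 4), (8, 4)])
print(f"Estimated model memory: {estimate:.0f} GB of 32 GB")
```

Under these assumptions the two models need roughly 24GB, which leaves about 8GB for the operating system, SearXNG, and the models' context caches. A higher-precision quantization or a larger Hermes 3 variant would not fit.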

Setting Up SearXNG

Bash
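# Docker Compose for SearXNG
# docker-compose.yml (save this as a separate file, uncommented)
# services:
#   searxng:
#     image: searxng/searxng:latest
#     ports:
#       - "8080:8080"
#     volumes:
#       - ./searxng:/etc/searxng
#     environment:
#       - SEARXNG_BASE_URL=http://localhost:8080

# Start SearXNG
docker compose up -d

# The JSON output format is off by default. Add "json" under search.formats
# in ./searxng/settings.yml and restart, or the request below returns 403.

# Test it
curl -s "http://localhost:8080/search?q=test&format=json" | python3 -m json.tool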

Connecting SearXNG to LLMs

Python
import requests

SEARXNG_URL = "http://localhost:8080"
OLLAMA_URL = "http://localhost:11434"

def private_search(query):
    """Search via SearXNG, then synthesize an answer with Hermes 3."""
    # Step 1: Search (needs "json" enabled under search.formats in settings.yml)
    resp = requests.get(f"{SEARXNG_URL}/search",
        params={"q": query, "format": "json"},
        timeout=15
    )
    resp.raise_for_status()
    results = resp.json().get("results", [])[:5]
    # Keep the prompt small: title plus the first 200 characters of each snippet
    context = "\n".join([
        f"- {item.get('title', '')}: {(item.get('content') or '')[:200]}"
        for item in results
    ])

    # Step 2: Synthesize with Hermes 3
    answer = requests.post(f"{OLLAMA_URL}/api/generate",
        json={
            "model": "hermes3",
            "prompt": f"Based on these search results:\n{context}\n\nAnswer: {query}",
            "stream": False,
        },
        timeout=120  # local generation, especially on CPU, can exceed 30 seconds
    )
    answer.raise_for_status()
    return answer.json().get("response", "")

print(private_search("best private search engine 2026"))

The Quality Problem

SearXNG aggregates from multiple engines, but the results are noisier than commercial search APIs. Google rate-limits SearXNG instances aggressively, so you often get results from secondary engines (Brave, Mojeek, Qwant) that have smaller indexes. Result quality drops noticeably on niche queries.

If you need consistent result quality without managing SearXNG uptime and engine rotation, a commercial search API returns cleaner data with less operational overhead.

Python
import os
import requests

# Commercial alternative: consistent quality, no self-hosting
H = {"x-api-key": os.environ["SCAVIO_API_KEY"]}

def reliable_search(query):
    """Scavio API: no self-hosting, structured JSON."""
    r = requests.post("https://api.scavio.dev/api/v1/search",
        headers=H,
        json={"platform": "google", "query": query},
        timeout=10
    )
    r.raise_for_status()
    return r.json().get("organic", [])

# Same query, different quality levels
# (private_search is defined in the previous block)
private_answer = private_search("mcp server setup guide")
api_results = reliable_search("mcp server setup guide")
print(f"Private (noisy but sovereign): {private_answer[:200]}")
print(f"API (clean but third-party): {len(api_results)} organic results")

When to Use Each Approach

Use the private stack when queries contain sensitive data (medical, legal, financial), when compliance rules prevent you from sending queries to third-party APIs, or when you need full audit trails.

Use a commercial API when result quality matters more than data sovereignty, when you need multi-platform data (Amazon, YouTube, Reddit), or when you do not want to maintain SearXNG infrastructure.

Hybrid Architecture

Run SearXNG for sensitive queries and route non-sensitive queries to a commercial API. Tag each query with a sensitivity level and route accordingly. This gives you data sovereignty for the queries that need it and better result quality for the rest.
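A minimal sketch of that routing, assuming a keyword-based sensitivity check. The keyword list, the `classify` and `route` names, and the two sensitivity labels are all hypothetical; a real deployment would more likely rely on explicit tags from the caller or a classifier tuned to its compliance rules.

```python
import re

# Hypothetical keyword list; replace with terms from your own compliance policy.
SENSITIVE_TERMS = ("diagnosis", "prescription", "lawsuit", "contract", "salary", "tax", "ssn")
SENSITIVE_RE = re.compile(
    r"\b(" + "|".join(map(re.escape, SENSITIVE_TERMS)) + r")\b",
    re.IGNORECASE,
)

def classify(query, tag=None):
    """Return 'sensitive' or 'public'. An explicit tag from the caller wins over keywords."""
    if tag in ("sensitive", "public"):
        return tag
    return "sensitive" if SENSITIVE_RE.search(query) else "public"

def route(query, private_backend, api_backend, tag=None):
    """Send sensitive queries to the private backend and everything else to the API backend."""
    level = classify(query, tag)
    backend = private_backend if level == "sensitive" else api_backend
    return level, backend(query)
```

With the functions defined earlier in this post, the call would look like `route(query, private_search, reliable_search)`. Keyword matching misses sensitive queries that avoid the listed terms, so when the caller can supply a tag, passing `tag="sensitive"` explicitly is the safer path.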