完成如何使用 llama.cpp 设置 yacy expert 进行本地搜索教程需要多长时间？

大多数开发者在15到30分钟内完成本教程。您需要一个Scavio API密钥（免费套餐即可）和可用的Python或JavaScript环境。

开始前需要准备什么？

Docker已安装. llama.cpp 至少 8GB RAM. GGUF 模型文件（例如 Mistral 7B Q4）. 已安装 Python 3.9+. 用于后备搜索的 Scavio API 密钥. Scavio API密钥注册即送50个免费积分。

我可以用免费套餐运行本教程吗？

可以。免费套餐注册即送50个积分，完全足够完成本教程并构建一个可运行的原型解决方案。

这支持哪些框架？

Scavio提供原生LangChain包（langchain-scavio）、MCP服务器以及适用于任何HTTP客户端的REST API。本教程使用 the raw REST API, 但您可以根据需要适配您选择的框架。

YaCy 专家 + llama.cpp 本地搜索 (2026)

YaCy 是一个去中心化的点对点搜索引擎，可以在不依赖任何中央服务器的情况下对网络进行爬行和索引。与用于本地 LLM 推理的 llama.cpp 相结合，您可以获得完全离线的 AI 搜索管道，且 API 成本为零。代价是索引质量——YaCy 索引了其同行共享的内容，这远远小于 Google 或 Bing。本教程设置 YaCy，通过 yacy_expert 桥将其连接到 llama.cpp，并为 YaCy 覆盖范围较薄的查询添加 Scavio 搜索回退。成本：本地查询 0 美元，每个 Scavio 后备 0.005 美元。

前置条件

Docker已安装
llama.cpp 至少 8GB RAM
GGUF 模型文件（例如 Mistral 7B Q4）
已安装 Python 3.9+
用于后备搜索的 Scavio API 密钥

操作指南

步骤 1: 在 Docker 中启动 YaCy

将 YaCy 作为 Docker 容器运行。管理界面在端口 8090 上运行，搜索 API 在端口 8090/yacysearch.json 上运行。

Bash

# Pull and run YaCy
docker run -d --name yacy \
  -p 8090:8090 \
  -v yacy_data:/opt/yacy_search_server/DATA \
  yacy/yacy_search_server:latest

# Wait for startup
echo 'Waiting for YaCy to initialize...'
sleep 15

# Test the search API
curl -s 'http://localhost:8090/yacysearch.json?query=python+programming&maximumRecords=3' | python3 -m json.tool | head -20

# Seed the index with some crawl targets
curl -s 'http://localhost:8090/Crawler_p.html?crawlingURL=https://docs.python.org&crawlingDepth=2&range=wide'

步骤 2: 设置 llama.cpp 服务器

将 llama.cpp 作为 OpenAI 兼容的 API 服务器运行。这处理 LLM 推断以总结搜索结果。

Bash

# Download and build llama.cpp (or use pre-built binary)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp && make -j$(nproc)

# Start the server with your GGUF model
./llama-server \
  --model ~/models/mistral-7b-instruct-v0.3.Q4_K_M.gguf \
  --host 0.0.0.0 \
  --port 8080 \
  --ctx-size 4096 \
  --n-gpu-layers 35

# Test it
curl -s http://localhost:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model": "local", "messages": [{"role": "user", "content": "Hello"}], "max_tokens": 50}'

步骤 3: 在 Python 中构建 yacy_expert 桥

创建一个 Python 脚本来查询 YaCy，将结果格式化为上下文，并将它们发送到 llama.cpp 以获得可靠的答案。

Python

import requests, os

YACY_URL = 'http://localhost:8090/yacysearch.json'
LLAMA_URL = 'http://localhost:8080/v1/chat/completions'
SCAVIO_KEY = os.environ.get('SCAVIO_API_KEY', '')

def yacy_search(query: str, count: int = 5) -> list:
    resp = requests.get(YACY_URL, params={
        'query': query, 'maximumRecords': count, 'resource': 'global'
    }, timeout=10)
    channels = resp.json().get('channels', [{}])
    items = channels[0].get('items', []) if channels else []
    return [{'title': r.get('title', ''), 'snippet': r.get('description', ''),
             'url': r.get('link', '')} for r in items]

def scavio_fallback(query: str, count: int = 5) -> list:
    if not SCAVIO_KEY:
        return []
    resp = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json'},
        json={'query': query, 'country_code': 'us', 'num_results': count})
    return [{'title': r['title'], 'snippet': r.get('snippet', ''),
             'url': r['link']} for r in resp.json().get('organic_results', [])[:count]]

def search(query: str) -> list:
    results = yacy_search(query)
    if len(results) < 2:
        print('YaCy coverage thin, falling back to Scavio ($0.005)')
        results = scavio_fallback(query) + results
    return results

results = search('python asyncio tutorial')
for r in results:
    print(f'  {r["title"][:60]}')

步骤 4: 添加 LLM 支持的答案生成

将搜索结果发送到 llama.cpp 以生成有根据的引用答案。法学硕士仅总结搜索结果。

Python

def ask(query: str) -> str:
    results = search(query)
    if not results:
        return 'No results found in YaCy or fallback.'
    context = '\n\n'.join(
        f'[{i+1}] {r["title"]}\n{r["snippet"]}\nSource: {r["url"]}'
        for i, r in enumerate(results)
    )
    resp = requests.post(LLAMA_URL, json={
        'model': 'local',
        'messages': [
            {'role': 'system', 'content': 'Answer using ONLY the search results below. Cite sources as [1], [2], etc.'},
            {'role': 'user', 'content': f'Search results:\n{context}\n\nQuestion: {query}'}
        ],
        'max_tokens': 512,
        'temperature': 0.3
    }, timeout=60)
    answer = resp.json()['choices'][0]['message']['content']
    return answer

print(ask('How do I use asyncio gather in Python?'))

Python 示例

Python

import requests, os

YACY = 'http://localhost:8090/yacysearch.json'
LLAMA = 'http://localhost:8080/v1/chat/completions'
SCAVIO_KEY = os.environ.get('SCAVIO_API_KEY', '')

def search(query, count=5):
    results = requests.get(YACY, params={'query': query, 'maximumRecords': count}).json()
    items = results.get('channels', [{}])[0].get('items', [])
    if len(items) < 2 and SCAVIO_KEY:
        resp = requests.post('https://api.scavio.dev/api/v1/search',
            headers={'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json'},
            json={'query': query, 'country_code': 'us', 'num_results': count})
        items = [{'title': r['title'], 'description': r.get('snippet', ''), 'link': r['link']}
                 for r in resp.json().get('organic_results', [])]
    return items

def ask(query):
    results = search(query)
    ctx = '\n'.join(f'[{i+1}] {r.get("title","")}: {r.get("description","")}' for i, r in enumerate(results))
    resp = requests.post(LLAMA, json={'model': 'local', 'messages': [
        {'role': 'system', 'content': 'Answer from search results only. Cite [1],[2].'},
        {'role': 'user', 'content': f'{ctx}\n\nQ: {query}'}], 'max_tokens': 512})
    return resp.json()['choices'][0]['message']['content']

print(ask('python asyncio gather example'))

JavaScript 示例

JavaScript

const YACY = 'http://localhost:8090/yacysearch.json';
const LLAMA = 'http://localhost:8080/v1/chat/completions';
const SCAVIO_KEY = process.env.SCAVIO_API_KEY;

async function search(query, count = 5) {
  const yacyResp = await fetch(`${YACY}?query=${encodeURIComponent(query)}&maximumRecords=${count}`);
  let items = (await yacyResp.json()).channels?.[0]?.items || [];
  if (items.length < 2 && SCAVIO_KEY) {
    const resp = await fetch('https://api.scavio.dev/api/v1/search', {
      method: 'POST',
      headers: { 'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json' },
      body: JSON.stringify({ query, country_code: 'us', num_results: count })
    });
    const data = await resp.json();
    items = (data.organic_results || []).map(r => ({ title: r.title, description: r.snippet, link: r.link }));
  }
  return items;
}

async function ask(query) {
  const results = await search(query);
  const ctx = results.map((r, i) => `[${i+1}] ${r.title}: ${r.description}`).join('\n');
  const resp = await fetch(LLAMA, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model: 'local', messages: [
      { role: 'system', content: 'Answer from search results only.' },
      { role: 'user', content: `${ctx}\n\nQ: ${query}` }], max_tokens: 512 })
  });
  console.log((await resp.json()).choices[0].message.content);
}

ask('python asyncio gather example');

预期输出

JSON

YaCy coverage thin, falling back to Scavio ($0.005)
  Python asyncio.gather() documentation
  Real Python: Async IO in Python
  Stack Overflow: How to use asyncio.gather

Based on the search results, asyncio.gather() runs multiple coroutines
concurrently and waits for all to complete [1]. You pass awaitable objects
as arguments: results = await asyncio.gather(coro1(), coro2()) [2]...

前置条件

Docker已安装
llama.cpp 至少 8GB RAM
GGUF 模型文件（例如 Mistral 7B Q4）
已安装 Python 3.9+
用于后备搜索的 Scavio API 密钥

操作指南

步骤 1: 在 Docker 中启动 YaCy

将 YaCy 作为 Docker 容器运行。管理界面在端口 8090 上运行，搜索 API 在端口 8090/yacysearch.json 上运行。

Bash

# Pull and run YaCy
docker run -d --name yacy \
  -p 8090:8090 \
  -v yacy_data:/opt/yacy_search_server/DATA \
  yacy/yacy_search_server:latest

# Wait for startup
echo 'Waiting for YaCy to initialize...'
sleep 15

# Test the search API
curl -s 'http://localhost:8090/yacysearch.json?query=python+programming&maximumRecords=3' | python3 -m json.tool | head -20

# Seed the index with some crawl targets
curl -s 'http://localhost:8090/Crawler_p.html?crawlingURL=https://docs.python.org&crawlingDepth=2&range=wide'

步骤 2: 设置 llama.cpp 服务器

将 llama.cpp 作为 OpenAI 兼容的 API 服务器运行。这处理 LLM 推断以总结搜索结果。

Bash

# Download and build llama.cpp (or use pre-built binary)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp && make -j$(nproc)

# Start the server with your GGUF model
./llama-server \
  --model ~/models/mistral-7b-instruct-v0.3.Q4_K_M.gguf \
  --host 0.0.0.0 \
  --port 8080 \
  --ctx-size 4096 \
  --n-gpu-layers 35

# Test it
curl -s http://localhost:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model": "local", "messages": [{"role": "user", "content": "Hello"}], "max_tokens": 50}'

步骤 3: 在 Python 中构建 yacy_expert 桥

创建一个 Python 脚本来查询 YaCy，将结果格式化为上下文，并将它们发送到 llama.cpp 以获得可靠的答案。

Python

import requests, os

YACY_URL = 'http://localhost:8090/yacysearch.json'
LLAMA_URL = 'http://localhost:8080/v1/chat/completions'
SCAVIO_KEY = os.environ.get('SCAVIO_API_KEY', '')

def yacy_search(query: str, count: int = 5) -> list:
    resp = requests.get(YACY_URL, params={
        'query': query, 'maximumRecords': count, 'resource': 'global'
    }, timeout=10)
    channels = resp.json().get('channels', [{}])
    items = channels[0].get('items', []) if channels else []
    return [{'title': r.get('title', ''), 'snippet': r.get('description', ''),
             'url': r.get('link', '')} for r in items]

def scavio_fallback(query: str, count: int = 5) -> list:
    if not SCAVIO_KEY:
        return []
    resp = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json'},
        json={'query': query, 'country_code': 'us', 'num_results': count})
    return [{'title': r['title'], 'snippet': r.get('snippet', ''),
             'url': r['link']} for r in resp.json().get('organic_results', [])[:count]]

def search(query: str) -> list:
    results = yacy_search(query)
    if len(results) < 2:
        print('YaCy coverage thin, falling back to Scavio ($0.005)')
        results = scavio_fallback(query) + results
    return results

results = search('python asyncio tutorial')
for r in results:
    print(f'  {r["title"][:60]}')

步骤 4: 添加 LLM 支持的答案生成

将搜索结果发送到 llama.cpp 以生成有根据的引用答案。法学硕士仅总结搜索结果。

Python

def ask(query: str) -> str:
    results = search(query)
    if not results:
        return 'No results found in YaCy or fallback.'
    context = '\n\n'.join(
        f'[{i+1}] {r["title"]}\n{r["snippet"]}\nSource: {r["url"]}'
        for i, r in enumerate(results)
    )
    resp = requests.post(LLAMA_URL, json={
        'model': 'local',
        'messages': [
            {'role': 'system', 'content': 'Answer using ONLY the search results below. Cite sources as [1], [2], etc.'},
            {'role': 'user', 'content': f'Search results:\n{context}\n\nQuestion: {query}'}
        ],
        'max_tokens': 512,
        'temperature': 0.3
    }, timeout=60)
    answer = resp.json()['choices'][0]['message']['content']
    return answer

print(ask('How do I use asyncio gather in Python?'))

Python 示例

Python

import requests, os

YACY = 'http://localhost:8090/yacysearch.json'
LLAMA = 'http://localhost:8080/v1/chat/completions'
SCAVIO_KEY = os.environ.get('SCAVIO_API_KEY', '')

def search(query, count=5):
    results = requests.get(YACY, params={'query': query, 'maximumRecords': count}).json()
    items = results.get('channels', [{}])[0].get('items', [])
    if len(items) < 2 and SCAVIO_KEY:
        resp = requests.post('https://api.scavio.dev/api/v1/search',
            headers={'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json'},
            json={'query': query, 'country_code': 'us', 'num_results': count})
        items = [{'title': r['title'], 'description': r.get('snippet', ''), 'link': r['link']}
                 for r in resp.json().get('organic_results', [])]
    return items

def ask(query):
    results = search(query)
    ctx = '\n'.join(f'[{i+1}] {r.get("title","")}: {r.get("description","")}' for i, r in enumerate(results))
    resp = requests.post(LLAMA, json={'model': 'local', 'messages': [
        {'role': 'system', 'content': 'Answer from search results only. Cite [1],[2].'},
        {'role': 'user', 'content': f'{ctx}\n\nQ: {query}'}], 'max_tokens': 512})
    return resp.json()['choices'][0]['message']['content']

print(ask('python asyncio gather example'))

JavaScript 示例

JavaScript

const YACY = 'http://localhost:8090/yacysearch.json';
const LLAMA = 'http://localhost:8080/v1/chat/completions';
const SCAVIO_KEY = process.env.SCAVIO_API_KEY;

async function search(query, count = 5) {
  const yacyResp = await fetch(`${YACY}?query=${encodeURIComponent(query)}&maximumRecords=${count}`);
  let items = (await yacyResp.json()).channels?.[0]?.items || [];
  if (items.length < 2 && SCAVIO_KEY) {
    const resp = await fetch('https://api.scavio.dev/api/v1/search', {
      method: 'POST',
      headers: { 'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json' },
      body: JSON.stringify({ query, country_code: 'us', num_results: count })
    });
    const data = await resp.json();
    items = (data.organic_results || []).map(r => ({ title: r.title, description: r.snippet, link: r.link }));
  }
  return items;
}

async function ask(query) {
  const results = await search(query);
  const ctx = results.map((r, i) => `[${i+1}] ${r.title}: ${r.description}`).join('\n');
  const resp = await fetch(LLAMA, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model: 'local', messages: [
      { role: 'system', content: 'Answer from search results only.' },
      { role: 'user', content: `${ctx}\n\nQ: ${query}` }], max_tokens: 512 })
  });
  console.log((await resp.json()).choices[0].message.content);
}

ask('python asyncio gather example');

预期输出

JSON

YaCy coverage thin, falling back to Scavio ($0.005)
  Python asyncio.gather() documentation
  Real Python: Async IO in Python
  Stack Overflow: How to use asyncio.gather

Based on the search results, asyncio.gather() runs multiple coroutines
concurrently and waits for all to complete [1]. You pass awaitable objects
as arguments: results = await asyncio.gather(coro1(), coro2()) [2]...

如何使用 llama.cpp 设置 YaCy Expert 进行本地搜索

前置条件

操作指南

步骤 1: 在 Docker 中启动 YaCy

步骤 2: 设置 llama.cpp 服务器

步骤 3: 在 Python 中构建 yacy_expert 桥

步骤 4: 添加 LLM 支持的答案生成

Python 示例

JavaScript 示例

预期输出

相关教程

常见问题

完成如何使用 llama.cpp 设置 yacy expert 进行本地搜索教程需要多长时间？

开始前需要准备什么？

我可以用免费套餐运行本教程吗？

这支持哪些框架？

相关资源

Google I/O 2026 AI模式变化后最佳搜索API

YaCy 搜索与 LLM 数据锚定流水线

搜索 API 供应商格局（2026）

2026年最佳单API大模型Wiki构建工具

免费搜索API层级对比

Search APIs (Scavio, Tavily, SerpAPI) vs Headless Browser (Playwright, Puppeteer, Browserbase)

开始构建

如何使用 llama.cpp 设置 YaCy Expert 进行本地搜索

前置条件

操作指南

步骤 1: 在 Docker 中启动 YaCy

步骤 2: 设置 llama.cpp 服务器

步骤 3: 在 Python 中构建 yacy_expert 桥

步骤 4: 添加 LLM 支持的答案生成

Python 示例

JavaScript 示例

预期输出

相关教程

常见问题

完成如何使用 llama.cpp 设置 yacy expert 进行本地搜索教程需要多长时间？

开始前需要准备什么？

我可以用免费套餐运行本教程吗？

这支持哪些框架？

相关资源

Google I/O 2026 AI模式变化后最佳搜索API

YaCy 搜索与 LLM 数据锚定流水线

搜索 API 供应商格局（2026）

2026年最佳单API大模型Wiki构建工具

免费搜索API层级对比

Search APIs (Scavio, Tavily, SerpAPI) vs Headless Browser (Playwright, Puppeteer, Browserbase)

开始构建