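How long does it take to complete the "How to Ground a Local LLM with a Search API" tutorial?

Most developers complete this tutorial in 15 to 30 minutes. You need a Scavio API key (the free tier is enough) and a working Python or JavaScript environment.

What do I need before I start?

A running local LLM (Ollama, a llama.cpp server, or vLLM), Python 3.9+ installed, the requests library installed, and a Scavio API key from scavio.dev. Signing up for a Scavio API key comes with 50 free credits.

Can I run this tutorial on the free tier?

Yes. The free tier includes 50 credits at sign-up, which is enough to complete this tutorial and build a working prototype.

Which frameworks are supported?

Scavio provides a native LangChain package (langchain-scavio), an MCP server, and a REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt it to the framework of your choice.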

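Local LLMs with a Search API (2026)

Local LLMs running on llama.cpp, Ollama, or vLLM are powerful but frozen in time. They hallucinate about current events, recent releases, and real-time data because their training data has a cutoff. Adding a search API gives them real-time grounding: before the LLM answers, the pipeline searches the web and passes the latest results to the model as context. This tutorial works with any OpenAI-compatible local LLM endpoint. Cost: $0.005 per grounded answer.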
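To make the per-answer cost concrete, the sketch below estimates spend for a batch of questions. It assumes the $0.005 figure applies to every grounded answer and that answers given without a search cost nothing. The function name and its parameters are illustrative and not part of any Scavio SDK.

```python
COST_PER_GROUNDED_ANSWER = 0.005  # USD, the figure quoted in this tutorial

def estimate_search_cost(total_questions: int, grounded_share: float = 1.0) -> float:
    """Estimate search spend in USD for a batch of questions.

    grounded_share is the fraction of questions (0.0 to 1.0) that trigger a search.
    """
    if total_questions < 0:
        raise ValueError('total_questions must be non-negative')
    if not 0.0 <= grounded_share <= 1.0:
        raise ValueError('grounded_share must be between 0.0 and 1.0')
    grounded_questions = round(total_questions * grounded_share)
    return round(grounded_questions * COST_PER_GROUNDED_ANSWER, 4)

print(estimate_search_cost(1000))       # every question grounded
print(estimate_search_cost(1000, 0.4))  # only 40% of questions grounded
```

Lowering the share of questions that trigger a search reduces the estimate proportionally.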

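Prerequisites

A running local LLM (Ollama, llama.cpp server, or vLLM)
Python 3.9+ installed
The requests library installed
A Scavio API key from scavio.dev

Walkthrough

Step 1: Connect to your local LLM

Set up the connection to your local LLM. This works with any OpenAI-compatible endpoint (Ollama, llama.cpp server, vLLM).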

Python

import requests

# Common local LLM endpoints:
# Ollama:     http://localhost:11434/v1/chat/completions
# llama.cpp:  http://localhost:8080/v1/chat/completions
# vLLM:       http://localhost:8000/v1/chat/completions

LLM_URL = 'http://localhost:11434/v1/chat/completions'  # Ollama default
LLM_MODEL = 'llama3'  # or 'mistral', 'codellama', etc.

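def ask_llm(messages: list, max_tokens: int = 512) -> str:
    """Send chat messages to the local LLM and return the reply text."""
    resp = requests.post(LLM_URL, json={
        'model': LLM_MODEL,
        'messages': messages,
        'max_tokens': max_tokens,
        'temperature': 0.3
    }, timeout=120)
    resp.raise_for_status()  # raise on HTTP error responses instead of failing on a missing key
    return resp.json()['choices'][0]['message']['content']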

# Test connection
try:
    answer = ask_llm([{'role': 'user', 'content': 'Say hello in one word.'}], max_tokens=10)
    print(f'LLM connected: {answer}')
except Exception as e:
    print(f'LLM connection error: {e}')
    print('Make sure Ollama/llama.cpp is running.')
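
If you switch between backends, a small lookup can replace editing LLM_URL by hand. The sketch below is one possible approach: it uses the default endpoints listed in the comments above, and the helper name endpoint_for is hypothetical. Adjust host and port if your server runs elsewhere.

```python
# Default endpoints as listed in this tutorial; adjust host and port to your setup.
DEFAULT_ENDPOINTS = {
    'ollama': 'http://localhost:11434/v1/chat/completions',
    'llama.cpp': 'http://localhost:8080/v1/chat/completions',
    'vllm': 'http://localhost:8000/v1/chat/completions',
}

def endpoint_for(backend: str) -> str:
    """Return the chat-completions URL for a named backend (hypothetical helper)."""
    key = backend.strip().lower()
    if key not in DEFAULT_ENDPOINTS:
        known = ', '.join(sorted(DEFAULT_ENDPOINTS))
        raise ValueError(f'Unknown backend {backend!r}; expected one of: {known}')
    return DEFAULT_ENDPOINTS[key]

LLM_URL = endpoint_for('vLLM')  # lookup is case-insensitive
print(LLM_URL)
```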

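Step 2: Add a search grounding function

Build a function that searches the web and formats the results as context for the LLM. The LLM sees only the search snippets, not the full pages.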

Python

import os

SCAVIO_KEY = os.environ['SCAVIO_API_KEY']

def search_context(query: str, count: int = 5) -> str:
    """Search the web and return formatted context for the LLM."""
    resp = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json'},
        json={'query': query, 'country_code': 'us', 'num_results': count},
        timeout=30)
    resp.raise_for_status()  # raise on HTTP error responses
    results = resp.json().get('organic_results', [])
    if not results:
        return 'No search results found.'
    # Number each result so the LLM can cite it as [1], [2], ...
    context = 'Search results (use these to answer accurately):\n\n'
    for i, r in enumerate(results, 1):
        context += f'[{i}] {r.get("title", "")}\n'
        context += f'    {r.get("snippet", "")}\n'
        context += f'    Source: {r.get("link", "")}\n\n'
    return context

# Test
ctx = search_context('Python 3.14 release date')
print(ctx[:300])
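
Local models often run with small context windows, so it can help to cap how much search text reaches the prompt. The sketch below is a minimal variant of the formatting step that trims long snippets and stops once a total character budget is reached. The function name and both limits are assumptions chosen for illustration, and character counts are only a rough proxy for tokens.

```python
def format_results(results: list, max_snippet_chars: int = 300,
                   max_total_chars: int = 2000) -> str:
    """Format search results into numbered context, trimmed to a character budget."""
    blocks = []
    used = 0
    for i, r in enumerate(results, 1):
        snippet = r.get('snippet', '')
        if len(snippet) > max_snippet_chars:
            snippet = snippet[:max_snippet_chars].rstrip() + '...'
        block = f"[{i}] {r.get('title', '')}\n    {snippet}\n    Source: {r.get('link', '')}\n"
        if used + len(block) > max_total_chars:
            break  # stop before exceeding the overall budget
        blocks.append(block)
        used += len(block)
    return '\n'.join(blocks)

# Made-up sample data standing in for real search results.
sample = [
    {'title': 'Example A', 'snippet': 'x' * 500, 'link': 'https://example.com/a'},
    {'title': 'Example B', 'snippet': 'short snippet', 'link': 'https://example.com/b'},
]
print(format_results(sample, max_snippet_chars=40))
```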

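Step 3: Build the grounded answer pipeline

Combine search and the LLM into a single function. The LLM receives the search context and must cite sources in its answer.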

Python

def grounded_answer(question: str) -> dict:
    """Answer a question using search-grounded local LLM."""
    # Step 1: Search for context
    context = search_context(question, count=5)
    # Step 2: Ask LLM with context
    messages = [
        {'role': 'system', 'content': (
            'You are a helpful assistant. Answer ONLY based on the search results provided. '
            'Cite sources as [1], [2], etc. If the search results do not contain the answer, '
            'say "I could not find this information in the search results."'
        )},
        {'role': 'user', 'content': f'{context}\nQuestion: {question}'}
    ]
    answer = ask_llm(messages, max_tokens=512)
    return {
        'question': question,
        'answer': answer,
        'grounded': True,
        'search_cost': 0.005
    }

# Test with a question that requires current data
result = grounded_answer('What is the latest version of Python?')
print(f'Q: {result["question"]}')
print(f'A: {result["answer"]}')
print(f'Grounded: {result["grounded"]}, Cost: ${result["search_cost"]}')
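
The system prompt asks for citations, but a local model may ignore the instruction or cite a source number that does not exist. The sketch below is one way to check an answer after the fact. It assumes citations appear in the [n] form requested by the prompt, and both helper names are hypothetical.

```python
import re

def cited_indices(answer: str) -> set:
    """Return the set of source numbers cited as [n] in the answer."""
    return {int(n) for n in re.findall(r'\[(\d+)\]', answer)}

def check_citations(answer: str, num_sources: int) -> dict:
    """Report which sources are cited and which citations are out of range."""
    cited = cited_indices(answer)
    invalid = sorted(n for n in cited if n < 1 or n > num_sources)
    return {'cited': sorted(cited), 'invalid': invalid, 'has_citation': bool(cited)}

print(check_citations('Python 3.14.0 was released in October 2025 [1].', num_sources=5))
print(check_citations('The answer is in source [7].', num_sources=5))
```

An answer with no citations, or with entries in the invalid list, could be retried or flagged rather than shown as grounded.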

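Step 4: Add smart grounding (search only when needed)

Not every question needs a search. Add a check that decides whether to ground the answer with a search or to answer directly, which saves cost.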

Python

def needs_grounding(question: str) -> bool:
    """Heuristic: does this question need real-time data?"""
    grounding_triggers = [
        'latest', 'current', 'today', '2026', '2025', 'now',
        'price', 'cost', 'version', 'release', 'new', 'update',
        'best', 'top', 'compare', 'vs', 'alternative',
        'how much', 'where to', 'who is',
    ]
    q_lower = question.lower()
    return any(trigger in q_lower for trigger in grounding_triggers)

def smart_answer(question: str) -> dict:
    """Answer with search grounding only when needed."""
    if needs_grounding(question):
        return grounded_answer(question)
    # Direct LLM answer (no search cost)
    messages = [{'role': 'user', 'content': question}]
    answer = ask_llm(messages, max_tokens=512)
    return {
        'question': question,
        'answer': answer,
        'grounded': False,
        'search_cost': 0
    }

# Test both paths
for q in ['What is a Python list comprehension?',
          'What is the latest Python version in 2026?']:
    result = smart_answer(q)
    print(f'[{"GROUNDED" if result["grounded"] else "DIRECT"}] '
          f'${result["search_cost"]} - {q}')
    print(f'  {result["answer"][:100]}...')
    print()
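
One caveat with the substring check in needs_grounding: short triggers also match inside longer words, so 'now' fires on 'know' and 'top' fires on 'stop', which causes unnecessary paid searches. The sketch below is an alternative that matches triggers only as whole words, using a shortened trigger list for illustration. The name needs_grounding_strict is hypothetical.

```python
import re

TRIGGERS = ['latest', 'current', 'today', 'now', 'price', 'new', 'top', 'vs', 'how much']

# \b anchors each trigger to word boundaries so 'now' does not match inside 'know'.
TRIGGER_RE = re.compile(
    r'\b(?:' + '|'.join(re.escape(t) for t in TRIGGERS) + r')\b',
    re.IGNORECASE,
)

def needs_grounding_strict(question: str) -> bool:
    """Return True only when a trigger appears as a whole word or phrase."""
    return TRIGGER_RE.search(question) is not None

print(needs_grounding_strict('How do I know if a list is empty?'))
print(needs_grounding_strict('What is the price of a GPU now?'))
```

The trade-off is that whole-word matching no longer catches inflected forms, so variants such as plurals need to be listed explicitly.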

Python example

Python

import os
import requests

LLM_URL = 'http://localhost:11434/v1/chat/completions'
SCAVIO_KEY = os.environ['SCAVIO_API_KEY']

def search(query, count=5):
    """Return the organic search results for a query."""
    resp = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json'},
        json={'query': query, 'country_code': 'us', 'num_results': count},
        timeout=30)
    resp.raise_for_status()
    return resp.json().get('organic_results', [])

def grounded_ask(question):
    """Answer a question using search snippets as context."""
    results = search(question)
    ctx = '\n'.join(f'[{i+1}] {r["title"]}: {r.get("snippet","")}' for i, r in enumerate(results))
    resp = requests.post(LLM_URL, json={'model': 'llama3', 'messages': [
        {'role': 'system', 'content': 'Answer from search results. Cite [1],[2].'},
        {'role': 'user', 'content': f'{ctx}\n\nQ: {question}'}], 'max_tokens': 512},
        timeout=120)
    resp.raise_for_status()
    return resp.json()['choices'][0]['message']['content']

print(grounded_ask('latest Python version 2026'))
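
Since each search has a cost, repeated identical questions in one session can reuse earlier results. The sketch below wraps any search function with an in-process cache from the standard library. The wrapper name is hypothetical, and fake_search is a stand-in used only so the example runs without an API key.

```python
from functools import lru_cache

def make_cached_search(search_fn, maxsize: int = 128):
    """Wrap a search function so identical queries are only sent once per process."""
    @lru_cache(maxsize=maxsize)
    def cached(query: str, count: int = 5) -> tuple:
        # lru_cache keeps the returned object, so hand back an immutable tuple.
        return tuple(search_fn(query, count))
    return cached

# Stand-in for the real search call, used here only to demonstrate the cache.
calls = []
def fake_search(query, count=5):
    calls.append(query)
    return [{'title': f'Result for {query}', 'snippet': '', 'link': 'https://example.com'}]

cached_search = make_cached_search(fake_search)
cached_search('latest Python version')
cached_search('latest Python version')  # served from the cache, no second call
print(len(calls))
```

Cached results go stale, which works against the goal of fresh answers, so a long-running process would want to call cached_search.cache_clear() periodically.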

JavaScript example

JavaScript

const LLM_URL = 'http://localhost:11434/v1/chat/completions';
const SCAVIO_KEY = process.env.SCAVIO_API_KEY;

async function groundedAsk(question) {
  const searchResp = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST',
    headers: { 'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json' },
    body: JSON.stringify({ query: question, country_code: 'us', num_results: 5 })
  });
  const results = (await searchResp.json()).organic_results || [];
  const ctx = results.map((r, i) => `[${i+1}] ${r.title}: ${r.snippet || ''}`).join('\n');
  const llmResp = await fetch(LLM_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model: 'llama3', messages: [
      { role: 'system', content: 'Answer from search results. Cite [1],[2].' },
      { role: 'user', content: `${ctx}\n\nQ: ${question}` }], max_tokens: 512 })
  });
  return (await llmResp.json()).choices[0].message.content;
}

groundedAsk('latest Python version 2026').then(console.log);

Expected output

Text

LLM connected: Hello

Search results (use these to answer accurately):

[1] Python Release Python 3.14.0
    Python 3.14.0 was released on October 7, 2025...
    Source: https://www.python.org/downloads/release/python-3140/

Q: What is the latest version of Python?
A: According to the search results, the latest version of Python is 3.14.0,
released on October 7, 2025 [1].

[DIRECT] $0 - What is a Python list comprehension?
  A list comprehension is a concise way to create lists...

[GROUNDED] $0.005 - What is the latest Python version in 2026?
  The latest Python version is 3.14.0, released October 2025 [1]...
