DeepSeek V4 Web Search: What Powers It
DeepSeek V4 uses Serper and Exa as search backends. How to replicate the same search grounding architecture with any SERP API.
DeepSeek V4 uses a combination of Serper and Exa as its web search backends, routing queries through a search tool layer that fetches live results before generating responses. When you see DeepSeek cite current sources, the data comes from these third-party SERP providers -- not from DeepSeek crawling the web itself.
How DeepSeek routes search queries
DeepSeek V4 implements search as a tool call within its agentic loop. When the model determines it needs current information, it emits a search tool invocation. The backend dispatches this to Serper for Google SERP data ($50 for 50K queries on the entry tier) and Exa for semantic search (1K free/mo, $5/1K after). The results are injected into context as grounding documents.
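As a rough sketch of what this tool layer could look like, here is an OpenAI-style function-calling schema plus a dispatcher. The tool name, schema shape, and `route_search` helper are illustrative assumptions for this article, not DeepSeek's actual internals:

```python
# Illustrative search tool definition an agentic loop might expose to
# the model. Names and schema shape are assumptions, not DeepSeek's.
WEB_SEARCH_TOOL = {
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Fetch live web results for a query.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string"},
                "num_results": {"type": "integer", "default": 5},
            },
            "required": ["query"],
        },
    },
}

def route_search(tool_call, backends):
    # Dispatch a model-emitted tool call to the configured SERP backend.
    args = tool_call["arguments"]
    backend = backends[args.get("provider", "serper")]
    return backend(args["query"], args.get("num_results", 5))
```

In production, `backends` would map provider names to real HTTP clients like the ones shown in the next section.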
The search grounding architecture
```python
# Simplified version of how DeepSeek-style search grounding works
import os, requests

def search_ground(query, provider="serper"):
    if provider == "serper":
        resp = requests.post("https://google.serper.dev/search", json={
            "q": query, "num": 5
        }, headers={"X-API-KEY": os.environ["SERPER_KEY"]})
        return [{"title": r["title"], "link": r["link"],
                 "snippet": r["snippet"]}
                for r in resp.json().get("organic", [])]
    elif provider == "scavio":
        resp = requests.post("https://api.scavio.dev/api/v1/search",
                             headers={"x-api-key": os.environ["SCAVIO_API_KEY"]},
                             json={"query": query, "num_results": 5})
        return [{"title": r["title"], "link": r["link"],
                 "snippet": r["snippet"]}
                for r in resp.json().get("organic_results", [])]

# Inject results into LLM context
results = search_ground("deepseek v4 release date")
context = "\n".join(f"[{r['title']}]({r['link']}): {r['snippet']}"
                    for r in results)
prompt = f"Based on these sources:\n{context}\n\nAnswer: ..."
```

Why this matters for your own agents
If DeepSeek -- a model with 671B parameters -- still needs live web search for accurate responses, your smaller agent definitely does. The pattern is clear: model generates a search query, API returns structured results, results get injected as context, model synthesizes an answer with citations.
Building your own search-grounded agent
```python
import os, requests, json

SCAVIO_KEY = os.environ["SCAVIO_API_KEY"]
HEADERS = {"x-api-key": SCAVIO_KEY}

def grounded_answer(question):
    # Step 1: Search
    search_resp = requests.post(
        "https://api.scavio.dev/api/v1/search",
        headers=HEADERS,
        json={"query": question, "num_results": 5,
              "include_ai_overview": True},
    )
    data = search_resp.json()
    sources = data.get("organic_results", [])[:5]
    ai_overview = data.get("ai_overview", {})
    # Step 2: Build grounding context
    context_parts = []
    if ai_overview:
        context_parts.append(f"AI Overview: {ai_overview.get('text', '')}")
    for s in sources:
        context_parts.append(f"- {s['title']}: {s['snippet']}")
    return {
        "context": "\n".join(context_parts),
        "sources": [{"title": s["title"], "url": s["link"]}
                    for s in sources],
    }

result = grounded_answer("what search api does deepseek use")
print(json.dumps(result, indent=2))
```

Cost of replicating DeepSeek search at scale
DeepSeek likely processes millions of search queries daily. For a typical AI application handling 10K search-grounded queries per day:
- Serper: $50/mo Pro (500K queries) -- covers it
- Exa: 300K/mo at $5/1K = $1,500/mo
- Scavio: 300K credits at $0.005 = $1,500/mo or $250 plan + overage
- Tavily: $100/mo Startup plan (limited queries)
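As a sanity check on the per-query math above, using the rates from the list (verify against each provider's current pricing page before budgeting):

```python
# 10K search-grounded queries/day over a 30-day month.
MONTHLY_QUERIES = 10_000 * 30

# Per-query rates taken from the comparison above (assumed current).
RATES = {
    "Exa": 5 / 1000,   # $5 per 1K queries
    "Scavio": 0.005,   # $0.005 per credit
}

for name, rate in RATES.items():
    print(f"{name}: ${MONTHLY_QUERIES * rate:,.0f}/mo")  # $1,500/mo each
```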
Key insight
The search provider behind your AI does not need to be the same one DeepSeek uses. What matters is structured JSON output, low latency, and reliable uptime. Serper is Google-only. Exa is semantic-only. A multi-engine API gives you Google, Bing, Maps, Shopping, Reddit, and TikTok from one integration -- which is what most production agents actually need.