Perplexity Sonar Credits Wiped: What Happened (2026)
A Perplexity user reported Sonar API credits wiped without explanation. The structural issue: bundling search with LLM inference creates coupled failure modes. Unbundling is the production fix.
An r/Perplexity post: "SONAR API credits wiped off." The user logged in to find their prepaid Sonar API credits gone without explanation. This is the second time in 2026 that Sonar API reliability issues have surfaced on Reddit, following earlier reports of 503 errors and hallucinated citations.
The pattern
Perplexity Sonar API launched as a search + inference bundle: $5/1,000 requests plus per-token charges for the generated answer. The product targets developers who want "search + answer" in one call. But bundling search with LLM inference creates a coupled failure mode: a problem in either layer, search or inference, fails the entire request, and you cannot retry one layer independently of the other.
Why unbundled search matters for production
When your search API bundles LLM inference, you inherit two failure surfaces: search availability and model availability. A pure search API like Scavio decouples these: if your LLM provider has issues, your search layer continues working. You can retry the LLM call with cached search results instead of losing both.
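That retry pattern can be sketched in a few lines. This is a minimal sketch, not any vendor's SDK: `search_fn` and `llm_fn` are hypothetical callables standing in for your search client and LLM client. Search results are cached per query, so when the LLM fails, only the inference call is retried.

```python
import time

search_cache: dict[str, list[dict]] = {}  # query -> cached search results


def answer_with_retry(query: str, search_fn, llm_fn, retries: int = 3):
    """Search once, then retry only the LLM call on failure."""
    # Cache hit means an LLM outage never re-triggers (or re-bills) a search.
    if query not in search_cache:
        search_cache[query] = search_fn(query)
    context = "\n".join(
        f"- {r['title']}: {r['snippet']}" for r in search_cache[query]
    )
    for attempt in range(retries):
        try:
            return llm_fn(f"Answer using:\n{context}")
        except Exception:
            if attempt == retries - 1:
                raise  # out of retries; surface the LLM error
            time.sleep(2 ** attempt)  # exponential backoff, LLM call only
```

With a bundled API, the equivalent retry repeats the search too; here the search layer is touched exactly once per query no matter how flaky the model endpoint is.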
Migration path for Sonar users
# Sonar bundles search + inference. To unbundle:
# 1. Search via Scavio ($0.005/query)
# 2. Inference via your preferred LLM (Claude, GPT, local)
import os

import requests

resp = requests.post(
    "https://api.scavio.dev/api/v1/search",
    headers={"x-api-key": os.environ["SCAVIO_API_KEY"]},
    json={"query": "latest AI news 2026", "platform": "google", "limit": 5},
    timeout=10,
)
resp.raise_for_status()  # fail fast on search-layer errors
results = resp.json().get("results", [])
# Feed results to any LLM: you choose the model, you control the cost
context = "\n".join(f"- {r['title']}: {r['snippet']}" for r in results)
# answer = your_llm.complete(f"Answer using: {context}")

Cost comparison
Sonar: $5/1,000 requests + ~$0.003-0.008 per response in token charges. Scavio search only: $0.005/query, bring your own LLM. For high-volume agent workloads, unbundling lets you optimize search and inference costs independently. For low-volume conversational use, Sonar's bundled convenience may still justify the premium.
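To make the comparison concrete, here is a back-of-the-envelope calculation using the prices quoted above. The Sonar token charge is taken as $0.005 per response (the middle of the quoted range) and the bring-your-own LLM cost as $0.002 per call; both are illustrative assumptions, since actual token costs vary by model and response length.

```python
def monthly_cost_sonar(queries: int, per_response_usd: float = 0.005) -> float:
    # $5 per 1,000 requests plus per-response token charges (midpoint assumed)
    return queries * (5 / 1000) + queries * per_response_usd


def monthly_cost_unbundled(queries: int, llm_per_call_usd: float = 0.002) -> float:
    # $0.005/query for search, plus whatever your chosen LLM charges per call
    return queries * 0.005 + queries * llm_per_call_usd


q = 100_000  # a high-volume agent workload
print(f"Sonar:     ${monthly_cost_sonar(q):,.2f}")      # Sonar:     $1,000.00
print(f"Unbundled: ${monthly_cost_unbundled(q):,.2f}")  # Unbundled: $700.00
```

The gap widens or narrows entirely with the LLM term, which is the point: unbundled, that term is yours to optimize (smaller models, local inference, caching), while the search term stays fixed and predictable.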
The honest take
Credit wipes and 503 errors are concerning for production use, but Perplexity is actively scaling. The structural issue is coupling: if you need search reliability independent of LLM availability, unbundle. If you want convenience and accept the coupled risk, Sonar works when it works.