Building a Vertical Mini-Perplexity (2026)
An r/buildinpublic post launched Olivepress for stock research. Pin to authoritative sources + Scavio for SERP/social gap + chart-as-tool-call.
An r/buildinpublic post launched Olivepress, a Perplexity-like research tool focused on stock markets, using FRED and FMP as authoritative sources. The pitch is clean: Perplexity hallucinates and pulls irrelevant data; Olivepress pins to known-good sources and outputs native charts, not Python code-slop. The pattern generalizes: vertical Perplexity is a real product category if the wrap does real work.
Why generic Perplexity hits a ceiling for vertical research
Generic Perplexity pulls from the open web. For finance, that means Reddit speculation, random Medium posts, and outdated press releases mixed with authoritative sources. The model can't reliably distinguish FRED data from a forum post unless explicitly grounded. For someone making a trading decision, that uncertainty kills trust.
The vertical-Perplexity fix
Pin to authoritative sources for the vertical:
- Finance: FRED (macro), FMP (fundamentals), SEC EDGAR (filings).
- Real estate: HUD, FRED housing series, county assessors.
- Healthcare: NIH PubMed, CMS, FDA.
- Legal: case law databases, statute directories.
- Government / public sector: state portals, USAGov, EU equivalents.
Reject web noise by default. Layer it in only when the user explicitly asks for it.
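The pinning idea is a few lines of routing code. A minimal sketch for the finance vertical, assuming a hand-curated domain allowlist (the domains and helper names below are illustrative, not from the post):

```python
from urllib.parse import urlparse

# Illustrative allowlist for the finance vertical; every vertical ships its own.
PINNED_DOMAINS = {
    "fred.stlouisfed.org",        # FRED macro series
    "financialmodelingprep.com",  # FMP fundamentals
    "sec.gov",                    # EDGAR filings
}

def is_pinned(url: str) -> bool:
    """True only when the URL's host is a pinned source (or a subdomain of one)."""
    host = urlparse(url).netloc.lower()
    return any(host == d or host.endswith("." + d) for d in PINNED_DOMAINS)

def filter_sources(urls, allow_web_noise=False):
    """Default: authoritative only. Web noise enters only on explicit user intent."""
    return [u for u in urls if allow_web_noise or is_pinned(u)]
```

With the default flag, a FRED series URL passes and a Medium hot-take is dropped; passing `allow_web_noise=True` is the explicit-intent escape hatch.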
Where Scavio fits — the social/SERP layer
Authoritative sources cover the canonical data; they don't cover "what are traders saying about NVDA earnings". Scavio fills that gap:
import os
import requests

H = {'x-api-key': os.environ['SCAVIO_API_KEY']}

def social_layer(ticker):
    # Scoped queries: subreddit chatter plus reputable press coverage.
    queries = [
        f'site:reddit.com r/stocks {ticker} 2026',
        f'site:reddit.com r/wallstreetbets {ticker} 2026',
        f'{ticker} earnings 2026 site:wsj.com OR site:reuters.com',
    ]
    out = []
    for q in queries:
        r = requests.post(
            'https://api.scavio.dev/api/v1/search',
            headers=H,
            json={'query': q, 'include_ai_overview': True},
        ).json()
        # Keep only the top 5 organic hits per query.
        out.append({'query': q, 'results': r.get('organic_results', [])[:5]})
    return out

The cite-or-abstain composition
LLM prompt pattern that actually works for vertical Perplexity:
Answer using ONLY the sources below. Every factual claim
ends with [N] where N is the source index. If sources
contradict, say so. If sources don't answer, say "I don't
know based on these sources."
Sources:
[1] FRED Series GDP-2026Q1: <data>
[2] FMP NVDA fundamentals: <data>
[3] SEC EDGAR NVDA 10-Q: <data>
[4] Reddit WSB top thread: <data>
[5] WSJ earnings article: <data>
Question: Is NVDA a buy at current price given Q1 macro?

Why charts as native tool calls beat Python code-slop
The OP's point: current flagship models often output Python code that the UI then executes to render charts. The result is fragile (code crashes, wrong chart type, mismatched scales). Native chart-render tools the LLM calls directly produce polished output reliably.
Define a chart-render tool:
function: render_chart
args: { type, data, x_label, y_label, title }
LLM call:
render_chart(type="line",
data=fred_series,
x_label="Quarter",
y_label="GDP %",
title="US GDP 2024-2026")
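On the backend, this could be wired up as a standard JSON Schema tool definition plus a dispatcher that hands the validated spec to the frontend instead of exec()ing model-written code. A sketch under that assumption (the schema and `handle_tool_call` are illustrative, not from the post):

```python
# Illustrative tool schema in the JSON Schema style most LLM APIs accept.
RENDER_CHART_TOOL = {
    "name": "render_chart",
    "description": "Render a native chart in the UI. No code execution.",
    "input_schema": {
        "type": "object",
        "properties": {
            "type": {"enum": ["line", "bar", "scatter"]},
            "data": {"type": "array"},
            "x_label": {"type": "string"},
            "y_label": {"type": "string"},
            "title": {"type": "string"},
        },
        "required": ["type", "data", "title"],
    },
}

def handle_tool_call(name, args):
    """Dispatch a model tool call to the UI layer instead of exec()ing code."""
    if name != "render_chart":
        raise ValueError(f"unknown tool: {name}")
    missing = [k for k in ("type", "data", "title") if k not in args]
    if missing:
        raise ValueError(f"missing args: {missing}")
    # In the real app this hands off to Recharts/Plotly; here we just
    # return the validated chart spec the frontend would consume.
    return {"chart_spec": args}
```

The point of the dispatcher is the failure mode: a malformed tool call fails loudly and retriably, instead of a Python script crashing mid-render.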
UI renders via Recharts/Plotly. Polished, no code-slop.

The defensibility lives in curation, not the LLM
Olivepress could swap from Claude to GPT to a local Llama tomorrow without losing what makes it useful. What makes it useful is:
- The pinned source list (FRED + FMP + SEC).
- The LLM citation discipline (every claim cited).
- The native chart-render tools.
- The UX targeted at one user (stock researchers).
None of those are LLM features. All of them are product work.
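The citation discipline in particular is checkable in plain code, which is what makes it product work rather than prompt hope. A minimal sketch, assuming answers follow the [N] convention from the prompt pattern above (the regex and the abstain check are illustrative):

```python
import re

ABSTAIN = "I don't know based on these sources."

def check_citations(answer: str, n_sources: int) -> list[str]:
    """Return a list of problems; an empty list means the answer passes."""
    problems = []
    if answer.strip() == ABSTAIN:
        return problems  # abstaining is always a valid outcome
    cited = {int(m) for m in re.findall(r"\[(\d+)\]", answer)}
    if not cited:
        problems.append("no citations at all")
    out_of_range = [n for n in sorted(cited) if not 1 <= n <= n_sources]
    if out_of_range:
        problems.append(f"cites nonexistent sources: {out_of_range}")
    return problems
```

An answer that fails the check gets regenerated or shown with a warning; either way the user never sees an uncited factual claim presented as grounded.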
Per-product-month MVP cost
Source feeds: ~$5-15/mo (FRED is free; FMP has paid tiers). Scavio: $30/mo for 7K credits. LLM tokens: ~$30-100/mo at MVP volume. Compute (Streamlit / Vercel): $0-20/mo. Total: under $200/mo to ship a working niche-Perplexity. Most of the spend is product work, not infra.
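As a sanity check on the arithmetic, the ranges above sum comfortably under the $200/mo claim (all figures are the post's MVP estimates, not vendor quotes):

```python
# MVP monthly cost ranges from the post (USD/month).
COSTS = {
    "source_feeds": (5, 15),   # FRED free, FMP paid tier
    "scavio":       (30, 30),  # 7K credits
    "llm_tokens":   (30, 100),
    "compute":      (0, 20),   # Streamlit / Vercel
}

low = sum(lo for lo, hi in COSTS.values())   # 65
high = sum(hi for lo, hi in COSTS.values())  # 165
assert high < 200  # "under $200/mo" holds even at the top of every range
```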
The mini-Perplexity verticals that work
- Finance / equity research (Olivepress, similar).
- Real estate (rent / housing market analysis).
- Healthcare professional research.
- Legal research for non-lawyers.
- Government / civic research.
- Academic field-specific (papers + grants + funding).
The verticals that don't pay back
- General-knowledge research — that's Perplexity's home turf.
- Verticals where authoritative sources don't exist as APIs (you can't pin to them).
- Verticals where users don't pay for trust (B2C casual research).
The honest moat
Source curation is hard work that doesn't look like AI work. Vendor-of-vendor relationships, data licensing, structural awareness of what's authoritative in the vertical — that's the moat. The LLM is the easy part. Don't mistake the easy part for the product.