RAG Banking Chatbot: The Scavio + Firecrawl-Alternative Stack
Build a production-grade RAG banking chatbot with PII masking, citation trails, and Scavio as the public-source layer. Why Firecrawl is the wrong default.
A recent r/LangChain thread asked how to build a production-grade RAG chatbot for a bank. The obvious answer is Firecrawl plus an LLM. The non-obvious answer is that banks are regulated, every citation needs provenance, and PII must never leave the building. This post covers the production architecture, not the demo.
What Makes Banking RAG Different
Consumer RAG chatbots optimize for answer quality. Banking RAG optimizes for three constraints that override answer quality:
- PII masking before any data leaves the bank's infrastructure. Customer names, account numbers, SSNs, and balances must be stripped or tokenized before hitting any cloud LLM.
- Source provenance on every answer. A regulator asking "why did the chatbot say this?" needs a citation trail: which document, which paragraph, which date.
- Refusal on low confidence. A bank chatbot that guesses wrong about an interest calculation is a compliance incident. Silence is better than a confident hallucination.
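To make the "stripped or tokenized" option in the first constraint concrete, here is a minimal sketch using only the Python standard library. The regex patterns, the token format, and the `tokenize_pii` helper are illustrative assumptions, not part of Presidio or any bank's standard; a real deployment would tune the patterns and keep the key on-prem.

```python
import hashlib
import hmac
import re

# Hypothetical patterns: US SSNs, and 8-17 digit runs treated as account numbers.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
ACCOUNT_RE = re.compile(r"\b\d{8,17}\b")


def tokenize_pii(text: str, secret: bytes) -> str:
    """Replace SSNs and account-like digit runs with keyed, deterministic tokens."""

    def token(kind: str, value: str) -> str:
        # HMAC keeps tokens stable for the same value without exposing the value.
        digest = hmac.new(secret, value.encode(), hashlib.sha256).hexdigest()[:12]
        return f"<{kind}_{digest}>"

    text = SSN_RE.sub(lambda m: token("SSN", m.group()), text)
    return ACCOUNT_RE.sub(lambda m: token("ACCT", m.group()), text)


if __name__ == "__main__":
    print(tokenize_pii("Account 123456789012, SSN 123-45-6789", b"on-prem-key"))
```

Because the tokens are deterministic for a given key, the same account maps to the same token across documents, so retrieval still groups related records without the raw value ever reaching a cloud LLM.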
Why Firecrawl Is the Wrong Default
Firecrawl converts webpages to markdown. It does this well. But a banking RAG chatbot rarely crawls the public internet. The corpus is mostly internal policy documents, regulatory filings from the FDIC and OCC, and occasionally public rate schedules from competitor sites. The crawl layer is maybe 10% of the work. The other 90% is PII handling, citation tracking, and refusal logic.
Worse, at scale Firecrawl's per-page cost climbs faster than usage. A bank indexing 50,000 policy pages monthly on Firecrawl's Growth tier lands around $400/mo just for markdown conversion, with none of the banking-specific controls built in.
The Reference Architecture
Split the pipeline into three clean layers: ingestion, retrieval, and compose. Each layer has its own security boundary.
- Ingestion. Internal policy docs flow through an on-prem Presidio scrubber. Public sources (rate schedules, regulator filings) come through Scavio for typed JSON with source URLs preserved.
- Retrieval. Vector store (pgvector or Weaviate) with metadata columns for tenant, source type, date, and jurisdiction. Filter queries by tenant before semantic search.
- Compose. LLM receives a strict system prompt, cited context only, and returns a structured response with inline citations.
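The retrieval layer's "filter first, then search" ordering, combined with refusal on weak matches, can be sketched in memory as below. This is a stand-in for a pgvector or Weaviate query, not their API; the `Chunk` dataclass, the `retrieve` function, and the 0.75 threshold are assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    embedding: list[float]
    tenant: str
    source_url: str


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, chunks, tenant, k=3, min_score=0.75):
    """Filter by tenant first, then rank by similarity.

    An empty result means nothing cleared the threshold and the caller should refuse.
    """
    candidates = [c for c in chunks if c.tenant == tenant]
    scored = sorted(
        ((cosine(query_vec, c.embedding), c) for c in candidates),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return [(score, c) for score, c in scored[:k] if score >= min_score]
```

Filtering before ranking means a chunk from another tenant can never appear in the context, no matter how similar its embedding is to the query.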
Ingestion with PII Masking
Presidio detects PII with configurable patterns. Run it before any document touches the cloud.
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine

analyzer = AnalyzerEngine()
anonymizer = AnonymizerEngine()

def scrub(text: str) -> str:
    # Detect the listed entity types, then replace each match with a placeholder.
    # Account numbers and balances are not covered by this list; extend it or
    # add custom recognizers for them.
    results = analyzer.analyze(text=text, language='en',
        entities=['US_SSN', 'PHONE_NUMBER', 'EMAIL_ADDRESS', 'PERSON'])
    return anonymizer.anonymize(text=text, analyzer_results=results).text
Public Sources Through Scavio
When the chatbot needs FDIC guidance or a competitor's posted rate, Scavio returns typed JSON with source URLs that persist into the vector store metadata. The citation trail is built in.
import os
from datetime import datetime, timezone

import requests

API_KEY = os.environ['SCAVIO_API_KEY']

def fetch_public_source(query: str):
    r = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': API_KEY},
        json={'query': query, 'platform': 'google', 'num_results': 5},
        timeout=30)
    r.raise_for_status()  # fail loudly instead of indexing an error payload
    # One UTC timestamp per request, stored with every result for the citation trail.
    fetched_at = datetime.now(timezone.utc).isoformat()
    return [{
        'title': x['title'],
        'url': x['link'],
        'snippet': x['snippet'],
        'fetched_at': fetched_at
    } for x in r.json().get('organic_results', [])]
Compose with Guarded Prompt
The system prompt is where refusal behavior lives. No exceptions, no clever workarounds. A bank chatbot that refuses is boring. A bank chatbot that confidently invents a KYC rule is a regulator visit.
SYSTEM PROMPT:
You are a banking assistant. Rules:
1. Cite every factual claim with [source N].
2. If the retrieved context does not contain the answer, respond:
   "I cannot answer with confidence. Please contact support."
3. Never repeat customer account numbers, SSNs, or balances in your answer.
4. If the question asks for an action (transfer, close account), respond:
   "I can explain policy but cannot perform account actions."
What the Regulator Actually Asks For
When the regulator shows up, they want a log of every question asked, the sources retrieved, the model version, the response, and any refusal triggers. Build this log from day one. A structured Postgres table with session_id, question_scrubbed, sources_json, model_version, response, and refusal_reason is the minimum viable audit trail.
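A minimal sketch of that audit table follows. It uses the standard library's `sqlite3` as a stand-in for Postgres so it runs anywhere. The `chat_audit` table name, the `log_turn` helper, and the `asked_at` column are assumptions added for illustration; `model_version` is included because the model version is among the things the regulator asks for.

```python
import json
import sqlite3
from datetime import datetime, timezone

# Column names follow the audit fields named in the text; asked_at is an added assumption.
SCHEMA = """
CREATE TABLE IF NOT EXISTS chat_audit (
    id INTEGER PRIMARY KEY,
    session_id TEXT NOT NULL,
    asked_at TEXT NOT NULL,
    question_scrubbed TEXT NOT NULL,
    sources_json TEXT NOT NULL,
    model_version TEXT NOT NULL,
    response TEXT NOT NULL,
    refusal_reason TEXT
)
"""


def log_turn(conn, session_id, question_scrubbed, sources, model_version,
             response, refusal_reason=None):
    """Append one chat turn; refusal_reason stays NULL when the bot answered."""
    conn.execute(
        "INSERT INTO chat_audit (session_id, asked_at, question_scrubbed, "
        "sources_json, model_version, response, refusal_reason) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (session_id, datetime.now(timezone.utc).isoformat(), question_scrubbed,
         json.dumps(sources), model_version, response, refusal_reason),
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    log_turn(conn, "s1", "What is the overdraft fee?", [], "model-x",
             "I cannot answer with confidence. Please contact support.",
             refusal_reason="no_context")
```

Only the scrubbed question is stored, so the audit log itself does not become a second copy of customer PII.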
Why This Is Cheaper Than Firecrawl Plus an LLM
Scavio at $30/mo for 7,000 credits covers a bank's public source enrichment for months. Firecrawl at $399/mo Growth is the wrong shape: the bank does not need markdown conversion, it needs typed citations. The internal corpus never touches either vendor because Presidio runs on-prem. Net cloud spend drops while the compliance posture improves.
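The arithmetic behind that comparison, using only the figures quoted in this post, can be worked through as below. Treat it as a rough sketch: a Firecrawl page and a Scavio credit are not the same unit of work, the 50,000-page workload is the example from earlier in the post, and vendor pricing may have changed.

```python
# Figures quoted in this post; verify current vendor pricing before relying on them.
FIRECRAWL_GROWTH_USD = 399
FIRECRAWL_PAGES_PER_MONTH = 50_000  # example workload from earlier in the post
SCAVIO_USD = 30
SCAVIO_CREDITS = 7_000

firecrawl_per_page = FIRECRAWL_GROWTH_USD / FIRECRAWL_PAGES_PER_MONTH
scavio_per_credit = SCAVIO_USD / SCAVIO_CREDITS
monthly_difference = FIRECRAWL_GROWTH_USD - SCAVIO_USD  # 369
annual_difference = monthly_difference * 12  # 4428

print(f"Firecrawl: ${firecrawl_per_page:.5f} per page")
print(f"Scavio: ${scavio_per_credit:.5f} per credit")
print(f"Difference: ${monthly_difference}/mo, ${annual_difference}/yr")
```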
The full walkthrough with working code is in the how-to-build-rag-chatbot-for-regulated-industries tutorial.