An r/crewai post documented an SDR agent that combined Google Dorks on Serper, pdfplumber, Llama-3, and an MCP cache. This tutorial swaps Serper for Scavio and walks through the same pattern with a typed JSON cache.
Prerequisites
- Python 3.10+
- CrewAI
- Scavio API key
Walkthrough
Step 1: Define the Scavio CrewAI Tool
Subclass CrewAI's BaseTool. Because BaseTool is a Pydantic model, the name and description fields need type annotations.
from crewai.tools import BaseTool
import os
import requests

class ScavioSearch(BaseTool):
    name: str = 'scavio_search'
    description: str = 'Multi-platform web search returning typed JSON. Use search_type="dorks" for Google Dorks.'

    def _run(self, query: str) -> dict:
        # POST the query to Scavio and return the parsed JSON payload.
        response = requests.post(
            'https://api.scavio.dev/api/v1/search',
            headers={'x-api-key': os.environ['SCAVIO_API_KEY']},
            json={'query': query},
            timeout=30,
        )
        response.raise_for_status()
        return response.json()
Step 2: Define the Google Dorks query pattern
Same dork strings as the original. A generator sketch for adding jurisdictions follows the list.
DORK_PATTERNS = [
    'site:gov.br filetype:pdf 2026 contratos',
    'site:gob.mx filetype:pdf 2026 licitaciones',
]
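If more jurisdictions are needed later, the dork strings factor cleanly into (domain, keyword) pairs. A small generator sketch; only the two pairs above come from the original post:

# Illustrative: build dork strings from (domain, keyword) pairs.
SOURCES = [('gov.br', 'contratos'), ('gob.mx', 'licitaciones')]
DORK_PATTERNS = [f'site:{domain} filetype:pdf 2026 {keyword}'
                 for domain, keyword in SOURCES]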
Step 3: Crontab driver
Same cron schedule as the original; only the API call behind it changes.
# crontab -e
0 6 * * * /usr/bin/python /path/to/dorks.py
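The cron line points at dorks.py, which the post doesn't reproduce. A minimal sketch, assuming DORK_PATTERNS (Step 2) and cached_search (Step 4) live in the same module; the 'results' key is an assumed response shape, not a documented Scavio schema:

# dorks.py -- illustrative driver for the cron entry above.
def main():
    for pattern in DORK_PATTERNS:
        payload = cached_search(pattern)
        # Downstream steps (pdfplumber extraction, LLM summarization)
        # would consume `payload` here.
        print(pattern, '->', len(payload.get('results', [])), 'hits')

if __name__ == '__main__':
    main()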
Step 4: Cache layer in SQLite (typed JSON now)
Cache key = the query string; value = the JSON payload serialized to a string. (The original post keyed on (query, surface); this sketch keys on the query alone.)
import sqlite3, json, time

conn = sqlite3.connect('cache.db')
conn.execute('CREATE TABLE IF NOT EXISTS cache(key TEXT PRIMARY KEY, payload TEXT, ts REAL)')

def cached_search(q):
    # Serve from the cache when the query has been seen before.
    row = conn.execute('SELECT payload FROM cache WHERE key=?', (q,)).fetchone()
    if row:
        return json.loads(row[0])
    # Cache miss: query Scavio and store the payload with a timestamp.
    data = ScavioSearch()._run(q)
    conn.execute('INSERT OR REPLACE INTO cache VALUES (?, ?, ?)', (q, json.dumps(data), time.time()))
    conn.commit()
    return data
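The ts column makes expiry straightforward. A hedged sketch of a TTL variant; the 24-hour window and the cached_search_ttl name are assumptions, the original thread didn't state a TTL:

MAX_AGE = 24 * 3600  # assumed 24h TTL; tune to your crawl cadence

def cached_search_ttl(q):
    row = conn.execute('SELECT payload, ts FROM cache WHERE key=?', (q,)).fetchone()
    # Only serve cached payloads younger than MAX_AGE.
    if row and time.time() - row[1] < MAX_AGE:
        return json.loads(row[0])
    data = ScavioSearch()._run(q)
    conn.execute('INSERT OR REPLACE INTO cache VALUES (?, ?, ?)', (q, json.dumps(data), time.time()))
    conn.commit()
    return data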
Step 5: Plug into CrewAI agent
Same agent shape; the Scavio tool simply replaces the Serper tool.
from crewai import Agent

researcher = Agent(
    role='Government Bid Researcher',
    # goal and backstory are required by CrewAI's Agent; the wording here is illustrative.
    goal='Find newly published government bid PDFs via Google Dorks.',
    backstory='An SDR research agent that surfaces procurement documents for follow-up.',
    tools=[ScavioSearch()],
)
Python Example
# See steps above for the full pattern; an end-to-end sketch follows.
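Pulling the steps together, a minimal end-to-end sketch, reusing researcher from Step 5 and DORK_PATTERNS from Step 2; the Task strings are illustrative, though CrewAI does require both description and expected_output:

from crewai import Task, Crew

# Reuses `researcher` (Step 5) and DORK_PATTERNS (Step 2) from this module.
dork_task = Task(
    description='Run each Google Dork in DORK_PATTERNS via scavio_search '
                'and list the government bid PDF URLs found.',
    expected_output='A list of PDF URLs with their source domains.',
    agent=researcher,
)

crew = Crew(agents=[researcher], tasks=[dork_task])
print(crew.kickoff())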
JavaScript Example
// CrewAI is Python-first; the equivalent Mastra/JS pattern uses the Scavio HTTP API directly.
Expected Output
The SDR agent fetches government bid PDFs exactly as before, and because Scavio is multi-platform, Reddit thread surfacing comes in as a second source layer without adding another vendor.