An r/hiringcafe thread shared feedback on a free AI job search agent built by Stanford-backed engineers. The pattern: aggregate listings from real employer career pages, surface salary data, and AI-summarize each role. Below are five APIs, ranked for builders attempting the same thing.
Scavio search, an extract layer, and a deduplication step make up the minimum stack to ship a HiringCafe-shaped aggregator. The hard part isn't the data; it's the relevance ranking.
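The deduplication step is the easiest piece to sketch. The version below is a minimal illustration, not HiringCafe's or Scavio's method: it assumes each listing has already been parsed into a hypothetical `Listing` record, and it treats two listings as duplicates when company, title, and location match after normalization.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    # Hypothetical record shape; the field names are assumptions for this sketch.
    company: str
    title: str
    location: str
    url: str


def _norm(text: str) -> str:
    """Lowercase and turn every run of non-alphanumerics into a single space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def dedup_key(listing: Listing) -> tuple[str, str, str]:
    """Key that ignores case, punctuation, and spacing differences."""
    return (_norm(listing.company), _norm(listing.title), _norm(listing.location))


def deduplicate(listings: list[Listing]) -> list[Listing]:
    """Keep the first listing seen for each key, preserving input order."""
    seen = set()
    unique = []
    for listing in listings:
        key = dedup_key(listing)
        if key not in seen:
            seen.add(key)
            unique.append(listing)
    return unique
```

Exact-key matching misses near-duplicates such as "Sr. Engineer" versus "Senior Engineer", so a real aggregator would likely need fuzzier matching on top of this.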
Full Ranking
Scavio (search + extract)
Discovering and extracting career-page postings
- Search + extract under one key
- Cheap per-extract
- Not a job-board aggregator product
Adzuna API
Pre-aggregated job listings as input
- Already-cleaned listings
- Limited to Adzuna's index
Indeed Publisher API
Pulling Indeed listings
- Indeed's index
- Hard to get access in 2026
LinkedIn job-search scraping (gray area)
Quick prototypes only
- Most listings
- LinkedIn TOS violations + lawsuits
Greenhouse / Lever / Workday board APIs (per-employer)
Direct ATS feeds for specific employers
- Fresh, accurate
- Per-employer setup
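The per-employer setup cost is mostly about mapping each ATS's payload onto one common row shape. The sketch below assumes the JSON has already been fetched and parsed. The field names (`jobs`, `absolute_url`, `text`, `categories`, `hostedUrl`) reflect the public Greenhouse and Lever job-board payloads as commonly documented, but treat them as assumptions and check the current vendor docs before relying on them. Workday is left out because its feeds vary by employer.

```python
from typing import Any


def from_greenhouse(payload: dict[str, Any], company: str) -> list[dict[str, str]]:
    """Normalize a Greenhouse-style payload, assumed shape: {"jobs": [{...}]}."""
    rows = []
    for job in payload.get("jobs", []):
        rows.append({
            "company": company,
            "title": job.get("title", ""),
            # Location is assumed to be a nested object with a "name" field.
            "location": (job.get("location") or {}).get("name", ""),
            "url": job.get("absolute_url", ""),
            "source": "greenhouse",
        })
    return rows


def from_lever(payload: list[dict[str, Any]], company: str) -> list[dict[str, str]]:
    """Normalize a Lever-style payload, assumed shape: a JSON list of postings."""
    rows = []
    for post in payload:
        rows.append({
            "company": company,
            "title": post.get("text", ""),
            # Location is assumed to live under the "categories" object.
            "location": (post.get("categories") or {}).get("location", ""),
            "url": post.get("hostedUrl", ""),
            "source": "lever",
        })
    return rows
```

Keeping one small adapter per ATS means the dedup and ranking steps only ever see the common row shape.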
Side-by-Side Comparison
| Criteria | Scavio | Runner-up | 3rd Place |
|---|---|---|---|
| Per-listing cost | $0.0043 per extract | Free up to limit | Free + lawsuit risk |
| Multi-source aggregation | Yes (search-first) | Single source | Single source |
| Salary extraction (LLM step) | DIY via extract | Built in | Per-source |
| Production-safe | Yes | Yes | No |
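For the DIY salary row, a cheap regex pass over the extracted markdown can catch explicit ranges before any LLM call is spent. This is a rough baseline under stated assumptions: it only handles USD ranges written like "$120,000 - $150,000" or "$120k-$150k", and `extract_salary_range` is a hypothetical helper name, not part of any vendor API.

```python
import re
from typing import Optional

# A dollar amount: "120,000" or "120", with an optional "k" suffix.
_NUM = r"(\d{1,3}(?:,\d{3})+|\d+)\s*(k)?"
# Two amounts joined by a hyphen, an en dash, or the word "to".
_SALARY_RANGE = re.compile(
    r"\$\s*" + _NUM + r"\s*(?:-|\u2013|to)\s*\$\s*" + _NUM,
    re.IGNORECASE,
)


def _to_dollars(number: str, k_suffix: Optional[str]) -> int:
    """Convert "120,000" or ("120", "k") to whole dollars."""
    value = int(number.replace(",", ""))
    return value * 1000 if k_suffix else value


def extract_salary_range(markdown: str) -> Optional[tuple[int, int]]:
    """Return (low, high) for the first USD range found, else None."""
    match = _SALARY_RANGE.search(markdown)
    if match is None:
        return None
    low = _to_dollars(match.group(1), match.group(2))
    high = _to_dollars(match.group(3), match.group(4))
    # Reject inverted ranges rather than guess which bound is wrong.
    return (low, high) if low <= high else None
```

Listings where this returns `None` are the ones worth sending to the LLM step, since many postings state pay in prose or not at all.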
Why Scavio Wins
- HiringCafe's pattern, per the r/hiringcafe thread and verified against the product page: pull listings from real employer career pages, AI-summarize each one, and surface salary upfront. The data layer is search plus extract; the value is in the ranking and dedup.
- Scavio replaces both the discovery step (Google site:company.com/careers) and the extract step (turn the careers page into clean markdown) under one credit pool. For a builder trying a HiringCafe clone, the per-listing data cost is ~$0.009, which lines up with two calls per listing (one search, one extract) if search is billed at the same $0.0043 rate as an extract.
- Honest tradeoff: pre-aggregated APIs (Adzuna) cut the discovery work but lock the aggregator to that vendor's index. Scavio is the right call when the goal is broader coverage; Adzuna is the right call when the goal is fastest-to-MVP.
- LinkedIn scraping is legally risky: multiple cases have gone against scrapers in 2024-2025. Scavio doesn't help with LinkedIn (private profile data); a HiringCafe-style aggregator should stay on public career pages.
- Why the hard part is ranking: surfacing 1,000 jobs is easy; surfacing the 5 jobs the user actually wants is the product. HiringCafe's edge is filtering and AI summarization quality, not the raw data layer. Scavio gives the data layer; the builder owns the ranking.