Building an AI Job Search Agent End-to-End
ATS subdomains, plus Reddit hiring threads, plus full-JD extraction, plus LLM scoring. The pattern comes from r/hiringcafe's launch.
An r/hiringcafe post (re-)launched an AI Job Search Agent designed to match power-user filters on a job board. The OP set realistic expectations: this is for users who don't know boolean operators or how to leverage 72+ filters. The pattern works. Builders shipping similar agents need a data layer that covers more than one board.
Why a single board isn't enough
Greenhouse, Lever, and Ashby host the freshest jobs on company subdomains. Reddit hiring threads (r/cscareerquestions, r/jobs, niche subs like r/devops or r/datascience) surface unannounced openings 3-7 days before the boards do. LinkedIn surfaces titles but rarely the full description. A board like HiringCafe aggregates many sources but typically misses the Reddit thread layer entirely.
The agent shape
Take user inputs (skills, location, salary range, remote preference). Generate ATS-targeted SERP queries. Pull Reddit hiring threads in parallel. Extract full job descriptions for the top candidates. Score each against the resume with an LLM. The first two steps in code:
import os, requests

API_KEY = os.environ['SCAVIO_API_KEY']
H = {'x-api-key': API_KEY}
USER = {'skills': ['python', 'rust', 'mcp'], 'location': 'remote', 'min_salary': 150000}

def ats_jobs(user):
    # One site:-scoped SERP query per ATS domain; keep the top 20 hits each.
    skills = ' '.join(user['skills'])
    out = []
    for d in ['greenhouse.io', 'lever.co', 'ashbyhq.com']:
        r = requests.post('https://api.scavio.dev/api/v1/search', headers=H,
                          json={'query': f'site:{d} {skills} {user["location"]}'}).json()
        out += r.get('organic_results', [])[:20]
    return out

def reddit_jobs(user):
    # Reddit hiring threads surface roles days before they hit the boards.
    skills = ' '.join(user['skills'])
    return requests.post('https://api.scavio.dev/api/v1/reddit/search', headers=H,
                         json={'query': f'{skills} hiring 2026'}).json().get('posts', [])[:20]

The extract step
ATS pages return as clean markdown via the Scavio extract endpoint. The full JD goes into the LLM context. Most public job APIs return abridged metadata; the full description is what actually matches a candidate's resume.
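A minimal sketch of that step, assuming an extract endpoint at /api/v1/extract that returns a markdown field and a link key on each SERP hit (the post names the Scavio extract endpoint but not its schema, so those names are assumptions):

def extract_jd(url):
    # Endpoint path and response field are assumptions; see the note above.
    r = requests.post('https://api.scavio.dev/api/v1/extract', headers=H,
                      json={'url': url})
    return r.json().get('markdown', '')

# Full JDs for the top ATS hits, ready for the LLM context.
jds = [(hit['link'], extract_jd(hit['link'])) for hit in ats_jobs(USER)[:20]]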
The scoring step
Pass the resume plus the full JD to Claude Sonnet 4.6 with the prompt "Score 0-100 with a one-line reason for the fit gap." Sort candidates by score and show the user the top 30. The user spends 5 minutes reviewing instead of an hour scrolling boards.
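A sketch of the scoring loop with the Anthropic Python SDK. The model id below is a placeholder (the post targets Claude Sonnet 4.6; swap in the current Sonnet id), and RESUME is assumed to be the user's resume as plain text:

import re
import anthropic  # pip install anthropic; reads ANTHROPIC_API_KEY from the env

client = anthropic.Anthropic()
RESUME = open('resume.txt').read()  # the user's resume as plain text

def score(resume, jd):
    # One short completion per job; the prompt is the one from the post.
    msg = client.messages.create(
        model='claude-sonnet-4-5',  # placeholder id; the post targets Sonnet 4.6
        max_tokens=100,
        messages=[{'role': 'user', 'content':
            'Score 0-100 with a one-line reason for the fit gap.\n\n'
            f'RESUME:\n{resume}\n\nJOB DESCRIPTION:\n{jd}'}])
    text = msg.content[0].text
    m = re.match(r'\s*(\d{1,3})', text)  # parse the leading numeric score
    return int(m.group(1)) if m else 0, text

ranked = sorted(((score(RESUME, jd), url) for url, jd in jds), reverse=True)
top_30 = ranked[:30]  # what the user actually reviews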
Cost math at one user, daily
3 ATS queries + 1 Reddit query + ~20 extracts on top hits ≈ 25 Scavio credits/day, or $0.11/day at the Project tier, plus about $0.20-0.50/day of Claude API usage for the scoring step. Total: $10-20/mo per active user. Honest constraint: at 1,000 users the API cost is real, on the order of $10-20K/mo, but plausible for a product priced at $20-50/mo per user.
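The same arithmetic as a back-of-the-envelope model, using only the figures above (the +1 credit is slack to round the 3 + 1 + 20 calls up to ~25/day):

credits_per_day = 3 + 1 + 20 + 1           # queries + extracts + slack, ~25
scavio_per_day = 0.11                      # ~25 credits at the Project tier
claude_per_day = (0.20 + 0.50) / 2         # midpoint of the scoring estimate
per_user_month = 30 * (scavio_per_day + claude_per_day)  # about $13.80
at_1k_users = 1_000 * per_user_month                     # about $13,800/mo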
Why Reddit signals matter for job seekers
Founders post "we're hiring a senior backend" to r/SaaS or r/SideProject before the role hits the company's website. Recruiters share unposted roles in r/jobs threads. Engineering leads describe team needs in r/cscareerquestions comments. Surfacing those threads is a real differentiator that scrape-only or board-only agents miss.
The OP's honest constraint, repeated
An AI job search agent does not beat a power user with boolean operators. The agent beats the median user who doesn't know boolean operators exist. The right framing is "agent for non-power-users", not "agent that beats the boolean nerds."
Generalizing the pattern
Replace "jobs" with "real estate listings", "grant opportunities", "research papers", or any vertical where ATS-shaped pages live on company or institution subdomains. The agent shape is the same: SERP across target domains, extract for full content, LLM scoring against user context.