Beginner AI Agent Stack: What You Actually Need
Four components for a beginner agent: LLM, search API, function calling, and SQLite. Total cost under $30/month.
A beginner AI agent stack in 2026 needs four components: an LLM (Claude or GPT-4o), a web search API for grounding, a framework for orchestration (LangGraph or raw function calling), and a vector store for memory. Total cost for a hobby project: under $30/month. You do not need Kubernetes, a multi-agent framework, or a dedicated MLOps platform to start.
The four layers
- LLM: Claude Sonnet or GPT-4o-mini for cost-effective reasoning
- Search: any SERP API for real-time web data
- Orchestration: function calling (simplest) or LangGraph (if you need loops)
- Memory: SQLite + embeddings for persistence, or a hosted vector DB
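The memory layer can start very small. A hedged sketch of embedding-based recall using plain cosine similarity, with tiny hand-written vectors standing in for a real embedding API (the `recall` helper and the 3-dimensional vectors are illustrative only):

```python
import math

def cosine(a: list, b: list) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

def recall(query_vec: list, memories: list, top_k: int = 2) -> list:
    """Return the top_k memory texts most similar to the query vector."""
    ranked = sorted(memories, key=lambda m: cosine(query_vec, m["vec"]), reverse=True)
    return [m["text"] for m in ranked[:top_k]]

# Toy 3-dimensional "embeddings"; a real stack would call an embedding model
memories = [
    {"text": "User prefers Python", "vec": [0.9, 0.1, 0.0]},
    {"text": "User is building a chatbot", "vec": [0.1, 0.9, 0.2]},
    {"text": "User dislikes YAML", "vec": [0.0, 0.2, 0.9]},
]
print(recall([0.8, 0.2, 0.1], memories, top_k=1))  # → ['User prefers Python']
```

The same pattern scales to a SQLite table of stored vectors; a hosted vector DB only becomes worth it once brute-force scanning gets slow.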
A minimal single-tool agent
```python
import os

import requests
from anthropic import Anthropic

client = Anthropic()
SCAVIO_KEY = os.environ["SCAVIO_API_KEY"]

def web_search(query: str) -> str:
    """Query the search API and return a bulleted summary of results."""
    resp = requests.post(
        "https://api.scavio.dev/api/v1/search",
        headers={"x-api-key": SCAVIO_KEY},
        json={"query": query, "num_results": 5},
    )
    results = resp.json().get("organic_results", [])
    return "\n".join(f"- {r['title']}: {r['snippet']}" for r in results)

tools = [{
    "name": "web_search",
    "description": "Search the web for current information",
    "input_schema": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}]

def agent(question: str) -> str:
    messages = [{"role": "user", "content": question}]
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=tools,
        messages=messages,
    )
    # If the model requested the tool, run it and send the result back
    for block in response.content:
        if block.type == "tool_use" and block.name == "web_search":
            result = web_search(block.input["query"])
            messages.append({"role": "assistant", "content": response.content})
            messages.append({"role": "user", "content": [{
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": result,
            }]})
            final = client.messages.create(
                model="claude-sonnet-4-20250514", max_tokens=1024,
                tools=tools, messages=messages,
            )
            return final.content[0].text
    return response.content[0].text

print(agent("What are the cheapest SERP APIs in 2026?"))
```

When to upgrade from raw function calling
Raw function calling (as shown above) works when your agent has a single tool and a linear flow. Upgrade to LangGraph when you need: multi-step reasoning with loops, multiple tools that interact, persistent state across conversations, or conditional branching based on intermediate results.
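The jump from one-shot function calling to multi-step reasoning is mostly a loop. A minimal sketch of that loop structure, with a stubbed model callable standing in for a real LLM client (the tuple protocol, `stub_model`, and the tool dict are illustrative, not part of any SDK):

```python
def run_agent(model, user_msg: str, tools: dict, max_steps: int = 5) -> str:
    """Drive the model until it stops requesting tools or max_steps is hit.

    `model` is any callable that takes the message list and returns either
    ("tool", name, args) or ("final", text) -- a stand-in for a real LLM call.
    """
    messages = [{"role": "user", "content": user_msg}]
    for _ in range(max_steps):
        kind, *payload = model(messages)
        if kind == "final":
            return payload[0]
        name, args = payload
        result = tools[name](**args)  # execute the requested tool
        messages.append({"role": "tool", "name": name, "content": result})
    return "Stopped: step limit reached"

# Stub model: requests one search, then answers from the tool result
def stub_model(messages):
    if any(m["role"] == "tool" for m in messages):
        return ("final", f"Answer based on: {messages[-1]['content']}")
    return ("tool", "web_search", {"query": "cheapest SERP APIs"})

tools = {"web_search": lambda query: f"results for '{query}'"}
print(run_agent(stub_model, "Find cheap SERP APIs", tools))
```

Once this loop needs branching, shared state, or parallel tools, that is the point where a graph framework like LangGraph starts paying for its complexity.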
Starter cost breakdown
- Claude API: ~$3/million input tokens (Sonnet), ~$15/million output
- Scavio search: 250 free credits/mo, $30/mo for 7K credits
- Vector DB: Chroma (free, local) or Pinecone free tier (100K vectors)
- Hosting: Railway or Fly.io free tier for the agent server
- Total for 100 agent calls/day: ~$15-25/month
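The monthly figure can be sanity-checked with simple arithmetic from the Sonnet prices above, assuming roughly 1K input and 200 output tokens per call (the per-call token counts are an assumption, not a measurement):

```python
def monthly_llm_cost(calls_per_day: int, in_tokens: int, out_tokens: int,
                     in_price: float = 3.0, out_price: float = 15.0) -> float:
    """Estimated monthly LLM spend in dollars; prices are per million tokens."""
    calls = calls_per_day * 30
    cost = calls * (in_tokens * in_price + out_tokens * out_price) / 1_000_000
    return round(cost, 2)

# 100 calls/day at ~1K input + ~200 output tokens per call
print(monthly_llm_cost(100, 1_000, 200))  # → 18.0
```

At 18 dollars of LLM spend plus the search plan's free tier, the $15-25 total holds; longer prompts or chattier outputs push it up fast, which is why trimming context matters.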
Adding memory with SQLite
```python
import sqlite3

def init_memory(db_path="agent_memory.db"):
    """Create the memories table if needed and return a connection."""
    conn = sqlite3.connect(db_path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS memories (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            user_id TEXT,
            content TEXT,
            timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
        )
    """)
    conn.commit()
    return conn

def store_memory(conn, user_id: str, content: str):
    conn.execute("INSERT INTO memories (user_id, content) VALUES (?, ?)",
                 (user_id, content))
    conn.commit()

def recall_memories(conn, user_id: str, limit: int = 10) -> list:
    cursor = conn.execute(
        "SELECT content FROM memories WHERE user_id = ? ORDER BY timestamp DESC LIMIT ?",
        (user_id, limit))
    return [row[0] for row in cursor.fetchall()]

# Usage
db = init_memory()
store_memory(db, "user-1", "User prefers Python over JavaScript")
history = recall_memories(db, "user-1")
```

Common beginner mistakes
- Starting with a multi-agent framework before mastering a single agent
- Using expensive models for simple tasks (Haiku handles classification fine)
- Skipping search grounding (agents hallucinate without it)
- Over-engineering memory (SQLite beats a vector DB at small scale)
- Paying for managed services before validating the use case
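The "expensive models for simple tasks" mistake has a cheap fix: route each request through a classifier first and escalate only when needed. A sketch with stubs standing in for real API calls (the word-count heuristic and model names are illustrative only):

```python
def route(question: str, classify, cheap_model, strong_model) -> str:
    """Send simple questions to the cheap model, complex ones to the strong one."""
    label = classify(question)  # e.g. a Haiku call returning "simple" or "complex"
    model = cheap_model if label == "simple" else strong_model
    return model(question)

# Stubs standing in for real API calls
classify = lambda q: "simple" if len(q.split()) < 8 else "complex"
cheap = lambda q: f"[haiku] {q}"
strong = lambda q: f"[sonnet] {q}"

print(route("What time is it?", classify, cheap, strong))
print(route("Compare five SERP APIs on price, latency, and rate limits",
            classify, cheap, strong))
```

In practice the classifier itself is a Haiku-class call with a constrained output, so routing adds pennies while cutting the bill on every simple request.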
Key takeaway
Start with the simplest stack that works: one LLM, one search tool, raw function calling, and SQLite. Build something users actually use before optimizing the architecture. You can always add LangGraph, vector databases, and multi-agent orchestration later when the complexity is justified by real user demand.