LangChain Agents Forget Everything Between Sessions
How to add persistent memory to LangChain agents so they retain context between sessions and conversations.
You build a LangChain agent that has a great conversation with a user. It searches for products, compares prices, and makes a recommendation. The user comes back the next day, asks a follow-up question, and the agent has no idea what they are talking about. Every session starts from zero.
This is the default behavior in LangChain. Conversation memory is stored in a Python list that exists only for the duration of the process. Adding persistent memory requires explicit setup, and there are several approaches with different trade-offs.
Why the Default Memory Is Ephemeral
LangChain's ConversationBufferMemory stores messages in an in-memory list. When the process ends -- whether from a deployment, a crash, or a serverless function timeout -- the memory is gone. This is fine for single-session chatbots but breaks any workflow that spans multiple interactions.
```python
from langchain.memory import ConversationBufferMemory

# This memory dies when the process ends
memory = ConversationBufferMemory()
memory.save_context(
    {"input": "Find me running shoes under $100"},
    {"output": "I found 5 options on Amazon..."},
)
# Process restarts -> memory is empty
```

The framework does not persist memory automatically because there is no one-size-fits-all storage backend. A CLI tool might use a local file. A SaaS product might use PostgreSQL. A serverless agent might use Redis. LangChain leaves this choice to you.
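For the local-file case, the persistence layer can be this small. The sketch below uses only the standard library; `FileBackedHistory` is an illustrative stand-in, not a LangChain class (the framework ships a similar `FileChatMessageHistory` in `langchain_community` if you want the real integration):

```python
import json
import tempfile
from pathlib import Path


class FileBackedHistory:
    """Toy stand-in for a persistent chat history: survives process restarts."""

    def __init__(self, path):
        self.path = Path(path)

    def load(self):
        # Reload whatever a previous process wrote, or start empty
        if self.path.exists():
            return json.loads(self.path.read_text())
        return []

    def save_context(self, user_input, output):
        # Append the new turn and write the full history back to disk
        messages = self.load()
        messages.append({"input": user_input, "output": output})
        self.path.write_text(json.dumps(messages))


path = Path(tempfile.mkdtemp()) / "memory.json"
history = FileBackedHistory(path)
history.save_context(
    "Find me running shoes under $100",
    "I found 5 options on Amazon...",
)

# A "new process" can reload the conversation from disk
restored = FileBackedHistory(path).load()
```

The point is the shape, not the implementation: anything that writes turns somewhere durable and reloads them at startup solves the restart problem.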
Option 1: Database-Backed Chat History
The most straightforward approach is to store chat messages in a database and reload them at the start of each session. LangChain provides ChatMessageHistory implementations for several backends.
```python
from langchain_community.chat_message_histories import (
    PostgresChatMessageHistory,
)
from langchain.memory import ConversationBufferMemory

history = PostgresChatMessageHistory(
    connection_string="postgresql://localhost/agents",
    session_id="user-123-session-456",
)
memory = ConversationBufferMemory(
    chat_memory=history,
    return_messages=True,
)
```

This works but has a scaling problem: as conversations grow, you reload the entire history into the context window. After 50 turns, you are spending most of your tokens on old messages.
Option 2: Summary Memory with Persistence
Instead of replaying every message, use ConversationSummaryBufferMemory to keep recent messages verbatim and compress older ones into a running summary. Persist the summary instead of the raw messages.
```python
from langchain.memory import ConversationSummaryBufferMemory
from langchain_anthropic import ChatAnthropic

llm = ChatAnthropic(model="claude-sonnet-4-20250514")
memory = ConversationSummaryBufferMemory(
    llm=llm,
    max_token_limit=2000,
    return_messages=True,
)
# Old messages get summarized automatically
# Persist the summary to your database between sessions
```

Summary memory reduces token usage but loses detail. The agent remembers that the user was looking for running shoes, but might forget the specific brands they rejected.
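Persisting the summary is a small amount of glue code. A hedged sketch using SQLite: `save_summary` and `load_summary` are illustrative names, not LangChain APIs, and the assumption here is that you read the current summary off the memory object (ConversationSummaryBufferMemory exposes it as `moving_summary_buffer`) at session end and restore it at session start:

```python
import sqlite3


def save_summary(conn, session_id, summary):
    # Upsert so each session keeps exactly one latest summary
    conn.execute(
        "INSERT INTO summaries (session_id, summary) VALUES (?, ?) "
        "ON CONFLICT(session_id) DO UPDATE SET summary = excluded.summary",
        (session_id, summary),
    )
    conn.commit()


def load_summary(conn, session_id):
    row = conn.execute(
        "SELECT summary FROM summaries WHERE session_id = ?",
        (session_id,),
    ).fetchone()
    return row[0] if row else ""


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE summaries (session_id TEXT PRIMARY KEY, summary TEXT)"
)

# Session end:   save_summary(conn, sid, memory.moving_summary_buffer)
save_summary(conn, "user-123", "User wants running shoes under $100.")
# Session start: memory.moving_summary_buffer = load_summary(conn, sid)
restored = load_summary(conn, "user-123")
```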
Option 3: Vector Store for Long-Term Recall
For agents that need to recall specific facts from past sessions -- not just conversation flow -- a vector store provides semantic search over historical interactions.
```python
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

vectorstore = Chroma(
    collection_name="agent_memory",
    embedding_function=OpenAIEmbeddings(),
    persist_directory="./memory_db",
)

# At the start of each session, retrieve relevant past context
relevant_memories = vectorstore.similarity_search(
    query=user_message,
    k=5,
    filter={"user_id": "user-123"},
)
```

This approach scales well because you only retrieve memories relevant to the current query. The trade-off is added latency for the embedding lookup and the complexity of managing a vector database.
Practical Recommendations
For most production agents, a hybrid approach works best:
- Store the last N messages in a database for short-term continuity
- Summarize older conversations and store the summaries for medium-term context
- Use a vector store for long-term recall of specific facts and preferences
- Include a session_id and user_id on every stored record so you can scope retrieval correctly
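At session start, the three tiers get merged into a single context block for the model. A sketch of that assembly step; `assemble_context` and its inputs are hypothetical names, not LangChain APIs:

```python
def assemble_context(summary, recent_messages, retrieved_facts):
    """Combine the three memory tiers into one system-prompt block."""
    parts = []
    if summary:
        # Medium-term: the running summary of older conversations
        parts.append(f"Conversation summary:\n{summary}")
    if retrieved_facts:
        # Long-term: facts pulled from the vector store for this query
        facts = "\n".join(f"- {fact}" for fact in retrieved_facts)
        parts.append(f"Relevant facts from past sessions:\n{facts}")
    if recent_messages:
        # Short-term: the last N messages, verbatim
        turns = "\n".join(
            f"{m['role']}: {m['content']}" for m in recent_messages
        )
        parts.append(f"Recent messages:\n{turns}")
    return "\n\n".join(parts)


context = assemble_context(
    summary="User is shopping for running shoes under $100.",
    recent_messages=[{"role": "user", "content": "Any waterproof options?"}],
    retrieved_facts=["User rejected Nike last week."],
)
```

Keeping the verbatim messages last mirrors normal chat ordering, so the model sees compressed context first and the live conversation closest to the user's new message.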
The same principle applies to tool results. If your agent searched Scavio for product prices last week, storing those results lets the agent compare prices over time without re-querying. Persistent memory turns a stateless tool-caller into an assistant that genuinely learns from past interactions.
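Storing tool results follows the same pattern as storing messages. A minimal sketch with SQLite; the table layout, tool name, and price values are all illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tool_results ("
    "user_id TEXT, tool TEXT, query TEXT, result TEXT, ts TEXT)"
)

# Log each tool call with a timestamp so results can be compared later
rows = [
    ("user-123", "price_search", "running shoes", "$94.99", "2025-01-10"),
    ("user-123", "price_search", "running shoes", "$89.99", "2025-01-17"),
]
conn.executemany("INSERT INTO tool_results VALUES (?, ?, ?, ?, ?)", rows)

# A week later, the agent can see the price trend without re-querying
history = conn.execute(
    "SELECT ts, result FROM tool_results "
    "WHERE user_id = ? AND query = ? ORDER BY ts",
    ("user-123", "running shoes"),
).fetchall()
```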
Do Not Overcomplicate It
Start with database-backed chat history. It covers 80% of use cases and is the simplest to implement and debug. Add summary compression when your context window starts filling up. Add vector search only when users explicitly need cross-session recall of specific details. Most agents that "forget" just need a database table, not a sophisticated memory architecture.