
LangChain Agents Forget Everything Between Sessions

How to add persistent memory to LangChain agents so they retain context between sessions and conversations.

8 min read

You build a LangChain agent that has a great conversation with a user. It searches for products, compares prices, and makes a recommendation. The user comes back the next day, asks a follow-up question, and the agent has no idea what they are talking about. Every session starts from zero.

This is the default behavior in LangChain. Conversation memory is stored in a Python list that exists only for the duration of the process. Adding persistent memory requires explicit setup, and there are several approaches with different trade-offs.

Why the Default Memory Is Ephemeral

LangChain's ConversationBufferMemory stores messages in an in-memory list. When the process ends -- whether from a deployment, a crash, or a serverless function timeout -- the memory is gone. This is fine for single-session chatbots but breaks any workflow that spans multiple interactions.

Python
from langchain.memory import ConversationBufferMemory

# This memory dies when the process ends
memory = ConversationBufferMemory()
memory.save_context(
    {"input": "Find me running shoes under $100"},
    {"output": "I found 5 options on Amazon..."}
)
# Process restarts -> memory is empty

The framework does not persist memory automatically because there is no one-size-fits-all storage backend. A CLI tool might use a local file. A SaaS product might use PostgreSQL. A serverless agent might use Redis. LangChain leaves this choice to you.
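Whatever backend you pick, the pattern is the same: write messages out after each turn, read them back at session start. As a dependency-free illustration of that pattern (not LangChain's API -- the file path and message schema here are assumptions for the example), a CLI tool could persist its history to a JSON file:

```python
import json
from pathlib import Path

MEMORY_FILE = Path("chat_history.json")  # hypothetical location for this sketch

def save_messages(messages: list[dict]) -> None:
    """Write the full message list to disk after each turn."""
    MEMORY_FILE.write_text(json.dumps(messages))

def load_messages() -> list[dict]:
    """Reload past messages at session start; empty list on first run."""
    if MEMORY_FILE.exists():
        return json.loads(MEMORY_FILE.read_text())
    return []

# Simulate a restart: save in one "session", reload in the next
save_messages([{"role": "user", "content": "Find me running shoes under $100"}])
restored = load_messages()
```

The database-backed options below do exactly this, just with a backend that survives more than one machine.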

Option 1: Database-Backed Chat History

The most straightforward approach is to store chat messages in a database and reload them at the start of each session. LangChain provides ChatMessageHistory implementations for several backends.

Python
from langchain_community.chat_message_histories import (
    PostgresChatMessageHistory
)
from langchain.memory import ConversationBufferMemory

history = PostgresChatMessageHistory(
    connection_string="postgresql://localhost/agents",
    session_id="user-123-session-456"
)

memory = ConversationBufferMemory(
    chat_memory=history,
    return_messages=True
)

This works but has a scaling problem: as conversations grow, you reload the entire history into the context window. After 50 turns, you are spending most of your tokens on old messages.
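One common mitigation is to reload only the most recent turns instead of the whole history (LangChain's ConversationBufferWindowMemory applies the same idea in-process). A minimal sketch of the windowing step, assuming a simple role/content message format:

```python
def last_n_turns(messages: list[dict], n: int = 10) -> list[dict]:
    """Keep only the most recent n messages for the context window.

    A full turn is usually a user/assistant pair, so n=10 keeps
    roughly the last five exchanges.
    """
    return messages[-n:]

# 50 turns in the database, but only the tail goes into the prompt
history = [{"role": "user", "content": f"turn {i}"} for i in range(50)]
window = last_n_turns(history, n=10)
```

Windowing caps token cost but discards everything older than the cutoff, which is what the summary approach below addresses.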

Option 2: Summary Memory with Persistence

Instead of storing every message, use ConversationSummaryBufferMemory to keep recent messages verbatim while compressing older ones into a running summary. Persist the summary instead of the raw messages.

Python
from langchain.memory import ConversationSummaryBufferMemory
from langchain_anthropic import ChatAnthropic

llm = ChatAnthropic(model="claude-sonnet-4-20250514")

memory = ConversationSummaryBufferMemory(
    llm=llm,
    max_token_limit=2000,
    return_messages=True
)

# Old messages get summarized automatically
# Persist the summary to your database between sessions

Summary memory reduces token usage but loses detail. The agent remembers that the user was looking for running shoes, but might forget the specific brands they rejected.
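The persistence step itself can be one row per session. The sketch below stores the running summary alongside the most recent raw messages in SQLite; the table name and schema are assumptions for illustration, not anything LangChain defines:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # use a file path in production
conn.execute(
    "CREATE TABLE IF NOT EXISTS agent_memory ("
    "session_id TEXT PRIMARY KEY, summary TEXT, recent_messages TEXT)"
)

def persist(session_id: str, summary: str, recent: list[dict]) -> None:
    """Upsert the compressed conversation state for a session."""
    conn.execute(
        "INSERT OR REPLACE INTO agent_memory VALUES (?, ?, ?)",
        (session_id, summary, json.dumps(recent)),
    )
    conn.commit()

def restore(session_id: str) -> tuple[str, list[dict]]:
    """Load the summary and recent messages at session start."""
    row = conn.execute(
        "SELECT summary, recent_messages FROM agent_memory "
        "WHERE session_id = ?",
        (session_id,),
    ).fetchone()
    return (row[0], json.loads(row[1])) if row else ("", [])

persist("user-123", "User wants running shoes under $100.", [])
summary, recent = restore("user-123")
```

On the next session you seed the memory object with the restored summary instead of replaying the full transcript.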

Option 3: Vector Store for Long-Term Recall

For agents that need to recall specific facts from past sessions -- not just conversation flow -- a vector store provides semantic search over historical interactions.

Python
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

vectorstore = Chroma(
    collection_name="agent_memory",
    embedding_function=OpenAIEmbeddings(),
    persist_directory="./memory_db"
)

# At the start of each session, retrieve relevant past context
relevant_memories = vectorstore.similarity_search(
    query=user_message,
    k=5,
    filter={"user_id": "user-123"}
)

This approach scales well because you only retrieve memories relevant to the current query. The trade-off is added latency for the embedding lookup and the complexity of managing a vector database.
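Writing memories with user-scoped metadata matters as much as reading them back. As a dependency-free illustration of the retrieval idea (not Chroma itself -- a real vector store replaces word overlap with embedding similarity), this sketch ranks stored facts by overlap with the query, filtered by user_id:

```python
def recall(memories: list[dict], query: str, user_id: str,
           k: int = 5) -> list[str]:
    """Return up to k stored facts that best match the query for one user."""
    query_words = set(query.lower().split())
    scored = [
        (len(query_words & set(m["text"].lower().split())), m["text"])
        for m in memories
        if m["user_id"] == user_id  # scope retrieval to this user
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for score, text in scored[:k] if score > 0]

memories = [
    {"user_id": "user-123", "text": "User rejected Nike running shoes"},
    {"user_id": "user-123", "text": "User prefers trail running shoes"},
    {"user_id": "user-999", "text": "User asked about hiking boots"},
]
hits = recall(memories, "which running shoes did I look at", "user-123")
```

The user_id filter is the piece people forget: without it, semantic search happily surfaces another user's preferences.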

Practical Recommendations

For most production agents, a hybrid approach works best:

  • Store the last N messages in a database for short-term continuity
  • Summarize older conversations and store the summaries for medium-term context
  • Use a vector store for long-term recall of specific facts and preferences
  • Include a session_id and user_id on every stored record so you can scope retrieval correctly
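Assembled at session start, the hybrid context might look like the following; the prompt layout and helper name are assumptions for illustration:

```python
def build_context(summary: str, recent: list[str], facts: list[str]) -> str:
    """Combine the three memory tiers into one system-prompt preamble."""
    parts = []
    if summary:
        parts.append("Conversation summary:\n" + summary)
    if facts:
        parts.append("Relevant past facts:\n"
                     + "\n".join(f"- {f}" for f in facts))
    if recent:
        parts.append("Recent messages:\n" + "\n".join(recent))
    return "\n\n".join(parts)

context = build_context(
    summary="User is shopping for running shoes under $100.",
    recent=["user: any updates on those prices?"],
    facts=["User rejected Nike last week"],
)
```

Each tier degrades gracefully: a brand-new user simply gets an empty preamble, and a missing vector store drops only the long-term facts.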

The same principle applies to tool results. If your agent searched Scavio for product prices last week, storing those results lets the agent compare prices over time without re-querying. Persistent memory turns a stateless tool-caller into an assistant that genuinely learns from past interactions.

Do Not Overcomplicate It

Start with database-backed chat history. It covers 80% of use cases and is the simplest to implement and debug. Add summary compression when your context window starts filling up. Add vector search only when users explicitly need cross-session recall of specific details. Most agents that "forget" just need a database table, not a sophisticated memory architecture.