mcp · claude · memory

Building a Memory Server for Claude with MCP

Building persistent memory for Claude using MCP memory servers -- patterns, pitfalls, and practical implementation.


Claude does not remember past conversations. Every session starts from zero -- no knowledge of your preferences, past decisions, or project context. MCP memory servers aim to solve this by giving Claude persistent storage that survives between sessions. The idea is simple: store context in a key-value store, let Claude read and write to it through MCP tools, and retrieve relevant context at the start of each new session.

How MCP Memory Servers Work

A memory MCP server exposes tools for storing and retrieving information. At minimum, it provides store, retrieve, and search operations. The storage backend can be a local JSON file, SQLite database, Redis instance, or a vector database for semantic search.

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({ name: "memory-server", version: "1.0.0" });

// Simplest possible backend: an in-memory map behind an async facade.
// Swap this for SQLite, Redis, or a vector store without touching the tools.
type Entry = { value: string; timestamp: number };
const memories = new Map<string, Entry>();
const db = {
  async set(key: string, entry: Entry) { memories.set(key, entry); },
  async get(key: string) { return memories.get(key); },
};

server.tool(
  "remember",
  "Store a piece of information for later recall",
  {
    key: z.string().describe("A descriptive key for this memory"),
    value: z.string().describe("The information to remember"),
  },
  async ({ key, value }) => {
    await db.set(key, { value, timestamp: Date.now() });
    return { content: [{ type: "text", text: "Stored." }] };
  }
);

server.tool(
  "recall",
  "Retrieve a previously stored memory by key",
  { key: z.string().describe("The key to look up") },
  async ({ key }) => {
    const entry = await db.get(key);
    if (!entry) return { content: [{ type: "text", text: "Not found." }] };
    return { content: [{ type: "text", text: entry.value }] };
  }
);

// Serve over stdio so a local MCP client can connect.
await server.connect(new StdioServerTransport());
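The tools above cover store and retrieve; a search operation completes the minimum surface. One possible shape for the matching logic, sketched here as a naive substring scan over a hypothetical in-memory store (a real backend would push this down into SQL or a vector query):

```typescript
// Hypothetical store and search helper; a server would register this
// as a "search" tool alongside remember/recall.
type MemoryEntry = { value: string; timestamp: number };

const memories = new Map<string, MemoryEntry>();

function searchMemories(query: string): { key: string; value: string }[] {
  const q = query.toLowerCase();
  const hits: { key: string; value: string; timestamp: number }[] = [];
  for (const [key, entry] of memories) {
    if (key.toLowerCase().includes(q) || entry.value.toLowerCase().includes(q)) {
      hits.push({ key, value: entry.value, timestamp: entry.timestamp });
    }
  }
  // Newest first, so recent context surfaces ahead of stale entries.
  hits.sort((a, b) => b.timestamp - a.timestamp);
  return hits.map(({ key, value }) => ({ key, value }));
}
```

Substring matching is deliberately crude; the point is that search returns ranked candidates rather than requiring an exact key.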

Storage Backend Options

The choice of storage backend determines the memory server's capabilities and complexity:

  • JSON file: Simplest option. Works for small amounts of structured data. No dependencies. Breaks down with large or concurrent workloads.
  • SQLite: Good balance of simplicity and capability. Handles structured queries, supports full-text search with FTS5. No external service needed.
  • Redis: Fast, supports TTL for automatic expiration, works well for session-scoped memory. Requires running a Redis instance.
  • Vector database: Enables semantic search over memories. Claude can find relevant context even when it does not know the exact key. Adds embedding generation as a dependency.

The Recall Problem

Storing memories is easy. Recalling the right ones is hard. Claude needs to decide which memories to retrieve before it knows what the conversation will be about. Three approaches exist: explicit recall (Claude asks for specific keys), auto-inject (recent memories are added to the system prompt automatically), and semantic search (Claude describes what it needs and the server returns similar memories). Each trades off token cost against recall accuracy.
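The auto-inject approach can be sketched as a helper that selects the most recent memories that fit a rough token budget before the session starts. Everything here is hypothetical, including the crude four-characters-per-token estimate:

```typescript
// Hypothetical auto-inject helper: build a context block from the most
// recent memories, stopping when a rough token budget is exhausted.
type Memory = { key: string; value: string; timestamp: number };

function buildContextBlock(memories: Memory[], tokenBudget: number): string {
  const sorted = [...memories].sort((a, b) => b.timestamp - a.timestamp);
  const lines: string[] = [];
  let spent = 0;
  for (const m of sorted) {
    const line = `- ${m.key}: ${m.value}`;
    const cost = Math.ceil(line.length / 4); // crude ~4 chars/token estimate
    if (spent + cost > tokenBudget) break;   // budget exceeded: stop injecting
    lines.push(line);
    spent += cost;
  }
  return lines.join("\n");
}
```

Recency is the simplest selection signal; semantic similarity to the first user message is a natural upgrade, at the cost of an embedding call.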

Patterns That Work

Store decisions, not conversations. Memory servers work best when they store structured decisions -- "we chose PostgreSQL over MySQL because of JSONB support" -- rather than raw conversation transcripts.
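A structured decision might look like the following shape. The field names are illustrative, not a standard:

```typescript
// Hypothetical record for a stored decision: what was chosen, what it
// beat, and why -- far denser than a raw conversation transcript.
interface Decision {
  topic: string;          // e.g. "database"
  choice: string;         // e.g. "PostgreSQL"
  alternatives: string[]; // e.g. ["MySQL"]
  rationale: string;      // e.g. "JSONB support"
  decidedAt: string;      // ISO date, so staleness is visible
}

function formatDecision(d: Decision): string {
  return `Chose ${d.choice} over ${d.alternatives.join(", ")} for ${d.topic}: ${d.rationale}`;
}
```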

Use namespaces. Separate memories by project or context. A memory about your React project's naming convention should not appear when you are working on a Python CLI tool.
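One lightweight way to namespace, assuming a flat key-value backend, is a key prefix. A sketch:

```typescript
// Hypothetical namespacing via key prefixes: "react-app:naming" and
// "py-cli:naming" coexist without bleeding into each other's recalls.
type Entry = { value: string; timestamp: number };
const store = new Map<string, Entry>();

function remember(ns: string, key: string, value: string): void {
  store.set(`${ns}:${key}`, { value, timestamp: Date.now() });
}

function recallAll(ns: string): { key: string; value: string }[] {
  const prefix = `${ns}:`;
  const out: { key: string; value: string }[] = [];
  for (const [k, v] of store) {
    if (k.startsWith(prefix)) out.push({ key: k.slice(prefix.length), value: v.value });
  }
  return out;
}
```

With SQLite or Redis the same idea becomes a namespace column or a key pattern; the principle is that every read is scoped to the active project.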

Set expiration. Temporary workarounds and draft plans should expire. Use TTL or manual cleanup to prevent memory bloat.
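If the backend lacks native TTL (Redis has it; a JSON file or plain SQLite table does not), lazy expiry on read is a simple substitute. A sketch:

```typescript
// Sketch: entries carry an optional expiry and are purged when read.
type Entry = { value: string; expiresAt?: number };
const store = new Map<string, Entry>();

function rememberFor(key: string, value: string, ttlMs?: number): void {
  store.set(key, {
    value,
    expiresAt: ttlMs === undefined ? undefined : Date.now() + ttlMs,
  });
}

function recall(key: string): string | undefined {
  const entry = store.get(key);
  if (!entry) return undefined;
  if (entry.expiresAt !== undefined && entry.expiresAt <= Date.now()) {
    store.delete(key); // lazily expire on access
    return undefined;
  }
  return entry.value;
}
```

A periodic sweep can complement this so expired entries do not linger unread.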

Pitfalls to Avoid

Memory servers introduce failure modes that are easy to overlook. Stale context from reversed decisions leads Claude astray -- memory servers need update and delete operations, not just create and read. Injecting too many memories bloats the context window. Claude may treat stored memories as authoritative even when outdated, so add timestamps. And memory servers can store sensitive information, so secure the storage backend appropriately.
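The update, delete, and timestamp points can be sketched together. The helper names are hypothetical; the point is that reversed decisions overwrite rather than accumulate, and every recall exposes its age:

```typescript
// Sketch: update/forget operations plus visible timestamps, so stale
// memories can be retired instead of silently misleading the model.
type Entry = { value: string; updatedAt: number };
const store = new Map<string, Entry>();

function upsert(key: string, value: string): void {
  store.set(key, { value, updatedAt: Date.now() }); // overwrite, don't append
}

function forget(key: string): boolean {
  return store.delete(key);
}

function recallWithAge(key: string, now = Date.now()): string | undefined {
  const e = store.get(key);
  if (!e) return undefined;
  const days = Math.floor((now - e.updatedAt) / 86_400_000);
  // Surfacing age lets the model discount old memories instead of
  // treating them as current ground truth.
  return `${e.value} (last updated ${days} day(s) ago)`;
}
```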

Getting Started

Start with a SQLite-backed memory server and explicit recall. Add semantic search later if explicit recall is not surfacing the right context. Keep memories structured, namespaced, and time-stamped. Delete aggressively -- a small set of accurate memories is more valuable than a large set of stale ones. The MCP ecosystem is still young, so keep the architecture modular enough to swap the memory layer when better options emerge.