meetingsmemoryagents

Meeting Transcript to Agent Memory Pipeline

Extract action items and decisions from meeting transcripts, enrich with web context via search API, store as agent-accessible memory. Full pipeline.

8 min

Meeting data becomes useful agent memory only after splitting into structured buckets: facts mentioned, decisions made, action items with owners and dates, open questions, and stale context. Raw transcript search is useful but treating every sentence as durable truth breaks agents that act on the information.

The problem with raw transcripts

A 60-minute meeting transcript is 8,000-12,000 tokens. Feeding it directly into an agent's context window wastes capacity and includes irrelevant chatter, tentative statements, and contradicted points. The agent cannot distinguish "we decided to use PostgreSQL" from "maybe we should consider PostgreSQL."

Structured extraction pipeline

Python
def extract_meeting_memory(transcript: str) -> dict:
    """Extract structured memory from meeting transcript."""
    prompt = """Extract from this meeting transcript:
    1. FACTS: Verified statements (not opinions or maybes)
    2. DECISIONS: Things explicitly decided with consensus
    3. ACTION_ITEMS: Tasks with owner and deadline
    4. OPEN_QUESTIONS: Unresolved issues needing follow-up
    5. STALE_CONTEXT: Things mentioned as changing or uncertain

    Format as JSON. Only include items with high confidence.
    Mark anything tentative or debated as OPEN_QUESTIONS, not DECISIONS."""

    result = call_llm(prompt + "\n\n" + transcript)
    return json.loads(result)

# Example output:
# {
#   "facts": ["Q1 revenue was $2.3M", "Current team size is 12"],
#   "decisions": ["Switch to PostgreSQL by March",
#                  "Hire 2 more engineers"],
#   "action_items": [
#     {"task": "PostgreSQL migration plan", "owner": "Sarah",
#      "deadline": "2026-02-15"},
#   ],
#   "open_questions": ["Should we use RDS or self-host?"],
#   "stale_context": ["Budget may change after board meeting"]
# }

Making it queryable for agents

Python
import sqlite3, json

db = sqlite3.connect("meeting_memory.db")
db.execute("""CREATE TABLE IF NOT EXISTS memory (
    id INTEGER PRIMARY KEY, meeting_date TEXT,
    category TEXT, content TEXT, owner TEXT,
    deadline TEXT, confidence REAL)""")

def store_memory(meeting_date: str, extracted: dict):
    for category in ["facts", "decisions", "open_questions"]:
        for item in extracted.get(category, []):
            text = item if isinstance(item, str) else json.dumps(item)
            db.execute(
                "INSERT INTO memory (meeting_date, category, content, confidence) VALUES (?,?,?,?)",
                (meeting_date, category, text, 0.9))
    for item in extracted.get("action_items", []):
        db.execute(
            "INSERT INTO memory (meeting_date, category, content, owner, deadline, confidence) VALUES (?,?,?,?,?,?)",
            (meeting_date, "action_item", item["task"],
             item.get("owner"), item.get("deadline"), 0.95))
    db.commit()

Cross-model portability

A structured meeting memory database outlasts any single model's context window and survives model version upgrades. When the next Claude or GPT version drops and the agent stack needs re-onboarding, the meeting memory is already in a portable format that any model can query.

What not to store

  • Casual conversation and greetings (noise)
  • Tentative ideas presented but not agreed on (misleading as facts)
  • Context that will change by next meeting (set TTL on stale items)