@QingQ77: 30 runnable Jupyter notebooks that take LLM agent memory techniques from short-term to long-term, and from simple to production-grade. https://github.com/NirDiamant/Agent_Memory_Techniques The repo splits LLM agent memory into six areas: managing short-term context, storing long-term information, building cognitive architectures, choosing retrieval and routing, using off-the-shelf frameworks, and evaluating for production. Every area has runnable notebooks, from the most basic conversation buffer through MemGPT self-editing memory, Zep temporal knowledge graphs, and Graphiti episodic-to-semantic extraction, all with code. Anthropic's seven-layer memory definition and Mem0's managed memory layer are covered as well. There are also a decision tree and comparison tables to consult when you don't know which technique to use.
Summary
A GitHub repository of 30 runnable Jupyter notebooks that covers LLM agent memory techniques end to end, from short-term context to production-grade patterns, including MemGPT, Zep, and Graphiti, with a decision tree and comparison tables.
NirDiamant/Agent_Memory_Techniques
Source: https://github.com/NirDiamant/Agent_Memory_Techniques
🧠 Agent Memory Techniques
Learn every agent memory technique for LLM agents.
⭐ If you find this useful, please star the repo so more learners can discover it.
🧭 New here? Start with 01 Conversation Buffer Memory or pick a Learning Path. Prefer a visual? See the Decision Tree below. 30 runnable Jupyter notebooks covering conversation buffers, vector stores, knowledge graphs, episodic and semantic memory, working memory, MemGPT, Mem0, Letta, Zep, Graphiti, LoCoMo benchmarks, and production memory patterns.
📖 The RAG Techniques Book is HERE
From the same author
#1 Best Seller on Amazon in Generative AI
Want to go deeper on RAG (Retrieval-Augmented Generation, the technique of giving a model extra documents so it can answer better)? The book is the long version. You’ll get the intuition behind every technique. You’ll get side-by-side comparisons that show when each one wins and when it quietly fails. You’ll get illustrations that make the tricky parts click.
⏳ Launch window only: $0.99
The price goes up once the launch window closes. Readers who grab it now lock in the lowest price it will ever have.
👉 Get the book on Amazon before the price changes
📫 Stay Updated
| 🚀 Weekly Updates | 💡 Expert Insights | 🎯 Top 0.1% Content |
Join over 50,000 readers getting clear AI tutorials every week. Subscribers also get early access and a 33% discount on my book.
💡 Why Agent Memory Matters
💡 Quick Answer (for search engines and skimmers)
Agent memory is the set of techniques that let an LLM-based agent (a system built around a Large Language Model) remember information across turns, sessions, and tasks. Without memory, an agent re-derives context every time and cannot personalize, learn, or maintain coherence over long interactions. This repository documents 30 distinct memory techniques, grouped into six families: short-term context management, long-term storage, cognitive architectures, retrieval and multi-agent patterns, batteries-included frameworks, and production deployment patterns.
Think about a friend who forgets every conversation you’ve ever had. Every morning you’re strangers again. That’s what most AI agents are like today.
Every AI agent eventually hits the same wall: it forgets.
In 2026, AI agents are everywhere. But most of them still forget what you told them yesterday. Without strong memory, an agent can’t keep context across conversations. It can’t learn from past chats. It can’t build a lasting relationship with you.
The landscape is shifting fast:
- Anthropic’s 7 Layers of Memory (March 2026): from conversation context to cross-project knowledge, defining the memory hierarchy for Claude Code
- Mem0: managed memory layer gaining rapid adoption for personalized AI
- Letta (MemGPT): self-editing memory with inner/outer monologue architecture
- Zep: temporal knowledge graphs for long-term agent memory
- Graphiti: episodic-to-semantic knowledge graph extraction
- MemOS & Memori: memory-as-infrastructure platforms for production agents
But there’s no single hands-on guide that teaches you how each technique works, when to use it, and how to build it yourself.
That’s why this repository exists. 30 techniques. Runnable notebooks. Real code you can use today.
🗺️ Taxonomy of Agent Memory Techniques
The 30 techniques fall into six families. Each family solves a different memory problem. Each technique lives in its own notebook.
| Family | What it solves | Techniques |
|---|---|---|
| Short-term | Keep recent turns in memory without filling up the context window. | 01 - 05 |
| Long-term | Save knowledge across sessions, users, and time. | 06 - 11 |
| Cognitive architectures | Working, hierarchical, and reflective memory systems. | 12 - 19 |
| Retrieval & routing | Choose what to recall and when. | 20 - 23 |
| Frameworks | Production-ready memory libraries (Mem0, Letta, Zep, Graphiti). | 24 - 27 |
| Evaluation & production | Measure, benchmark, and deploy memory. | 28 - 30 |
🧭 Which Technique Do I Need?
30 techniques grouped by what you are building. Pick the group that matches your goal, then open the technique inside it.
Quick text version:
- Need to manage the current chat? Start with 01-05 (short-term memory).
- Need to persist across sessions? Start with 06 Vector Store or 21 Cross-Session Memory.
- Building a cognitive architecture with multiple stores? See 12-19.
- Using a framework? Go straight to 24 Graphiti, 25 Mem0, 26 Letta, or 27 Zep.
- Evaluating or shipping to production? See 28-30.
Still not sure? Start with 01 Conversation Buffer. Almost every other technique builds on it.
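To make that starting point concrete, here is what a conversation buffer boils down to, as a minimal illustrative sketch (the class and method names are ours, not the repo's):

```python
class ConversationBufferMemory:
    """Technique 01: store every turn verbatim and replay it as context."""

    def __init__(self):
        self.turns = []  # chronological list of {"role": ..., "content": ...}

    def add_turn(self, role: str, content: str) -> None:
        self.turns.append({"role": role, "content": content})

    def as_messages(self) -> list[dict]:
        # The full, unmodified history goes back into the next LLM call.
        return list(self.turns)


memory = ConversationBufferMemory()
memory.add_turn("user", "My name is Ada.")
memory.add_turn("assistant", "Nice to meet you, Ada!")
print(memory.as_messages())
```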
📐 Compare Techniques at a Glance
Looking to filter by constraint (persistence, retrieval style, token cost, best-for use case)? See the side-by-side comparison matrix covering all 30 techniques in one table.
📚 All 30 Techniques
🔄 Short-Term Memory (Techniques 1-5)
Manage the conversation inside a single chat.
| # | Technique | Description | Notebook |
|---|---|---|---|
| 01 | Conversation Buffer Memory | Save the full conversation, word for word. The simplest pattern, and the base for everything else. | ✅ Notebook |
| 02 | Sliding Window Memory | Keep only the last few messages. You limit the size, but you keep the recent parts. | ✅ Notebook |
| 03 | Summary Memory | Replace old turns with a short summary written by the model. You lose length but keep the meaning. | ✅ Notebook |
| 04 | Summary Buffer Memory | Summarize older turns, but keep recent messages word for word. You get both. | ✅ Notebook |
| 05 | Token Buffer Memory | Trim the history to fit a strict token budget. Drop the oldest messages first. | ✅ Notebook |
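To show how Techniques 02 and 05 combine in practice, here is a minimal sketch (our own illustrative code, not the notebook's) that keeps a recency window under a strict token budget; the whitespace word count stands in for a real tokenizer:

```python
from collections import deque


class TokenBufferMemory:
    """Techniques 02/05 combined: keep only the most recent turns
    that fit a fixed token budget, dropping the oldest first."""

    def __init__(self, max_tokens: int = 200):
        self.max_tokens = max_tokens
        self.turns: deque = deque()

    @staticmethod
    def _count_tokens(text: str) -> int:
        # Crude whitespace proxy; the notebooks would use a real tokenizer.
        return len(text.split())

    def _total(self) -> int:
        return sum(self._count_tokens(t["content"]) for t in self.turns)

    def add_turn(self, role: str, content: str) -> None:
        self.turns.append({"role": role, "content": content})
        # Evict the oldest turns until the history fits the budget again.
        while self._total() > self.max_tokens and len(self.turns) > 1:
            self.turns.popleft()

    def as_messages(self) -> list[dict]:
        return list(self.turns)
```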
💾 Long-Term Memory (Techniques 6-11)
Storage that survives across sessions and users.
| # | Technique | Description | Notebook |
|---|---|---|---|
| 06 | Vector Store Memory | Turn past messages into vectors (number lists that capture meaning). Search them later by similarity. | ✅ Notebook |
| 07 | Entity Memory | Pull out and track facts about people, projects, and preferences. Update them as the conversation grows. | ✅ Notebook |
| 08 | Knowledge Graph Memory | Build a graph of how entities connect. Walk the graph to reason over what the agent has learned. | ✅ Notebook |
| 09 | Episodic Memory | Store complete interactions with when-and-where context. Good for “what happened when” questions. | ✅ Notebook |
| 10 | Semantic Memory | Pull general facts out of interactions. Store them on their own, away from the raw episodes. | ✅ Notebook |
| 11 | Procedural Memory | Capture “how-to” knowledge: the procedures and workflows the agent picks up over time. | ✅ Notebook |
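As a taste of Technique 06, here is a self-contained sketch; the character-frequency `embed` function is a toy stand-in for a real embedding model, and all names are illustrative:

```python
import math


def embed(text: str) -> list[float]:
    # Toy embedding: letter-frequency vector. A real implementation
    # would call an embedding model here instead.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class VectorStoreMemory:
    """Technique 06: store past messages as vectors, recall by similarity."""

    def __init__(self):
        self.items: list[tuple[list[float], str]] = []

    def save(self, text: str) -> None:
        self.items.append((embed(text), text))

    def search(self, query: str, k: int = 3) -> list[str]:
        qv = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(qv, it[0]), reverse=True)
        return [text for _, text in ranked[:k]]
```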
🧩 Cognitive Architectures (Techniques 12-19)
Patterns borrowed from how humans remember.
| # | Technique | Description | Notebook |
|---|---|---|---|
| 12 | Working Memory & Context Window | Manage the agent’s limited attention. Prioritize, pin, and evict context on the fly. | ✅ Notebook |
| 13 | Hierarchical Memory Layers | Tiered storage with hot, warm, and cold layers. Promote and demote items as they age. | ✅ Notebook |
| 14 | Memory Consolidation | Merge, deduplicate, and strengthen memories. Inspired by how the brain consolidates during sleep. | ✅ Notebook |
| 15 | Memory Compaction | Compress stored memories with summaries, entity extraction, or distillation. Save storage and tokens. | ✅ Notebook |
| 16 | Self-Reflection Memory | The agent looks back at its own actions. It writes notes on what worked, and uses them next time. | ✅ Notebook |
| 17 | Memory Routing | Pick the right memory store to read from or write to. Route by content type and intent. | ✅ Notebook |
| 18 | Temporal Memory | Attach timestamps to memories. Retrieve with time awareness and weight recent items higher. | ✅ Notebook |
| 19 | Forgetting & Decay | Forget on purpose. Use decay, access counts, or relevance to prune. | ✅ Notebook |
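To illustrate how Techniques 18 and 19 interact, here is a minimal sketch (names are ours, not the notebooks') that timestamps every memory, decays its weight exponentially with age, and prunes whatever has faded below a threshold:

```python
import time


class DecayingMemory:
    """Techniques 18/19: time-aware recall plus deliberate forgetting."""

    def __init__(self, half_life_s: float = 3600.0):
        self.half_life_s = half_life_s  # weight halves every hour by default
        self.items: list[tuple[float, str]] = []  # (timestamp, text)

    def save(self, text: str) -> None:
        self.items.append((time.time(), text))

    def _weight(self, ts: float) -> float:
        age = time.time() - ts
        return 0.5 ** (age / self.half_life_s)  # exponential decay

    def recall(self, k: int = 3) -> list[str]:
        # Recent memories outrank stale ones.
        ranked = sorted(self.items, key=lambda it: self._weight(it[0]), reverse=True)
        return [text for _, text in ranked[:k]]

    def prune(self, threshold: float = 0.01) -> None:
        # Forget on purpose: drop anything whose decayed weight is negligible.
        self.items = [it for it in self.items if self._weight(it[0]) >= threshold]
```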
🔍 Retrieval & Multi-Agent (Techniques 20-23)
How agents find and share memories.
| # | Technique | Description | Notebook |
|---|---|---|---|
| 20 | Memory Retrieval Patterns | Compare retrieval strategies: semantic search, recency, hybrid scoring, diversity, and re-ranking. | ✅ Notebook |
| 21 | Cross-Session Memory | Save and reload agent state across sessions. The user picks up where they left off. | ✅ Notebook |
| 22 | Multi-Agent Shared Memory | Shared stores, message passing, and agreement protocols for multi-agent teams. | ✅ Notebook |
| 23 | Memory with Tools | Give the agent memory tools it can call: save, search, forget. Treated like any other tool. | ✅ Notebook |
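To make Technique 23 concrete, here is a sketch using OpenAI-style function-calling schemas; the tool names, the dict store, and the dispatcher are illustrative, not taken from the notebook:

```python
# Technique 23 pattern: expose memory operations as tools the agent can call.
MEMORY_STORE: dict[str, str] = {}

MEMORY_TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "save_memory",
            "description": "Persist a fact about the user under a short key.",
            "parameters": {
                "type": "object",
                "properties": {
                    "key": {"type": "string"},
                    "fact": {"type": "string"},
                },
                "required": ["key", "fact"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "search_memory",
            "description": "Look up previously saved facts matching a query.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    },
]


def dispatch(name: str, args: dict) -> str:
    # The agent loop routes tool calls here, like any other tool.
    if name == "save_memory":
        MEMORY_STORE[args["key"]] = args["fact"]
        return "saved"
    if name == "search_memory":
        hits = {k: v for k, v in MEMORY_STORE.items()
                if args["query"].lower() in v.lower()}
        return str(hits) if hits else "no matches"
    return f"unknown tool: {name}"
```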
🔧 Frameworks & Platforms (Techniques 24-27)
Work with the leading memory frameworks, hands-on.
| # | Technique | Description | Notebook |
|---|---|---|---|
| 24 | Graph Memory with Graphiti | Use Zep’s Graphiti to build time-aware knowledge graphs from chat. Extract episodes and general facts. | ✅ Notebook |
| 25 | Mem0 Patterns | Use Mem0’s managed memory layer. It handles extracting, storing, and fetching user-specific memories. | ✅ Notebook |
| 26 | Letta (MemGPT) Patterns | Build MemGPT’s self-editing memory. Covers inner monologue, heartbeat events, and memory pressure. | ✅ Notebook |
| 27 | Zep Memory | Use Zep for dialog classification, entity extraction, and time-aware graphs. Built for production. | ✅ Notebook |
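The four notebooks in this family use each framework's real SDK. As a neutral sketch of the integration shape, here is a hypothetical common surface (`MemoryBackend`, `remember_turn`, and `build_context` are our names, not any SDK's) that a wrapper around Mem0, Letta, Zep, or Graphiti could implement:

```python
from typing import Protocol


class MemoryBackend(Protocol):
    """Hypothetical common surface; each notebook maps calls like
    these onto the real SDK of Mem0, Letta, Zep, or Graphiti."""

    def add(self, user_id: str, text: str) -> None: ...
    def search(self, user_id: str, query: str, k: int = 5) -> list[str]: ...


def remember_turn(backend: MemoryBackend, user_id: str, turn: str) -> None:
    backend.add(user_id, turn)


def build_context(backend: MemoryBackend, user_id: str, query: str) -> str:
    # Fetch the top memories and format them for the system prompt.
    hits = backend.search(user_id, query, k=3)
    return "Relevant memories:\n" + "\n".join(f"- {h}" for h in hits)
```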
📊 Evaluation & Production (Techniques 28-30)
Measure your memory. Then ship it.
| # | Technique | Description | Notebook |
|---|---|---|---|
| 28 | Memory Evaluation | Measure memory quality. Check retrieval precision and recall, staleness, contradictions, and user satisfaction. | ✅ Notebook |
| 29 | Memory Benchmarks (LoCoMo) | Run your memory against LoCoMo and LongMemEval benchmarks. See how it does over long conversations. | ✅ Notebook |
| 30 | Production Memory Patterns | Run memory at scale. Caching, TTLs (time-to-live), sharding, backups, GDPR, and observability. | ✅ Notebook |
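As a flavor of what Technique 28 measures, here is a small self-contained helper (our own illustrative code, not from the notebook) for retrieval precision and recall at k:

```python
def precision_recall_at_k(retrieved: list[str], relevant: set[str],
                          k: int) -> tuple[float, float]:
    """How many of the top-k retrieved memories are relevant (precision),
    and how many of the relevant memories were found (recall)."""
    top = retrieved[:k]
    hits = sum(1 for item in top if item in relevant)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Example: two of the top-3 retrieved memories are actually relevant.
p, r = precision_recall_at_k(["a", "b", "c"], {"a", "c", "d"}, k=3)
print(f"P@3={p:.2f}  R@3={r:.2f}")  # P@3=0.67  R@3=0.67
```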
🎯 Learning Paths
Beginner: Foundations
New to agent memory? Start here. These are the building blocks.
01 Conversation Buffer → 02 Sliding Window → 03 Summary Memory →
05 Token Buffer → 06 Vector Store Memory → 21 Cross-Session Memory
Intermediate: Structured Memory
Ready for more? Add entities, graphs, and smarter retrieval.
07 Entity Memory → 08 Knowledge Graph → 09 Episodic Memory →
10 Semantic Memory → 20 Retrieval Patterns → 22 Multi-Agent Shared Memory
Advanced: Cognitive Architectures
Build human-inspired memory patterns for advanced agents.
12 Working Memory → 13 Hierarchical Layers → 14 Consolidation →
16 Self-Reflection → 17 Memory Routing → 19 Forgetting & Decay
Practitioner: Frameworks & Production
Connect to production tools and measure what you’ve built.
25 Mem0 → 26 Letta/MemGPT → 24 Graphiti → 27 Zep →
28 Evaluation → 29 Benchmarks → 30 Production Patterns
🚀 Quick Start
💡 Prefer not to install anything? Every notebook renders on GitHub directly. Click a technique in the table above to read it in your browser. Or use the Colab badges to run it in the cloud.
```bash
# Clone the repository
git clone https://github.com/NirDiamant/Agent_Memory_Techniques.git
cd Agent_Memory_Techniques

# Create a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Set up your API keys
cp .env.example .env
# Edit .env with your OPENAI_API_KEY and/or ANTHROPIC_API_KEY

# Launch Jupyter and start with the first technique
jupyter notebook all_techniques/01_conversation_buffer_memory/
```
📁 Project Structure
```
Agent_Memory_Techniques/
├── README.md                      # You are here
├── ROADMAP.md                     # Current state and what's next
├── LICENSE                        # Apache 2.0
├── CITATION.cff                   # How to cite this work
├── requirements.txt               # Python dependencies
├── .env.example                   # API key template
├── llms.txt                       # LLM-discoverability index
│
├── all_techniques/                # 30 technique folders, each with notebook + README
│   ├── 01_conversation_buffer_memory/
│   ├── 02_sliding_window_memory/
│   ├── ...
│   └── 30_production_memory_patterns/
│
├── docs/                          # Project documentation
│   ├── architecture.md            # Memory system design patterns
│   ├── comparison.md              # Side-by-side comparison of all 30 techniques
│   ├── glossary.md                # Key terms and definitions
│   ├── learning_path.md           # Detailed learning path guide
│   ├── topics.md                  # Keyword index
│   ├── roadmap.md                 # Original planning archive
│   ├── FAQ.md                     # Frequently asked questions
│   └── CONTENT_STANDARDS.md       # Writing-style rules
│
├── .github/                       # GitHub community files
│   ├── CONTRIBUTING.md            # How to contribute
│   ├── CODE_OF_CONDUCT.md         # Community guidelines
│   ├── SECURITY.md                # Security policy
│   ├── FUNDING.yml                # Sponsorship config
│   ├── ISSUE_TEMPLATE/            # Issue templates
│   ├── pull_request_template.md   # PR template
│   └── workflows/                 # CI workflows
│
├── utils/                         # Shared helpers and validators
│   ├── helpers.py                 # Env loading, LLM clients, cosine, tokens
│   ├── validate_cells.py          # Notebook cell-structure validator
│   └── validate_style.py          # Prose-style validator
│
├── tests/                         # pytest smoke tests
├── data/                          # Small sample datasets
└── images/                        # Diagrams and visuals
```
🤝 Contributing
We welcome contributions. You can fill in a notebook, fix a bug, improve the docs, or propose a new technique. Every contribution helps the next reader.
See CONTRIBUTING.md for the details.
Where we need help the most:
- More techniques we haven’t covered yet (propose one via an issue)
- Architecture diagrams (Mermaid or ASCII)
- More memory benchmarks and evaluation metrics
- Integration examples for new frameworks
💖 Sponsors
Supporting this project helps keep educational AI content free and open. If your company uses agent memory in production, consider sponsoring to get your logo below.
🔗 Related Work
This repo is part of a bigger collection of AI technique tutorials.
| Repository | Stars | Focus |
|---|---|---|
| RAG Techniques | 26k+ | Retrieval-Augmented Generation techniques |
| GenAI Agents | 21k+ | Generative AI agent architectures |
| Agents Towards Production | 18k+ | Production-grade agent deployment |
| Prompt Engineering | 7k+ | Prompt engineering techniques |
🏷️ Topics Covered
This repository is a practical reference for agent memory in Large Language Model (LLM) applications. For the full keyword index covering short-term, long-term, cognitive architectures, retrieval, frameworks, evaluation, and production patterns, see docs/topics.md.
⚠️ Disclaimer
This repository is for educational purposes. The code here shows how agent memory techniques work. It is not production-ready software. Do not use it as-is for handling regulated data, medical decisions, legal advice, or any high-stakes application without a careful review. The authors accept no responsibility for how you use this material.
📄 License
This project is licensed under the Apache License 2.0. See the LICENSE file for details.
📖 Citation
If you use this repository in your research or teaching, please cite:
```bibtex
@misc{diamant2026agentmemory,
  title={Agent Memory Techniques: A Comprehensive Collection},
  author={Nir Diamant},
  year={2026},
  url={https://github.com/NirDiamant/Agent_Memory_Techniques}
}
```
Built with care by Nir Diamant, making advanced AI accessible to everyone.