深入解析 nanobot 的记忆机制：让 AI 拥有“活”的记忆

引言

在 AI Agent 的长期交互中，记忆管理一直是个痛点。nanobot 的记忆机制建立在一个非常优雅的理念之上：记忆应该是鲜活的，但不应该是混乱的（Memory should feel alive, but it should not feel chaotic）。

好的记忆不应该是一堆笔记的无序堆砌，而是一个安静的注意力系统。它能敏锐地注意到什么值得保留，优雅地放弃不再需要的内容，并将经历转化为平静、持久和有用的知识。本文将结合 nanobot 的实际代码（主要集中在 nanobot/agent/memory.py），深入拆解这套记忆系统的设计与实现。

记忆的分层架构：各司其职的存储介质

nanobot 并没有把所有记忆都塞进一个巨大的文件里，而是根据不同类型记忆的生命周期和用途，将其拆分到了不同的层级和文件中：

短期记忆 (Short-term)：
- 存在于内存中的 session.messages。
- 保存当前正在进行的、鲜活的对话上下文。
中期记忆 (Mid-term)：
- 存储于 memory/history.jsonl。
- 这是一个只追加（Append-only）的运行日志，用于记录被压缩后的前期对话流。
长期记忆 (Long-term)：
- 持久化的 Markdown 知识文件，分为三个维度：
  - SOUL.md：定义 Bot 的人格、行为模式和沟通语气。
  - USER.md：记录用户的身份、偏好和习惯。
  - memory/MEMORY.md：记录项目上下文、长期事实和重要事件。

这种分层设计使得系统在当前对话中保持轻量级和高响应速度，同时在长期使用中具备反思和知识积累的能力。值得一提的是，长期记忆文件由底层的 GitStore 自动进行版本控制，确保记忆的演变可追溯、可回滚。

核心组件与大模型 Prompt 深度解析：从流转到沉淀

整个记忆系统的核心代码位于 nanobot/agent/memory.py 中，主要由三个类构成：MemoryStore（底层存储）、Consolidator（会话压缩）和 Dream（知识提炼）。大模型在其中扮演了关键的“记忆提炼器”角色，其行为由几个精心设计的 Prompt 控制。

1. MemoryStore：纯文件 I/O 与数据结构

MemoryStore 负责底层的文件读写、游标管理和 JSONL 格式化。

关于 history.jsonl 的设计哲学：
在早期的版本中，历史记录存储在 HISTORY.md 中，这虽然适合人类阅读，但作为系统的操作基底（operational substrate）却过于脆弱。nanobot 将其迁移到了 history.jsonl，带来了以下优势：

稳定的增量游标（cursor-based）。
更安全的机器解析和批量处理（easier batching）。
清晰的边界：将“原始历史”与“提炼后的知识”彻底分开。

代码中通过 append_history(entry) 每次生成一个自增的 cursor，并写入带有时间戳的 JSON 记录：

def append_history(self, entry: str) -> int:
    cursor = self._next_cursor()
    ts = datetime.now().strftime("%Y-%m-%d %H:%M")
    record = {"cursor": cursor, "timestamp": ts, "content": strip_think(entry.rstrip()) or entry.rstrip()}
    with open(self.history_file, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    # 更新游标状态
    self._cursor_file.write_text(str(cursor), encoding="utf-8")
    return cursor

2. Consolidator：防止上下文爆炸的“压缩器”

随着对话的进行，session.messages 会越来越长，最终对大模型的 Context Window 造成压力。Consolidator 是一个轻量级的、基于 Token 预算触发的合并器。

触发机制：maybe_consolidate_by_tokens 方法会实时估算当前会话的 Token 数。如果超过了安全预算（Context Window - 最大生成 Token - 安全缓冲区），就会触发合并逻辑。
安全切分：它不会生硬地截断对话，而是通过 pick_consolidation_boundary 寻找一个合适的用户发言（User Turn）作为切分边界。

大模型摘要 (Archive Prompt)：将切分出的旧对话通过大模型进行摘要提取，并调用 archive() 将摘要追加到 history.jsonl 中。这里使用的 Prompt (consolidator_archive.md) 非常强调实用性和高优排序：

Extract key facts from this conversation. Only output items matching these categories, skip everything else:
- User facts: personal info, preferences, stated opinions, habits
- Decisions: choices made, conclusions reached
- Solutions: working approaches discovered through trial and error, especially non-obvious methods that succeeded after failed attempts
- Events: plans, deadlines, notable occurrences
- Preferences: communication style, tool preferences

Priority: user corrections and preferences > solutions > decisions > events > environment facts. The most valuable memory prevents the user from having to repeat themselves.

Skip: code patterns derivable from source, git history, or anything already captured in existing memory.

Output as concise bullet points, one fact per line. No preamble, no commentary.
If nothing noteworthy happened, output: (nothing)

解析：

严格过滤：明确指出了需要保留的 5 种事实，其余全部丢弃。
优先级定义：用户的纠正与偏好 > 解决方案 > 决策 > 事件 > 环境事实。核心理念是：“最有价值的记忆是防止用户重复自己的话”。
降噪机制：明确要求跳过可以通过源码或 Git 历史推导的代码模式。

优雅降级：如果 LLM 调用失败，系统会触发 raw_archive 机制，直接将原始消息以 [RAW] 标签 Dump 进历史文件，保证数据不丢失。

3. Dream：从历史中提炼知识的“盗梦空间”

如果说 Consolidator 是为了应对当下的生存压力（Context Window），那么 Dream 就是为了长远的知识沉淀。这是一个重型的、可通过定时任务或手动命令（/dream）触发的记忆处理器。

Dream 类采用了非常工程化的两阶段（Two-phase）处理模式，这反映在它的两个不同的 Prompt 设计上：

阶段一：分析提炼 (Phase 1)

读取自上次 .dream_cursor 以来的 history.jsonl 未处理条目。
将这段历史记录，连同当前的 SOUL.md、USER.md、MEMORY.md 内容一起发给大模型。

分析 Prompt (dream_phase1.md)：

Compare conversation history against current memory files.
Output one line per finding:
[FILE] atomic fact or change description

Files: USER (identity, preferences, habits), SOUL (bot behavior, tone), MEMORY (knowledge, project context, tool patterns)

Rules:
- Only new or conflicting information — skip duplicates and ephemera
- Prefer atomic facts: "has a cat named Luna" not "discussed pet care"
- Corrections: [USER] location is Tokyo, not Osaka
- Also capture confirmed approaches: if the user validated a non-obvious choice, note it

If nothing needs updating: [SKIP] no new information

解析：

原子化原则：强制要求模型输出具体的事实（如 “has a cat named Luna”），而不是宽泛的总结（如 “discussed pet care”）。
变更发现：不仅提取新信息，还要求识别与现有记忆冲突的信息并进行纠正。
固定输出格式：要求以 [FILE] 事实描述 的格式输出，为阶段二的精准编辑提供清晰的指令。

阶段二：增量编辑 (Phase 2)

将阶段一的分析结果交由一个专用的 AgentRunner 处理。
这个 Agent 被赋予了两个核心工具：ReadFileTool 和 EditFileTool。

编辑 Prompt (dream_phase2.md)：

Update memory files based on the analysis below.

## Quality standards
- Every line must carry standalone value — no filler
- Concise bullet points under clear headers
- Remove outdated or contradicted information

## Editing
- File contents provided below — edit directly, no read_file needed
- Batch changes to the same file into one edit_file call
- Surgical edits only — never rewrite entire files
- Do NOT overwrite correct entries — only add, update, or remove
- If nothing to update, stop without calling tools

解析：

外科手术式编辑 (Surgical edits only)：这是最核心的指导原则。它明确禁止 LLM 覆写整个文件，而是要求进行局部增删改。这极大地降低了破坏现有知识结构的风险。
优化工具调用：提示词中告知 LLM 文件内容已经提供，不需要调用 read_file，并且要求将对同一个文件的修改合并到一次 edit_file 调用中，提升了处理效率。

Git 自动版本控制

当 Dream 完成编辑后，不仅会推进 .dream_cursor 游标，还会通过 GitStore.auto_commit() 自动生成一次 Git 提交：

1
2
3

if changelog and self.store.git.is_initialized():
    ts = batch[-1]["timestamp"]
    sha = self.store.git.auto_commit(f"dream: {ts}, {len(changelog)} change(s)")

得益于此，用户可以通过 /dream-log 查看记忆的变更记录，或者通过 /dream-restore <sha> 安全地将记忆回滚到过去的某个状态。

总结与启示

nanobot 的记忆机制完美诠释了“结构与意义的分离”：

history.jsonl 负责结构（记录发生了什么）。
SOUL.md, USER.md, MEMORY.md 负责意义（沉淀留下了什么）。

通过 Consolidator 实现对话流的实时无损压缩，通过 Dream 机制实现异步的、增量的知识提炼，并结合 Git 工具链提供强大的版本控制与容错能力。这套机制避免了传统记忆管理中“上下文臃肿”和“信息被过度覆盖”的顽疾，为构建长期陪伴型 AI Agent 提供了一个极其优秀的工程参考范式。

数字旗手

跟着😺NanoBot学AI智能体设计和开发6：nanobot的记忆长啥样