Measure Zero



Reading the Claude Code Source - The memory Mechanism

2026-04-03 | ~ | Machine Learning

Memory is split into Session Memory and Auto Memory (cross-session).

Session Memory

Each session maintains a summary.md file. A forked subagent (with permissions restricted to editing only this summary.md) updates it in the background once thresholds are met (tokens added since the last update, number of tool calls, etc.).

/**
 * Session Memory automatically maintains a markdown file with notes about the current conversation.
 * It runs periodically in the background using a forked subagent to extract key information
 * without interrupting the main conversation flow.
 */
/**
 * Configuration for session memory extraction thresholds
 */
export type SessionMemoryConfig = {
  /** Minimum context window tokens before initializing session memory.
   * Uses the same token counting as autocompact (input + output + cache tokens)
   * to ensure consistent behavior between the two features. */
  minimumMessageTokensToInit: number
  /** Minimum context window growth (in tokens) between session memory updates.
   * Uses the same token counting as autocompact (tokenCountWithEstimation)
   * to measure actual context growth, not cumulative API usage. */
  minimumTokensBetweenUpdate: number
  /** Number of tool calls between session memory updates */
  toolCallsBetweenUpdates: number
}
export const DEFAULT_SESSION_MEMORY_CONFIG: SessionMemoryConfig = {
  minimumMessageTokensToInit: 10000,
  minimumTokensBetweenUpdate: 5000,
  toolCallsBetweenUpdates: 3,
}
  // Trigger extraction when:
  // 1. Both thresholds are met (tokens AND tool calls), OR
  // 2. No tool calls in last turn AND token threshold is met
  //    (to ensure we extract at natural conversation breaks)
  //
  // IMPORTANT: The token threshold (minimumTokensBetweenUpdate) is ALWAYS required.
  // Even if the tool call threshold is met, extraction won't happen until the
  // token threshold is also satisfied. This prevents excessive extractions.
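The trigger described in the comments above can be sketched as follows (a minimal sketch; the function name and parameter shape are illustrative, not the actual source identifiers):

```typescript
// Minimal sketch of the extraction trigger described above.
function shouldExtract(
  tokensSinceUpdate: number,
  toolCallsSinceUpdate: number,
  lastTurnHadToolCalls: boolean,
  cfg = { minimumTokensBetweenUpdate: 5000, toolCallsBetweenUpdates: 3 },
): boolean {
  // The token threshold is ALWAYS required.
  if (tokensSinceUpdate < cfg.minimumTokensBetweenUpdate) return false
  // Either enough tool calls have accumulated, or the last turn had no
  // tool calls (a natural conversation break).
  return toolCallsSinceUpdate >= cfg.toolCallsBetweenUpdates || !lastTurnHadToolCalls
}
```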
/**
 * Returns the session memory directory path for the current session with trailing separator.
 * Path format: {projectDir}/{sessionId}/session-memory/
 */
export function getSessionMemoryDir(): string {
  return join(getProjectDir(getCwd()), getSessionId(), 'session-memory') + sep
}

/**
 * Returns the session memory file path for the current session.
 * Path format: {projectDir}/{sessionId}/session-memory/summary.md
 */
export function getSessionMemoryPath(): string {
  return join(getSessionMemoryDir(), 'summary.md')
}

The prompt template:

# Session Title
_A short and distinctive 5-10 word descriptive title for the session. Super info dense, no filler_

# Current State
_What is actively being worked on right now? Pending tasks not yet completed. Immediate next steps._

# Task specification
_What did the user ask to build? Any design decisions or other explanatory context_

It is followed by further sections:

  • Files and Functions
  • Workflow
  • Errors & Corrections
  • Codebase and System Documentation
  • Learnings
  • Key results
  • Worklog
- The file must maintain its exact structure with all sections, headers, and italic descriptions intact
  - NEVER modify, delete, or add section headers
  - NEVER modify or delete the italic _section description_ lines
  - ONLY update the actual content that appears BELOW the italic _section descriptions_

The update itself is performed by a forked subagent.

  // Run session memory extraction using runForkedAgent for prompt caching
  // runForkedAgent creates an isolated context to prevent mutation of parent state
  // Pass setupContext.readFileState so the forked agent can edit the memory file
/**
 * Creates a canUseTool function that only allows Edit for the exact memory file.
 */
if (
  tool.name === FILE_EDIT_TOOL_NAME &&
  typeof input === 'object' &&
  input !== null &&
  'file_path' in input
) {
  const filePath = input.file_path
  if (typeof filePath === 'string' && filePath === memoryPath) {
    return { behavior: 'allow' as const, updatedInput: input }
  }
}

Use case: pre-staging material for autocompact.

Once the autocompact condition is met, session memory compaction is tried first.

  // EXPERIMENT: Try session memory compaction first
  const sessionMemoryResult = await trySessionMemoryCompaction(
    messages,
    toolUseContext.agentId,
    recompactionInfo.autoCompactThreshold,
  )
/**
 * Try to use session memory for compaction instead of traditional compaction.
 * Returns null if session memory compaction cannot be used.
 *
 * Handles two scenarios:
 * 1. Normal case: lastSummarizedMessageId is set, keep only messages after that ID
 * 2. Resumed session: lastSummarizedMessageId is not set but session memory has content,
 *    keep all messages but use session memory as the summary
 */

A recent span of raw messages is kept:

/**
 * Calculate the starting index for messages to keep after compaction.
 * Starts from lastSummarizedMessageId, then expands backwards to meet minimums:
 * - At least config.minTokens tokens
 * - At least config.minTextBlockMessages messages with text blocks
 * Stops expanding if config.maxTokens is reached.
 * Also ensures tool_use/tool_result pairs are not split.
 */
export const DEFAULT_SM_COMPACT_CONFIG: SessionMemoryCompactConfig = {
  minTokens: 10_000,
  minTextBlockMessages: 5,
  maxTokens: 40_000,
}
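The backward expansion described in the comment can be sketched like this (an illustrative sketch; the message representation and helper name are assumptions, not the actual implementation):

```typescript
// Illustrative sketch: expand backwards from the compact boundary until
// both minimums are met, stopping early if maxTokens would be exceeded.
type CompactConfig = { minTokens: number; minTextBlockMessages: number; maxTokens: number }

function keepStartIndex(
  tokenCounts: number[],    // per-message token estimates
  hasTextBlock: boolean[],  // per-message: does it contain a text block?
  boundary: number,         // index of the first message after lastSummarizedMessageId
  cfg: CompactConfig = { minTokens: 10_000, minTextBlockMessages: 5, maxTokens: 40_000 },
): number {
  let start = boundary
  let tokens = 0
  let textMsgs = 0
  for (let i = boundary; i < tokenCounts.length; i++) {
    tokens += tokenCounts[i]
    if (hasTextBlock[i]) textMsgs++
  }
  // Expand backwards until both minimums are satisfied.
  while (start > 0 && (tokens < cfg.minTokens || textMsgs < cfg.minTextBlockMessages)) {
    if (tokens + tokenCounts[start - 1] > cfg.maxTokens) break
    start--
    tokens += tokenCounts[start]
    if (hasTextBlock[start]) textMsgs++
  }
  return start
}
```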

Splitting tool_use / tool_result pairs is avoided:

 * Adjust the start index to ensure we don't split tool_use/tool_result pairs
 * or thinking blocks that share the same message.id with kept assistant messages.
 *
 * If ANY message we're keeping contains tool_result blocks, we need to
 * include the preceding assistant message(s) that contain the matching tool_use blocks.
 * API error: orphan tool_result references non-existent tool_use
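The pair-preserving adjustment could look roughly like this (an illustrative sketch; the real code operates on actual message and content-block types):

```typescript
// Illustrative sketch: push the start index backwards until every kept
// tool_result block has its matching tool_use inside the kept window,
// avoiding the "orphan tool_result" API error.
type Msg = {
  toolUseIds: string[]     // ids of tool_use blocks in this message
  toolResultIds: string[]  // tool_use ids referenced by tool_result blocks
}

function adjustForToolPairs(messages: Msg[], start: number): number {
  let adjusted = start
  let changed = true
  while (changed) {
    changed = false
    // tool_use ids currently available in the kept window
    const available = new Set<string>()
    for (let i = adjusted; i < messages.length; i++) {
      for (const id of messages[i].toolUseIds) available.add(id)
    }
    outer: for (let i = adjusted; i < messages.length; i++) {
      for (const id of messages[i].toolResultIds) {
        if (!available.has(id) && adjusted > 0) {
          adjusted-- // pull in one more preceding message and re-check
          changed = true
          break outer
        }
      }
    }
  }
  return adjusted
}
```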

Budget controls keep summary.md itself from growing too large:

const MAX_SECTION_LENGTH = 2000
const MAX_TOTAL_SESSION_MEMORY_TOKENS = 12000

The update prompt calls this out explicitly:

- Keep each section under ~${MAX_SECTION_LENGTH} tokens/words
- IMPORTANT: Always update "Current State" to reflect the most recent work

If the total already exceeds the budget, a condensing reminder is appended:

CRITICAL: The session memory file is currently ~${totalTokens} tokens, which exceeds the maximum of ${MAX_TOTAL_SESSION_MEMORY_TOKENS} tokens. You MUST condense the file to fit within this budget.

On entering the compact flow, a truncation guard is applied once more:

/**
 * Truncate session memory sections that exceed the per-section token limit.
 * Used when inserting session memory into compact messages to prevent
 * oversized session memory from consuming the entire post-compact token budget.
 */
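Per-section truncation might look like the sketch below, under the assumption that tokens are approximated as characters / 4 (the real implementation presumably uses proper token counting):

```typescript
// Sketch: clip each "# Section" body to a rough token cap, approximating
// tokens as characters / 4.
const MAX_SECTION_LENGTH = 2000

function truncateSections(markdown: string, maxTokens = MAX_SECTION_LENGTH): string {
  const maxChars = maxTokens * 4
  return markdown
    .split(/^(?=# )/m) // split on top-level section headers, keeping them
    .map(section =>
      section.length > maxChars
        ? section.slice(0, maxChars) + '\n[...truncated]\n'
        : section,
    )
    .join('')
}
```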

Another use: supplying session background to skillify.

Distilling the current session into a skill:

## Your Session Context

Here is the session memory summary:
<session_memory>

</session_memory>
const sessionMemory =
  (await getSessionMemoryContent()) ?? 'No session memory available.'

It then appends this session's user messages:

const userMessages = extractUserMessages(
  getMessagesAfterCompactBoundary(context.messages),
)

Auto Memory

The feature needs to be enabled; see the documentation. It persists across sessions and supports recall.

/**
 * Whether auto-memory features are enabled (memdir, agent memory, past session search).
 */

MEMORY.md is the entry-point index:

const AUTO_MEM_DIRNAME = 'memory'
const AUTO_MEM_ENTRYPOINT_NAME = 'MEMORY.md'

Default directory resolution:

/**
 * Returns the auto-memory directory path.
 *
 * Resolution order:
 *   1. CLAUDE_COWORK_MEMORY_PATH_OVERRIDE env var
 *   2. autoMemoryDirectory in settings.json
 *   3. <memoryBase>/projects/<sanitized-git-root>/memory/
 */

Here MEMORY.md serves as the entry-point index, while durable memories go into their own topic files:

`MEMORY.md` is an index, not a memory
each entry should be one line, under ~150 characters
Saving a memory is a two-step process:

**Step 1** — write the memory to its own file
**Step 2** — add a pointer to that file in `MEMORY.md`
/**
 * Extracts durable memories from the current session transcript
 * and writes them to the auto-memory directory (~/.claude/projects/<path>/memory/).
 *
 * It runs once at the end of each complete query loop
 */

MEMORY.md is injected into the context automatically:

" (user's auto-memory, persists across conversations)"

Relevant memories can be retrieved per query (by reading the topic files' frontmatter):

/**
 * Find memory files relevant to a query by scanning memory file headers
 * and asking Sonnet to select the most relevant ones.
 *
 * Returns absolute file paths + mtime of the most relevant memories
 * (up to 5). Excludes MEMORY.md (already loaded in system prompt).
 */
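Scanning the headers could be as simple as pulling the frontmatter block off the top of each topic file (a sketch, assuming standard `---`-delimited YAML frontmatter):

```typescript
// Sketch: extract the raw frontmatter block from the top of a topic file
// so its headers can be shown to the model for relevance selection.
function extractFrontmatter(content: string): string | null {
  const m = content.match(/^---\n([\s\S]*?)\n---/)
  return m ? m[1] : null
}
```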

Reading the Claude Code Source - Context Compaction Strategies

2026-04-01 | ~ | Machine Learning

Several layers of compaction.


Miscellaneous Notes on Agent Practice

2025-10-17 | ~ | Machine Learning

In 2025 everyone is busy building agents. The categorization below is arbitrary.

  • antirez. 2026-01. Don’t fall into the anti-AI hype

A Brief Review of RAG

2025-10-07 | ~ | Machine Learning

In 2025 everyone is busy building agents; here is a brief review of RAG.

Basic RAG Operations

  • Offline: document parsing, text chunking, embedding (bge used to be the common choice)
  • Embed the query and recall candidates (usually just cosine similarity; with a large number of chunks, a vector database trades some precision for faster recall)
  • Rerank (usually bge-reranker)
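The recall step above, sketched in TypeScript (embeddings are assumed to be given; at scale a vector database replaces the brute-force scan):

```typescript
// Cosine similarity between two embedding vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    na += a[i] * a[i]
    nb += b[i] * b[i]
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb))
}

// Indices of the k chunks most similar to the query embedding.
function topK(query: number[], chunks: number[][], k: number): number[] {
  return chunks
    .map((emb, i) => ({ i, score: cosine(query, emb) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map(r => r.i)
}
```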

This pipeline had already been done to death by 2023.

  • For basic "advanced" techniques see NisaarAgharia/Advanced_RAG and NirDiamant/RAG_Techniques
  • A good survey talk: 【同济大学 王昊奋】Agentic RAG 时代 (Tongji University's Wang Haofen on the Agentic RAG era)
  • Also worth consulting: the ByteDance RAG practice handbook, which divides RAG into data, indexing, retrieval, and generation layers.

Reading Code: Cherry Studio Web Search

2025-09-30 | ~ | Machine Learning

Very rough.

If the knowledge base and web search are both enabled (searchOrchestrationPlugin.ts), SEARCH_SUMMARY_PROMPT is used for intent analysis and query rewriting. The results of the two searches are simply concatenated (not interleaved or re-ranked together), with an offset added to the indices to avoid collisions. If memory recall is configured, it is appended as well.
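The naive merge with index offsets might look like this (an illustrative sketch, not Cherry Studio's actual code):

```typescript
// Sketch: concatenate knowledge-base and web results, offsetting the
// second list's citation indices so they don't collide with the first.
type SearchResult = { index: number; content: string }

function mergeResults(kb: SearchResult[], web: SearchResult[]): SearchResult[] {
  const offset = kb.length
  return [...kb, ...web.map(r => ({ ...r, index: r.index + offset }))]
}
```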

Web search comes in two kinds:

  • Local search (see LocalSearchProvider.ts): directly parse the SERP (e.g. https://www.google.com/search?q=%s). Free.
  • Calling a search API, e.g. Tavily.

Both querying the search engine and fetching URL content work by having Electron open an invisible browser window in the background to load the given URL.

window.api.searchService.openUrlInSearchWindow(uid, url)

Similar projects that freeload on search engines include duckduckgo-mcp-server and open-webSearch. Whether this is compliant is unclear.


Auto-generating LLM Tool Schemas with Pydantic

2025-09-14 | ~ | Machine Learning

A simple little tool.

After defining the tool parameters, an OpenAI-spec Tool Schema is generated automatically using nothing but Pydantic, with no other libraries involved. The idea is simple: post-process the JSON Schema produced by Pydantic's model_json_schema into the OpenAI format.

The benefits: (1) no extra miscellaneous dependencies; (2) no separate, manually maintained set of tool descriptions; (3) Pydantic's features come for free, such as automatic parameter validation and type coercion after loading from a JSON string.
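Whatever the source language, the final transformation is small: wrap the JSON Schema in the OpenAI tool envelope. A sketch of the target shape in TypeScript (field names follow the OpenAI function-calling format):

```typescript
// Wrap an existing JSON Schema in the OpenAI function-calling tool envelope.
function toOpenAITool(name: string, description: string, parameters: object) {
  return {
    type: 'function' as const,
    function: { name, description, parameters },
  }
}
```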


Two Simple SQLite Questions That Stump LLMs

2025-05-05 | ~ | Tech

The problem statements and a sample prompt:


You are a SQLite expert. Please solve the two problems below.

  1. Write a SQLite query that, given "now", returns the unix timestamp of today's local midnight. Note: "local" means the system timezone of the machine running the SQL; "today" means today by the local date.

Example: if now is '2025-05-05 04:00:00+08:00', return '2025-05-05 00:00:00+08:00'. (Assume the local timezone is UTC+8.)

  2. Write a SQLite query that, given "now", returns the date of last week's Monday. Assume Monday is the start of the week, and work entirely in UTC (no timezones to consider).

Example: if now is '2025-05-05', a Monday, return '2025-04-28'. If now is '2025-05-04', a Sunday, return '2025-04-21'.


A Brief Walkthrough of the LightRAG Source

2025-01-21 | ~ | Machine Learning

Guo, Z., Xia, L., Yu, Y., Ao, T., & Huang, C. (2024). LightRAG: Simple and Fast Retrieval-Augmented Generation.

The overall flow:

  • Use an LLM to extract entities and relations from the chunks and store them as a graph
  • Use an LLM to extract keywords from the query, recall entities or relations by keyword, locate the most relevant chunks, then concatenate everything for the LLM to produce the final answer

ModernBERT

2024-12-24 | ~ | Machine Learning
  • Warner, B., Chaffin, A., Clavié, B., Weller, O., Hallström, O., Taghadouini, S., … & Poli, I. (2024). Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference. arXiv preprint arXiv:2412.13663.
  • 2024-12-19 Hugging Face Finally, a Replacement for BERT

As the name suggests: a more modern BERT, faster and stronger, with context length extended to 8k tokens; it is also the first encoder-only model trained with a large amount of code data. Compared with LLMs, BERT-family models are fast and cheap, and many tasks suit an encoder-only architecture.


LoRA Variants

2024-08-18 | ~ | Machine Learning
© 2019 - 2026   Shiina   CC BY-NC-ND 4.0