feat: Add memory diagnostics tool #1686

Closed
Ronifue wants to merge 4 commits into Mai-with-u:dev from Ronifue:dev

@Ronifue Ronifue commented May 13, 2026

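  • ✅ Accepted: bug fixes directly related to main, submitted to the dev branch
  • New-feature PRs must be discussed in an issue beforehand; otherwise they will not be merged
  • 🌐 i18n reminder: except for bootstrap or urgent fixes, do not treat non-zh-CN target translations as a regular GitHub editing surface; regular translations flow back through Crowdin -> l10n_* PRs, see docs/i18n.md

Please fill in the following

(Remove the space inside the brackets and replace it with a lowercase x)

    • The main branch must not be modified; please confirm that the target branch of this submission is not main
    • I confirm that I have read the contribution guidelines
    • Update type: bug fix
    • Update type: new feature
    • This update has been tested
    • If this change touches src/A_memorix, I confirm that I have read src/A_memorix/MODIFICATION_POLICY.md; leave unchecked if not applicable
  1. Describe any breaking changes (if applicable):
  2. Briefly describe the content and purpose of this update:

Other information

  • Related issue: Close #

  • Screenshots/GIF

  • Additional information:

    Adds a memory diagnostics tool for long-running processes, intended for investigating memory growth after extended uptime.
    The diagnostics task can be enabled through a [debug] config switch. It periodically collects the relevant state and writes JSONL snapshot logs, which makes it easier to compare trends and locate the source of anomalies. A beginner-friendly usage guide, written with AI assistance, is included.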

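Summary by CodeRabbit

  • New features

    • Optional long-running memory diagnostics: periodically samples the process and child processes, GC, async tasks, sessions, and binary-size estimates, with optional tracemalloc diffs, threshold alerts, and JSONL snapshot writing with rotation.
  • Documentation

    • Adds a complete Chinese user guide: enabling and configuration, sampling recommendations, output paths and rotation, a step-by-step troubleshooting flow, alert thresholds and a report template, a field quick-reference table, and caveats.
  • Tests

    • Adds comprehensive tests covering sampling logic, binary estimation, scan budgets, fairness, snapshot writing and rotation, and fault tolerance.
  • Chores

    • Adds the runtime dependency psutil and bumps the config version.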
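
The JSONL snapshot writing with rotation listed above could look roughly like the sketch below. It is an illustration only, not the PR's code: `append_snapshot`, `max_bytes`, and `keep` are hypothetical names, and rotation here is triggered by the size of the active file, whereas the PR describes a total-size threshold.

```python
import json
from pathlib import Path


def append_snapshot(path: Path, snapshot: dict, max_bytes: int = 1_000_000, keep: int = 3) -> None:
    """Append one snapshot as a JSON line, rotating the active file when it is too large.

    Hypothetical helper: rotated files are named <name>.1 (newest) to <name>.<keep> (oldest).
    """
    path.parent.mkdir(parents=True, exist_ok=True)
    if path.exists() and path.stat().st_size >= max_bytes:
        # Shift older files up by one; the oldest one is overwritten and thereby pruned.
        for index in range(keep - 1, 0, -1):
            older = path.with_name(f"{path.name}.{index}")
            if older.exists():
                older.replace(path.with_name(f"{path.name}.{index + 1}"))
        path.replace(path.with_name(f"{path.name}.1"))
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(snapshot, ensure_ascii=False) + "\n")
```

One line per snapshot keeps the file appendable and lets later analysis read it line by line without loading everything at once.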

coderabbitai Bot commented May 13, 2026

Walkthrough

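This PR adds long-running memory diagnostics: configuration and dependencies, optional registration of MemoryDiagnosticsTask in the main process, periodic snapshot collection (process, GC, asyncio, runtime, message binaries, and more), tracemalloc diffing, JSONL persistence and rotation, estimation utilities, tests, and a Chinese operations guide.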
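
As a rough picture of the periodic collection described in the walkthrough, the sketch below samples a few standard-library metrics on a fixed interval and hands each snapshot to a writer as one JSON line. It is not taken from the PR: `collect_snapshot`, `run_diagnostics`, and their parameters are hypothetical, and the real task also gathers psutil, session, and media metrics.

```python
import asyncio
import gc
import json
import time


def collect_snapshot() -> dict:
    """Gather a few cheap, stdlib-only metrics; must be called from a running event loop."""
    return {
        "ts": time.time(),
        "gc_counts": list(gc.get_count()),
        "gc_tracked_objects": len(gc.get_objects()),
        "asyncio_tasks": len(asyncio.all_tasks()),
    }


async def run_diagnostics(write_line, interval_seconds: float, max_samples: int) -> None:
    """Collect a snapshot every interval and pass it to write_line as a JSON string."""
    for _ in range(max_samples):
        snapshot = collect_snapshot()
        # Run the possibly blocking writer in a worker thread, off the event loop.
        await asyncio.to_thread(write_line, json.dumps(snapshot))
        await asyncio.sleep(interval_seconds)
```

Keeping collection cheap and pushing the write to a thread is one way to stop the diagnostics from stalling the loop they observe.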

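Changes

Memory Diagnostics Service Implementation

| Layer | File(s) | Summary |
|---|---|---|
| Configuration and Dependency Setup | src/config/official_configs.py, src/config/config.py, pyproject.toml, requirements.txt, src/main.py | DebugConfig gains memory diagnostics fields; CONFIG_VERSION is bumped; the runtime dependency psutil>=6.0.0 is added; MemoryDiagnosticsTask is conditionally registered in MainSystem._init_components(). |
| Service entry & run loop | src/services/memory_diagnostics_service.py | Adds the MemoryDiagnosticsTask constructor and run(): collects a snapshot, attaches collector metadata, writes JSONL asynchronously, and logs a summary and threshold alerts. |
| Core process / GC / asyncio metrics | src/services/memory_diagnostics_service.py | Collects process and child-process memory plus handles/connections (psutil optional), Python GC metrics, the asyncio task list and "interesting task" statistics, and several safe-access helpers. |
| Heartflow / Session binary scanning | src/services/memory_diagnostics_service.py | Plans per-session binary scan quotas from a global budget, samples message_cache, and estimates per-session binary/voice/image/emoji sizes, marking truncated or skipped scans and lower bounds. |
| Estimators & sampling utilities | src/services/memory_diagnostics_service.py | Implements bounded recursive binary estimation of components (with cycle detection), _iter_spread sampling, depth-limited estimation for history-loop and cycle_detail, and utilities such as await-chain and local-variable scanning. |
| Chat/WebSocket/Media/A_memorix metrics | src/services/memory_diagnostics_service.py | Fills in chat_manager recent-message estimates, websocket connection/queue statistics, image/emoji media tasks, and memory_automation and A_memorix metrics (all read safely as optional modules). |
| tracemalloc baseline & diffing | src/services/memory_diagnostics_service.py | Manages starting and stopping tracemalloc, baseline tracking, generating diff entries above a growth threshold, and resetting the baseline when needed. |
| Persistence, rotation & helpers | src/services/memory_diagnostics_service.py | Writes snapshots to JSONL one per line, resolves paths, and rotates and prunes old files at a total-size threshold; includes safe handling of process cmdlines, byte-unit conversion, and various safe-access/formatting utilities. |
| Tests | tests/test_memory_diagnostics_service.py | Adds tests covering config defaults/reload, binary estimation (direct/nested/cyclic), scan-budget fairness, sampling coverage, heartflow metric collection, collect/run isolation, JSONL writing and rotation, and safe handling of process cmdlines. |
| User documentation | docs/memory_diagnostics_guide.md | Adds a ~692-line Chinese operations guide covering what the tool records, when to enable it, recommended sampling (routine/reproduction/deep dive), output and rotation, a quick triage flow, common symptoms, alert thresholds, the issue-reporting process, and a field quick-reference table. |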
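
The "bounded recursive binary estimation (with cycle detection)" row can be illustrated with a small sketch. This is an assumption about the general technique, not the PR's estimator: `estimate_binary_bytes` and `max_depth` are hypothetical names, and only `bytes` and `bytearray` payloads are counted here.

```python
from typing import Any, Optional


def estimate_binary_bytes(value: Any, max_depth: int = 6, _seen: Optional[set] = None) -> int:
    """Sum the sizes of bytes-like payloads reachable from value, within a depth limit."""
    seen = set() if _seen is None else _seen
    if id(value) in seen:
        return 0  # already counted, or part of a reference cycle
    seen.add(id(value))
    if isinstance(value, (bytes, bytearray)):
        return len(value)
    if max_depth <= 0:
        return 0  # depth budget exhausted, so the total is only a lower bound
    if isinstance(value, dict):
        children = list(value.values())
    elif isinstance(value, (list, tuple, set, frozenset)):
        children = list(value)
    elif hasattr(value, "__dict__"):
        children = list(vars(value).values())
    else:
        children = []
    return sum(estimate_binary_bytes(child, max_depth - 1, seen) for child in children)
```

Tracking object ids means a payload shared by several messages is counted once, and the depth limit keeps the cost of a scan predictable at the price of possibly undercounting.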

Sequence diagram (high-level interaction):

sequenceDiagram
  participant Task as MemoryDiagnosticsTask.run()
  participant Collector as _collect_snapshot()
  participant Heartflow as Heartflow
  participant Tracemalloc as tracemalloc
  participant Writer as _write_snapshot()
  participant JSONL as JSONL file
  participant Logger as Logger

  Task->>Collector: start collection
  Collector->>Heartflow: pull session/runtime metrics
  Collector->>Tracemalloc: optional sampling and diff
  Collector-->>Task: return snapshot
  Task->>Writer: asynchronous write
  Writer->>JSONL: append line, rotate/clean up
  Task->>Logger: emit summary and alerts
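
The optional tracemalloc step in the diagram amounts to comparing a fresh snapshot against a stored baseline. The sketch below shows that idea using the standard `tracemalloc` API. The function name, the `min_delta_bytes` threshold, and the returned dictionary keys are hypothetical and not taken from the PR.

```python
import tracemalloc


def diff_against_baseline(baseline: tracemalloc.Snapshot, min_delta_bytes: int = 64 * 1024, limit: int = 10):
    """Return (current_snapshot, entries) for allocation sites that grew past the threshold.

    tracemalloc must already be tracing (tracemalloc.start()) before this is called.
    """
    current = tracemalloc.take_snapshot()
    # compare_to() returns StatisticDiff objects, largest absolute size change first.
    stats = current.compare_to(baseline, "lineno")
    entries = [
        {"where": str(stat.traceback), "size_diff": stat.size_diff, "count_diff": stat.count_diff}
        for stat in stats
        if stat.size_diff >= min_delta_bytes
    ]
    return current, entries[:limit]
```

Returning the current snapshot lets the caller decide whether to adopt it as the new baseline, which corresponds to the "reset the baseline when needed" behaviour the summary mentions.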

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

  • Mai-with-u/MaiBot#1624: has a code-level connection with the configuration migration/rename logic that handles the a_memorix.web table in src/config/config.py.
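🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 9.09% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ❓ Inconclusive | The PR description has the required checkboxes mostly filled in, but the "briefly describe the content and purpose of this update" section is too terse and lacks the necessary technical detail. | Add a detailed description: the new tool's main functional modules, its configuration options, its output format, and how it integrates with the existing system. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately summarizes the main change in this submission, the new memory diagnostics tool, and is concise and in line with the conventions. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |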

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/memory_diagnostics_guide.md`:
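- Line 167: Rephrase the sentence describing `top_sessions` so it reads more naturally in Chinese: change “`top_sessions` 可以看到占用最高的会话。” ("`top_sessions` lets you see the sessions with the highest usage.") to “`top_sessions` 可以看到占用最高的会话列表。” ("... the list of sessions with the highest usage.") or to “`top_sessions` 可查看占用最高的会话” ("`top_sessions` shows the sessions with the highest usage"), to avoid stiff phrasing and stay consistent with the surrounding text; just locate and update the sentence that uses `top_sessions`.
- Line 614: Change the description of the field chat_manager.last_message_binary_mb from “最后一条图片、表情、语音被保留” ("the last image, emoji, voice clip is retained") to a more explicit wording such as “最后一条包含图片/表情/语音的消息被保留” ("the last message containing an image/emoji/voice clip is retained"); change only the natural-language description of that field, make sure the meaning is "the last message with binary content is retained", and keep the format of the other table columns unchanged.
- Line 238: One-sentence summary: reword “最可疑的会话” ("the most suspicious sessions") so the modifier relationship reads more smoothly. In the document, change the sentence for `top_binary_sessions`, “`top_binary_sessions` 会列出二进制占用最可疑的会话。” ("`top_binary_sessions` lists the sessions whose binary usage is most suspicious."), to a more natural expression, for example “`top_binary_sessions` 会列出可疑程度最高的会话” or “`top_binary_sessions`:最可疑会话列表”, keeping the meaning unchanged and the sentence fluent.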

In `@src/services/memory_diagnostics_service.py`:
- Around line 1009-1018: The diagnostic snapshot currently writes raw child
process command lines via _safe_process_cmdline into the JSONL (seen where the
dict includes "cmdline": _safe_process_cmdline(child)), which can leak secrets;
modify the data collection to sanitize/redact sensitive tokens before persisting
by implementing or reusing a sanitizer (e.g., sanitize_cmdline or
redact_sensitive_parts) and call it either inside _safe_process_cmdline or
immediately before assigning "cmdline"; ensure the sanitizer strips or replaces
common secret patterns (token, password, api_key, --secret=, bearer tokens, long
hex/base64 strings) and apply the same change to the other occurrence mentioned
(around lines with similar dict at 1053-1056) so no raw cmdline values are
written to disk.
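
A pattern-based sanitizer of the kind this comment asks for might look like the sketch below. It is a heuristic illustration and not the project's implementation: `redact_text` is a hypothetical name, the key list and the 32-character cutoff are assumptions, and the last pattern can over-redact long harmless identifiers.

```python
import re

# key=value pairs whose key looks sensitive, e.g. --token=abc or api_key=abc
_KEY_VALUE = re.compile(r"(?i)\b((?:token|password|passwd|secret|api[_-]?key)\s*=\s*)(\S+)")
# "Bearer <credential>" as used in Authorization headers
_BEARER = re.compile(r"(?i)\b(bearer\s+)\S+")
# long unbroken alphanumeric runs, which covers hex digests and many base64 blobs
_LONG_OPAQUE = re.compile(r"\b[A-Za-z0-9]{32,}\b")


def redact_text(text: str) -> str:
    """Mask common secret shapes in one command-line argument or log fragment."""
    text = _KEY_VALUE.sub(r"\1***", text)
    text = _BEARER.sub(r"\1***", text)
    return _LONG_OPAQUE.sub("***", text)
```

Applying such a function to every element of a cmdline list before serialization would keep the flag names visible for debugging while dropping their values.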

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 32c0ac01-a735-4587-ab67-7440aa4ef749

📥 Commits

Reviewing files that changed from the base of the PR and between c2b8520 and fad4804.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
  • docs/memory_diagnostics_guide.md
  • pyproject.toml
  • requirements.txt
  • src/config/config.py
  • src/config/official_configs.py
  • src/main.py
  • src/services/memory_diagnostics_service.py
  • tests/test_memory_diagnostics_service.py


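- `runtime_count` keeps growing and never drops back: session runtimes may not be getting evicted.
- `lock_count` grows along with the session count: session-creation locks are accumulating too.
- `top_sessions` 可以看到占用最高的会话。 ("`top_sessions` lets you see the sessions with the highest usage.")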

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

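Adjust the word order so the Chinese reads more naturally.

Consider adding “最” ("most") before “占用最高的会话” ("the sessions with the highest usage"), or simply changing it to “占用最高的会话列表” ("the list of sessions with the highest usage"), so the sentence does not read stiffly.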

🧰 Tools
🪛 LanguageTool

[uncategorized] ~167-~167: A verb is normally modified as 'adjective (adverb) + 地 + verb'. Did you mean: 最高"地"会话
Context: ...长:说明会话创建锁也在累积。 - top_sessions 可以看到占用最高的会话。 常见原因: - 很多群聊或私聊陆续触发 MaiSaka。 - 会话长...

(wb4)

Comment thread docs/memory_diagnostics_guide.md
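| `chat_manager.loaded` | Whether the ChatManager module is loaded | Not yet in the message-receiving runtime state, or the module is not initialized |
| `chat_manager.sessions` | Current number of ChatManager sessions | Chat session objects lingering |
| `chat_manager.last_messages` | Number of last messages recorded by ChatManager | Last-message cache growing |
| `chat_manager.last_message_binary_mb` | Estimated binary size in the last-message cache | 最后一条图片、表情、语音被保留 ("the last image, emoji, voice clip is retained") |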

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

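The field description is slightly ambiguous; consider rewording it.

“最后一条图片、表情、语音被保留” ("the last image, emoji, voice clip is retained") could be changed to “最后一条包含图片/表情/语音的消息被保留” ("the last message containing an image/emoji/voice clip is retained"), so that the scope of “最后一条” ("the last one") is clear.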

🧰 Tools
🪛 LanguageTool

[uncategorized] ~614-~614: A quantifier modifying a coordinate phrase may be ambiguous
Context: ...message_binary_mb| 最后消息缓存中的二进制估算 | 最后一条图片、表情、语音被保留 | |chat_manager.last_message...

(s5)

Comment thread src/services/memory_diagnostics_service.py
@coderabbitai coderabbitai Bot left a comment

♻️ Duplicate comments (3)
docs/memory_diagnostics_guide.md (3)

238-238: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

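Adjust the modifier relationship so the sentence reads more smoothly.

In "二进制占用最可疑的会话" ("the sessions whose binary usage is most suspicious"), the modifier relationship is slightly awkward. Consider "二进制占用可疑程度最高的会话" ("the sessions whose binary usage has the highest degree of suspicion") or "二进制占用最高的可疑会话" ("the suspicious sessions with the highest binary usage").

📝 Suggested wording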
-- `top_binary_sessions` 会列出二进制占用最可疑的会话。
+- `top_binary_sessions` 会列出二进制占用可疑程度最高的会话。

-- `top_binary_sessions` 会列出二进制占用最可疑的会话。
+- `top_binary_sessions` 会列出二进制占用最高的可疑会话。
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

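In `@docs/memory_diagnostics_guide.md` at line 238, change the description of top_binary_sessions from “二进制占用最可疑的会话” to a smoother wording such as “二进制占用可疑程度最高的会话” (or “二进制占用最高的可疑会话”); replace the phrase in the sentence containing top_binary_sessions to improve the modifier relationship while keeping the meaning unchanged.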

614-614: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

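Remove the ambiguity of a quantifier modifying a coordinate phrase.

In "最后一条图片、表情、语音被保留" ("the last image, emoji, voice clip is retained"), the scope of "最后一条" ("the last one") is not clear enough and may be ambiguous. Consider stating it explicitly as "最后一条包含图片/表情/语音的消息被保留" ("the last message containing an image/emoji/voice clip is retained").

📝 Suggested wording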
-| `chat_manager.last_message_binary_mb` | 最后消息缓存中的二进制估算 | 最后一条图片、表情、语音被保留 |
+| `chat_manager.last_message_binary_mb` | 最后消息缓存中的二进制估算 | 最后一条包含图片/表情/语音的消息被保留 |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/memory_diagnostics_guide.md` at line 614, The phrase for
chat_manager.last_message_binary_mb is ambiguous; update the table cell text
"最后一条图片、表情、语音被保留" to an explicit wording such as "最后一条包含图片/表情/语音的消息被保留" so
"最后一条" clearly modifies "消息"; locate the row with the symbol
`chat_manager.last_message_binary_mb` and replace the existing Chinese fragment
accordingly.

167-167: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

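Improve the Chinese wording for clarity.

"可以看到占用最高的会话" ("lets you see the sessions with the highest usage") reads slightly stiffly in Chinese. Consider "可以看到占用最高的会话列表" ("lets you see the list of sessions with the highest usage") or "可查看占用最高的会话" ("shows the sessions with the highest usage").

📝 Suggested wording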
-- `top_sessions` 可以看到占用最高的会话。
+- `top_sessions` 可以看到占用最高的会话列表。

-- `top_sessions` 可以看到占用最高的会话。
+- `top_sessions` 可查看占用最高的会话。
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

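In `@docs/memory_diagnostics_guide.md` at line 167, the sentence describing `top_sessions` is not fluent enough; replace the current “`top_sessions` 可以看到占用最高的会话。” with a more natural wording such as “`top_sessions` 可查看占用最高的会话” or “`top_sessions` 可以看到占用最高的会话列表”, to improve readability and keep the reference to the identifier `top_sessions` consistent; only this sentence in docs/memory_diagnostics_guide.md needs to be updated.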
🧹 Nitpick comments (2)
src/services/memory_diagnostics_service.py (2)

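1160-1173: Operational note: the interesting-task keyword list needs to be updated as the system evolves.

The currently hardcoded keyword list (learn, description, heartflow, maisaka, memory, websocket, reply_effect, embedding) is used to filter "tasks worth watching". If important subsystems are added in the future (such as new automation services or plugins), the list may need to be updated to keep diagnostic coverage complete.

When adding core async tasks, consider whether the related keywords should be added to this list.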

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/services/memory_diagnostics_service.py` around lines 1160 - 1173, The
hardcoded keyword list in _is_interesting_task should be made configurable so
operators can update monitored task keywords without changing code; extract the
tuple currently in _is_interesting_task into a centralized, editable source
(e.g., a settings constant or config loader) and have _is_interesting_task read
that list at runtime (fallback to the current defaults if config absent);
reference the function name _is_interesting_task and the current keywords
(learn, description, heartflow, maisaka, memory, websocket, reply_effect,
embedding) so you replace the inline tuple with a call or import (e.g., from
MEMORY_DIAGNOSTICS_KEYWORDS or load_interesting_task_keywords()) that reads from
application config / environment / feature flags and document the config key for
operators to update.
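
One way to read this suggestion is sketched below, with the keyword list coming from an environment variable and falling back to the current defaults. The names `MEMORY_DIAGNOSTICS_KEYWORDS` and `load_interesting_task_keywords` are the examples the comment itself proposes; the comma-separated format and the substring match on a task name are assumptions about how `_is_interesting_task` works, not confirmed behaviour.

```python
import os

DEFAULT_INTERESTING_KEYWORDS = (
    "learn", "description", "heartflow", "maisaka",
    "memory", "websocket", "reply_effect", "embedding",
)


def load_interesting_task_keywords(env_var: str = "MEMORY_DIAGNOSTICS_KEYWORDS") -> tuple:
    """Read a comma-separated keyword list from the environment, else use the defaults."""
    raw = os.environ.get(env_var, "")
    keywords = tuple(part.strip().lower() for part in raw.split(",") if part.strip())
    return keywords or DEFAULT_INTERESTING_KEYWORDS


def is_interesting_task(task_name: str, keywords=None) -> bool:
    """Return True if any active keyword occurs in the (lower-cased) task name."""
    active = keywords if keywords is not None else load_interesting_task_keywords()
    lowered = task_name.lower()
    return any(keyword in lowered for keyword in active)
```

In the project itself the list would more plausibly live next to the other `[debug]` options than in an environment variable; the fallback-to-defaults shape stays the same either way.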

341-352: 💤 Low value

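Optional: remove the unused return value.

The method is declared to return int and returns scan_count at Line 352, but the call site (Line 245) does not use the return value. Consider returning None so the type signature matches actual usage.

♻️ Optional type signature adjustment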
-    def _fill_runtime_binary_metrics(self, item: dict[str, Any], runtime: Any, scan_budget: int) -> int:
+    def _fill_runtime_binary_metrics(self, item: dict[str, Any], runtime: Any, scan_budget: int) -> None:
         message_cache = getattr(runtime, "message_cache", []) or []
         scan_count = min(len(message_cache), max(0, scan_budget))
         binary_summary = _estimate_messages_binary(_iter_spread(message_cache, scan_count))
@@ -349,7 +349,6 @@
         item["binary_scan_strategy"] = "spread"
         item["binary_scan_skipped_messages"] = max(0, len(message_cache) - scan_count)
         item["binary_lower_bound"] = bool(item["binary_scan_truncated"])
-        return scan_count
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

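In `@src/services/memory_diagnostics_service.py` around lines 341 - 352, change the signature and implementation of _fill_runtime_binary_metrics from returning int to returning None: change the return type annotation from int to None (or drop the return type from the annotation), and delete the final `return scan_count`, because callers do not use the return value; keep the internal logic (the scan_count calculation, the item updates, and so on) unchanged so behavior stays the same. Referenced symbols: the function `_fill_runtime_binary_metrics`, the local variable `scan_count`, and the dict `item`.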

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f63a2a59-9293-4cfc-815d-60283d01844e

📥 Commits

Reviewing files that changed from the base of the PR and between f230df2 and 6282f10.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
  • docs/memory_diagnostics_guide.md
  • pyproject.toml
  • requirements.txt
  • src/config/config.py
  • src/config/official_configs.py
  • src/main.py
  • src/services/memory_diagnostics_service.py
  • tests/test_memory_diagnostics_service.py
🚧 Files skipped from review as they are similar to previous changes (5)
  • pyproject.toml
  • src/main.py
  • requirements.txt
  • src/config/official_configs.py
  • tests/test_memory_diagnostics_service.py

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/test_memory_diagnostics_service.py (1)

293-299: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

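Do not lock in plaintext retention of sensitive arguments as a test expectation.

Treating --token=secret-value verbatim as the correct result locks in a potential credential leak (it could be exposed both when the JSONL is written to disk and during the issue-reporting flow). Consider changing the expectation to the redacted argument value and adjusting the implementation accordingly.

🔧 Suggested change (example)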
 def test_safe_process_cmdline_keeps_raw_arguments_with_limit() -> None:
     process = SimpleNamespace(cmdline=lambda: ["python.exe", "--token=secret-value", "D:\\private\\script.py"])

     cmdline = diagnostics._safe_process_cmdline(process)

-    assert cmdline == ["python.exe", "--token=secret-value", "D:\\private\\script.py"]
+    assert cmdline == ["python.exe", "--token=***", "D:\\private\\script.py"]
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_memory_diagnostics_service.py` around lines 293 - 299, The test
test_safe_process_cmdline_keeps_raw_arguments_with_limit currently asserts the
raw sensitive flag "--token=secret-value"; update the expectation to a sanitized
token value (e.g. "--token=<redacted>" or masked form) and adjust the
implementation in diagnostics._safe_process_cmdline to produce that sanitized
form for sensitive flags/keys (tokens, passwords, keys) instead of returning
secrets verbatim; locate and update the test assertion and ensure
diagnostics._safe_process_cmdline's masking rule covers both "--key=value" and
separate "--key", "value" patterns so the test and function remain consistent.
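
To make the two argument shapes named in this prompt concrete, here is a small sketch of flag-aware masking over an argv list. It is hypothetical and not the project's `_safe_process_cmdline`: `mask_cmdline`, `SENSITIVE_KEYS`, and the `***` placeholder are illustrative choices.

```python
SENSITIVE_KEYS = ("token", "password", "secret", "api-key", "api_key")


def _is_sensitive_flag(flag: str) -> bool:
    """Return True if the flag name (without leading dashes) contains a sensitive key."""
    name = flag.lstrip("-").lower()
    return any(key in name for key in SENSITIVE_KEYS)


def mask_cmdline(cmdline: list) -> list:
    """Mask values of sensitive flags given as --key=value or as --key followed by value."""
    masked = []
    mask_next = False
    for argument in cmdline:
        if mask_next:
            mask_next = False
            if not argument.startswith("-"):
                masked.append("***")  # value that followed a sensitive flag
                continue
        flag, separator, _value = argument.partition("=")
        if flag.startswith("-") and _is_sensitive_flag(flag):
            if separator:
                masked.append(f"{flag}=***")
            else:
                masked.append(argument)
                mask_next = True  # the next non-flag argument is the secret
        else:
            masked.append(argument)
    return masked
```

With a rule like this, the assertion in the suggested diff would hold for `--token=secret-value`, and a command such as `app --password hunter2` would be recorded as `app --password ***`.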

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f57d240d-2b6e-4cae-a1b3-5e1da12f8ff3

📥 Commits

Reviewing files that changed from the base of the PR and between 6282f10 and 47f32e4.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
  • docs/memory_diagnostics_guide.md
  • pyproject.toml
  • requirements.txt
  • src/config/config.py
  • src/config/official_configs.py
  • src/main.py
  • src/services/memory_diagnostics_service.py
  • tests/test_memory_diagnostics_service.py
🚧 Files skipped from review as they are similar to previous changes (6)
  • requirements.txt
  • pyproject.toml
  • src/main.py
  • src/config/config.py
  • src/config/official_configs.py
  • src/services/memory_diagnostics_service.py
