伴学 Batch A 性能优化与稳定性加固#1598
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
Walkthrough此 PR 将 StudyCompanion 的事件总线改为有界队列+单 worker,增加读写分离的线程本地只读连接与批量答案事务,KnowledgeTracker 引入内存主题索引与批量候选写入,OCR 管道加入线程池并行处理,插件生命周期完善资源清理与 review_due future 管理,并大幅扩展单元测试覆盖喵。 Changes事件总线队列化改造
数据库连接分层与批量写入
只读查询迁移
知识追踪批量写入
OCR 管道线程池与并行处理
插件生命周期与 review_due 管理
数据安全与载荷拷贝
单元测试覆盖扩展
🎯 Estimated Code Review Effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly Related PRs
Poem
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 7
🧹 Nitpick comments (5)
plugin/tests/unit/plugins/test_study_event_bus.py (1)
584-613: 💤 Low value测试逻辑正确,覆盖了 backlog 满时丢弃最旧事件的边界情况喵~
不过第 609 行访问了
bus._queue._queue内部 deque,这依赖 CPython 实现细节喵。虽然在单元测试里这样写可以接受,但如果将来 asyncio.Queue 内部实现变了可能会脆弱喵。🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugin/tests/unit/plugins/test_study_event_bus.py` around lines 584 - 613, The test currently peeks into the private deque via bus._queue._queue which relies on CPython internals; instead, drain the queue using public APIs then restore it to preserve state: loop while not bus._queue.empty() and call bus._queue.get_nowait() to collect items into a list, perform your assertions on that list, then put the items back into bus._queue in the same order with put_nowait so the worker state is unchanged; update test_schedule_emit_drops_when_backlog_is_full to use this drain-and-restore approach rather than accessing bus._queue._queue directly.plugin/plugins/study_companion/service.py (1)
33-72: ⚡ Quick win
json_copy与deepcopy的行为差异比预期小,但策略不一致需整理喵~检查发现
json_copy并不是真的 JSON 往返(dumps→loads),只是递归复制 dict/list/tuple、规范化 dict 键为 str、保留其他类型不变。所以它并不会把 datetime 或自定义对象强制转成字符串喵。从
StudyState类型标注看,这些状态字段都是list/dict/float/str之类的 JSON 友好类型,不太可能在运行时塞进 tuple 或非 str 的 dict 键。因此从json_copy换成deepcopy实际风险比初期担心的小得多呢~不过 Line 200 的
build_tutor_payload还在用json_copy,而build_status_payload改用deepcopy了,这策略不一致有点别扭喵。建议两边对齐一下,要么都改成deepcopy,要么一起用json_copy,这样更清晰也更易维护呢~🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugin/plugins/study_companion/service.py` around lines 33 - 72, The diff shows build_status_payload using copy.deepcopy while build_tutor_payload still uses json_copy, causing inconsistent copying semantics; pick one strategy and apply it consistently. Decide whether to prefer deepcopy or json_copy (given StudyState fields are JSON-friendly deepcopy is acceptable), then replace calls to json_copy in build_tutor_payload with copy.deepcopy (or vice versa replace deepcopy uses with json_copy) and update imports/usages accordingly (look for functions build_status_payload, build_tutor_payload, and the json_copy symbol) so both payload builders use the same copy function across the file.plugin/tests/unit/plugins/test_study_companion_knowledge_tracker.py (1)
295-338: 💤 Low value这里和
_store帮手重复造轮子了喵~为了塞一个自定义
logger,你把 seed 路径拼接和StudyStore构造又抄了一遍(第 300-308 行),跟_store(第 36-46 行)几乎一模一样喵。让_store接受可选logger参数就能复用啦,笨蛋别嫌麻烦喵~♻️ 让
_store接受 logger 的小改造喵-def _store(tmp_path: Path) -> StudyStore: +def _store(tmp_path: Path, logger: object | None = None) -> StudyStore: seed = ( Path(__file__).resolve().parents[3] / "plugins" / "study_companion" / "static" / "knowledge_graph_seed.json" ) - store = StudyStore(tmp_path / "study.db", tmp_path / "seed.json", _Logger(), seed) + store = StudyStore( + tmp_path / "study.db", tmp_path / "seed.json", logger or _Logger(), seed + ) store.open() return store然后这个测试里直接
store = _store(tmp_path, logger)就好了喵。🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugin/tests/unit/plugins/test_study_companion_knowledge_tracker.py` around lines 295 - 338, The test test_study_store_batch_write_answer_data_keeps_answer_when_candidates_fail duplicates seed path construction and StudyStore instantiation that already exists in the helper function _store; modify _store to accept an optional logger parameter (e.g., def _store(tmp_path: Path, logger: Optional[Logger]=None) or similar) and use it when constructing StudyStore, then update this test to call store = _store(tmp_path, logger) instead of reassembling the seed path and calling StudyStore directly; ensure the unique symbols referenced are _store and StudyStore so callers keep the same behavior while allowing injection of the custom logger.plugin/tests/unit/plugins/test_study_companion.py (1)
3884-3895: ⚡ Quick win把 worker 任务对象的终态也断言出来喵。
这里只校验
bus._worker_task is None,如果shutdown()以后只是把字段清空、但没有真的等这个task结束,这条回归测试还是会误过喵。上面已经拿到了task,这里最好再补一条assert task.done(),必要的话再细化到cancelled()/done(),这样才能把“worker 确实停掉了”锁死喵。可参考的小改动喵
await plugin.shutdown() assert bus._worker_task is None + assert task.done()🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugin/tests/unit/plugins/test_study_companion.py` around lines 3884 - 3895, 在测试中除了断言 bus._worker_task 为 None 之外,还需要断言之前保存的 task 对象已经结束以确保 worker 实际停止;在现有代码片段里对变量 task(由 bus.schedule_emit 返回的任务)添加断言例如 assert task.done(),如需更严格可改为断言 task.cancelled() 或同时检查 done() && not cancelled(),以确保 plugin.shutdown() 等待并终止实际的 worker 任务(引用符号:task, bus._worker_task, plugin.shutdown)。plugin/plugins/study_companion/knowledge_tracker.py (1)
457-479: ⚡ Quick win批量能力探测和实际调用面不一致喵。
_can_batch_answer_data()现在只检查_supports_batch_answer和batch_write_answer_data,但批量分支马上还会调用answer_write_lock()、load_answer_write_state()。只要某个 store 或测试桩只实现了前两项,这里就会在 fallback 之前直接AttributeError掉喵。建议把这两个方法也纳入 gate,或者缺失时直接退回 legacy 喵。Also applies to: 608-611
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugin/plugins/study_companion/knowledge_tracker.py` around lines 457 - 479, The batch-capability gate in _can_batch_answer_data is incomplete: it only checks _supports_batch_answer and batch_write_answer_data but the batch path also expects store.answer_write_lock() and load_answer_write_state(), which can cause AttributeError before falling back; update _can_batch_answer_data to also require answer_write_lock and load_answer_write_state on the store (and any test stubs) or, if those methods are missing, ensure the call sites that invoke answer_write_lock() and load_answer_write_state() (the branch that calls _on_answer_batch from the code around _on_answer_batch/_on_answer_legacy and the similar logic at lines ~608-611) immediately fall back to _on_answer_legacy instead of attempting the batch flow.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@plugin/plugins/study_companion/__init__.py`:
- Around line 405-409: The call to close_ocr() on self._ocr_pipeline can raise
and currently isn't caught, which may prevent clearing self._ocr_pipeline and
subsequent shutdown steps; wrap the close_ocr() invocation in a try/except
(mirroring _cleanup_after_failed_startup) to catch any Exception, log the error
(e.g., self.logger.exception or self.logger.error with the exception), and
ensure self._ocr_pipeline is set to None and remaining
shutdown/state-save/store-close logic runs (use finally or set the attribute
after the try/except) so the plugin state remains consistent.
In `@plugin/plugins/study_companion/_event_bus.py`:
- Around line 180-203: The worker currently does an immediate return when
_worker_failure_count >= _MAX_WORKER_FAILURES which leaves queued events and
_scheduled_emit_count stuck; instead of returning directly from the inner loop,
replace the immediate return with logic that preserves/backfills the backlog:
when should_stop is true, if self._scheduled_emit_count > 0 call
self.schedule_emit() (so queued emits will trigger a fresh worker) and then
break out of the loop (allowing the outer finally to clear self._worker_task),
ensuring you do not silently drop or leave events; reference symbols:
_worker_failure_count, _MAX_WORKER_FAILURES, should_stop, _scheduled_emit_count,
schedule_emit(), _safe_task_done, _worker_task.
In `@plugin/plugins/study_companion/knowledge_tracker.py`:
- Around line 263-282: The cached index (_ensure_index) is truncated to
_TOPIC_INDEX_LIMIT and _topic_name_index/_topic_id_index can't be treated as the
full set; modify discover_candidate() and upsert_candidate() to, when the index
was truncated (detect via len(topics) >= _TOPIC_INDEX_LIMIT or count_topics() >
_TOPIC_INDEX_LIMIT), perform an exact store-side lookup instead of relying only
on _topic_name_index: call the store APIs (e.g., list_topics with filters or a
get_topic_by_name/get_topic_by_id if available) to verify existence by name or
id, and use that authoritative result for matching (also apply the same change
for full-text / first-line fallback logic so it queries the store when the index
is truncated to avoid creating duplicate candidate records).
In `@plugin/plugins/study_companion/store_knowledge.py`:
- Around line 303-319: candidate_status_counts currently runs two separate
SELECTs (grouped counts and total) outside of _lock which can yield inconsistent
snapshots; fix by computing total from the grouped result or by issuing a single
aggregated SQL so both values come from the same snapshot. Specifically, inside
candidate_status_counts use the rows returned by the first query on
candidate_knowledge_items (via _require_read_conn().execute(...).fetchall()) to
derive total (sum of COUNT values) or replace the two queries with one query
that returns grouped counts plus the overall count in one result; ensure the
computation remains protected by the existing _lock usage in this class so the
returned by_status/by_type and total are consistent.
In `@plugin/plugins/study_companion/store.py`:
- Around line 209-227: The PRAGMA journal_mode=WAL call in
_configure_connection_journal currently ignores the returned journal mode string
(SQLite often returns "delete" instead of raising), so change
_configure_connection_journal to capture and return the PRAGMA response (actual
mode string or a boolean like is_wal), log the returned value on both success
and fallback, and ensure it attempts the DELETE fallback only if WAL wasn't
returned; then update _require_read_conn to check the returned mode from
_configure_connection_journal and disable the independent read connection (or
fall back to the locked read path) when the mode is not "wal", using the
existing _log_warning for any non-wal cases.
In `@plugin/plugins/study_companion/study_ocr_pipeline.py`:
- Around line 361-363: The current _calculate_thumbnail_phash(image:
Image.Image) returns an empty string when the optional imagehash dependency is
missing, which the caller then treats as a real phash and flags every frame as
changed; change the function to return None (adjust signature to ->
Optional[str]) when imagehash is unavailable, and update caller logic that reads
its return value (the code that computes has_content_change) to treat None as
"detection disabled" (skip phash comparison and do not set
has_content_change=True). Ensure all uses of _calculate_thumbnail_phash handle
the Optional return safely.
- Around line 116-121: The current timeout handling in capture_lightweight
(where jpeg_future.result(timeout=3.0) / ocr_future.result(timeout=5.0) leads to
jpeg_future.cancel()/ocr_future.cancel()) and close() (which only does
executor.shutdown(wait=False)) doesn't stop already-started worker tasks and
leaves self._executor reusable, causing resource contention; update the flow so
that on any timeout or cancel path you mark the current ThreadPoolExecutor
(self._executor) as unusable and replace it with a fresh executor (set
self._executor = None or create a new ThreadPoolExecutor) to isolate background
tasks, and/or add an interrupt mechanism to worker functions
(_encode_lightweight_jpeg and _extract_image) by accepting a stop_event or
periodic timeout checks so long-running tasks can exit early; ensure close()
also marks self._executor unusable (shutdown then set to None) and
capture_lightweight recreates or lazily creates a new executor before submitting
new futures.
---
Nitpick comments:
In `@plugin/plugins/study_companion/knowledge_tracker.py`:
- Around line 457-479: The batch-capability gate in _can_batch_answer_data is
incomplete: it only checks _supports_batch_answer and batch_write_answer_data
but the batch path also expects store.answer_write_lock() and
load_answer_write_state(), which can cause AttributeError before falling back;
update _can_batch_answer_data to also require answer_write_lock and
load_answer_write_state on the store (and any test stubs) or, if those methods
are missing, ensure the call sites that invoke answer_write_lock() and
load_answer_write_state() (the branch that calls _on_answer_batch from the code
around _on_answer_batch/_on_answer_legacy and the similar logic at lines
~608-611) immediately fall back to _on_answer_legacy instead of attempting the
batch flow.
In `@plugin/plugins/study_companion/service.py`:
- Around line 33-72: The diff shows build_status_payload using copy.deepcopy
while build_tutor_payload still uses json_copy, causing inconsistent copying
semantics; pick one strategy and apply it consistently. Decide whether to prefer
deepcopy or json_copy (given StudyState fields are JSON-friendly deepcopy is
acceptable), then replace calls to json_copy in build_tutor_payload with
copy.deepcopy (or vice versa replace deepcopy uses with json_copy) and update
imports/usages accordingly (look for functions build_status_payload,
build_tutor_payload, and the json_copy symbol) so both payload builders use the
same copy function across the file.
In `@plugin/tests/unit/plugins/test_study_companion_knowledge_tracker.py`:
- Around line 295-338: The test
test_study_store_batch_write_answer_data_keeps_answer_when_candidates_fail
duplicates seed path construction and StudyStore instantiation that already
exists in the helper function _store; modify _store to accept an optional logger
parameter (e.g., def _store(tmp_path: Path, logger: Optional[Logger]=None) or
similar) and use it when constructing StudyStore, then update this test to call
store = _store(tmp_path, logger) instead of reassembling the seed path and
calling StudyStore directly; ensure the unique symbols referenced are _store and
StudyStore so callers keep the same behavior while allowing injection of the
custom logger.
In `@plugin/tests/unit/plugins/test_study_companion.py`:
- Around line 3884-3895: 在测试中除了断言 bus._worker_task 为 None 之外,还需要断言之前保存的 task
对象已经结束以确保 worker 实际停止;在现有代码片段里对变量 task(由 bus.schedule_emit 返回的任务)添加断言例如 assert
task.done(),如需更严格可改为断言 task.cancelled() 或同时检查 done() && not cancelled(),以确保
plugin.shutdown() 等待并终止实际的 worker 任务(引用符号:task, bus._worker_task,
plugin.shutdown)。
In `@plugin/tests/unit/plugins/test_study_event_bus.py`:
- Around line 584-613: The test currently peeks into the private deque via
bus._queue._queue which relies on CPython internals; instead, drain the queue
using public APIs then restore it to preserve state: loop while not
bus._queue.empty() and call bus._queue.get_nowait() to collect items into a
list, perform your assertions on that list, then put the items back into
bus._queue in the same order with put_nowait so the worker state is unchanged;
update test_schedule_emit_drops_when_backlog_is_full to use this
drain-and-restore approach rather than accessing bus._queue._queue directly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro Plus
Run ID: cd0b6919-d7e7-492d-a801-7411d3a302a4
📒 Files selected for processing (18)
plugin/plugins/study_companion/__init__.pyplugin/plugins/study_companion/_event_bus.pyplugin/plugins/study_companion/entry_communication_review_events.pyplugin/plugins/study_companion/knowledge_tracker.pyplugin/plugins/study_companion/memory_schema.pyplugin/plugins/study_companion/service.pyplugin/plugins/study_companion/store.pyplugin/plugins/study_companion/store_fsrs.pyplugin/plugins/study_companion/store_knowledge.pyplugin/plugins/study_companion/store_qa.pyplugin/plugins/study_companion/store_topics.pyplugin/plugins/study_companion/study_ocr_pipeline.pyplugin/tests/unit/plugins/test_study_companion.pyplugin/tests/unit/plugins/test_study_companion_knowledge_tracker.pyplugin/tests/unit/plugins/test_study_companion_memory_schema.pyplugin/tests/unit/plugins/test_study_companion_service_ui_api.pyplugin/tests/unit/plugins/test_study_companion_study_ocr_pipeline.pyplugin/tests/unit/plugins/test_study_event_bus.py
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
plugin/plugins/study_companion/study_ocr_pipeline.py (1)
261-284:⚠️ Potential issue | 🟠 Major | ⚡ Quick win同步 JPEG 回退这里会把编码异常直接抛给调用方喵
jpeg_future超时/失败后会走 Line 280-284 的同步_encode_lightweight_jpeg(),但这段不在try/except里喵。_encode_lightweight_jpeg()自己在 Line 368-370 会抛RuntimeError,所以这里不会像前面的分支那样返回LightweightSnapshot(status="capture_failed"),而是直接把异常冒出去喵。既然这个方法整体在做“失败降级”,这里也需要同样的兜底喵。🐾 可以这样补一个本地兜底喵
- if jpeg_bytes is None: - jpeg_bytes = self._encode_lightweight_jpeg( - thumbnail, - max_bytes=self._config.awareness.image_max_bytes, - ) + if jpeg_bytes is None: + try: + jpeg_bytes = self._encode_lightweight_jpeg( + thumbnail, + max_bytes=self._config.awareness.image_max_bytes, + ) + except Exception as exc: + return LightweightSnapshot( + status="capture_failed", + captured_at=captured_at, + diagnostic=f"jpeg encode failed: {exc}", + window_title=window_title, + app_type=app_type, + thumbnail_phash=thumbnail_phash, + has_content_change=has_content_change, + )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugin/plugins/study_companion/study_ocr_pipeline.py` around lines 261 - 284, The fallback path that calls _encode_lightweight_jpeg(thumbnail, ...) when jpeg_bytes is None can raise RuntimeError and currently escapes the surrounding failure-handling; wrap that call in a try/except and on exception create and return the same kind of capture-failed LightweightSnapshot (or set jpeg_bytes to None and construct LightweightSnapshot(status="capture_failed", ...) consistent with the ocr failure branch), while also calling jpeg_future.cancel()/_retire_executor(executor) as done elsewhere; update the block that currently sets jpeg_bytes to instead handle exceptions from _encode_lightweight_jpeg and produce the appropriate LightweightSnapshot or error diagnostic.
♻️ Duplicate comments (1)
plugin/plugins/study_companion/knowledge_tracker.py (1)
396-411:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift索引截断后的正文匹配还是会把旧 topic 裂成新 candidate 喵
这次补的是
normalized和first的精确回查喵,但 Line 400-402 这条“正文里包含 topic 名”的路径还是只扫被截断的_topic_name_index喵。也就是说,当真实 topic 落在_TOPIC_INDEX_LIMIT之外时,像"... Late Topic ..."这种输入依旧会落到后面的upsert_candidate(),继续把已存在 topic 误写成新 candidate 喵。这个根因和之前提过的是同一个,只是现在还剩 substring 匹配这条漏网路径喵。截断模式下这里要么补 store 侧的名称检索,要么干脆禁用这种基于部分索引的 substring 自动发现喵。🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugin/plugins/study_companion/knowledge_tracker.py` around lines 396 - 411, The substring match over _topic_name_index (the for loop that checks "if name and name in normalized") still runs when self._index_truncated, causing existing topics beyond _TOPIC_INDEX_LIMIT to be treated as new candidates; update this path to either (a) bypass substring-based discovery when self._index_truncated (i.e. skip the name-in-normalized check) or (b) perform a store-side resolution via self._resolve_store_topic(name or normalized substring) before returning topic_id; ensure the change references _index_truncated, _topic_name_index, _resolve_store_topic and prevents upsert_candidate from being invoked for already-existing topics when the index is truncated.
🧹 Nitpick comments (1)
plugin/tests/unit/plugins/test_study_companion.py (1)
3951-3977: ⚡ Quick winOCR pipeline 清理失败时的优雅降级测试很赞喵~
这个测试验证了关闭失败时的容错行为喵,确保 shutdown() 不会因为 OCR pipeline 清理失败而整体失败喵。符合 PR 目标中提到的"降级优雅并记录诊断信息"喵!
不过本喵有个小建议喵:可以考虑验证一下 logger.warnings 中的具体错误消息是否包含 "ocr close failed" 原始异常信息,这样能确认诊断信息的完整性喵~
💡 可选的断言增强建议喵
在 line 3975-3977 的断言后,可以增加一个验证原始异常信息是否被记录的断言喵:
assert any( "study shutdown OCR pipeline cleanup failed" in str(item[0][0]) for item in ctx.logger.warnings ) + assert any( + "ocr close failed" in str(item) + for item in ctx.logger.warnings + )这样可以确保诊断信息不仅包含框架层的消息,也包含底层的具体错误喵~
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugin/tests/unit/plugins/test_study_companion.py` around lines 3951 - 3977, Add an assertion to test_shutdown_clears_ocr_pipeline_when_close_fails that verifies the original exception text is present in the logged warning: after calling plugin.shutdown() and checking plugin._ocr_pipeline is None, scan ctx.logger.warnings for an entry that includes the substring "ocr close failed" (thrown by _FailingClosePipeline.close()) in addition to the existing "study shutdown OCR pipeline cleanup failed" check so the test confirms the original exception message is preserved in the logger output from plugin.shutdown().
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@plugin/plugins/study_companion/study_ocr_pipeline.py`:
- Around line 261-284: The fallback path that calls
_encode_lightweight_jpeg(thumbnail, ...) when jpeg_bytes is None can raise
RuntimeError and currently escapes the surrounding failure-handling; wrap that
call in a try/except and on exception create and return the same kind of
capture-failed LightweightSnapshot (or set jpeg_bytes to None and construct
LightweightSnapshot(status="capture_failed", ...) consistent with the ocr
failure branch), while also calling
jpeg_future.cancel()/_retire_executor(executor) as done elsewhere; update the
block that currently sets jpeg_bytes to instead handle exceptions from
_encode_lightweight_jpeg and produce the appropriate LightweightSnapshot or
error diagnostic.
---
Duplicate comments:
In `@plugin/plugins/study_companion/knowledge_tracker.py`:
- Around line 396-411: The substring match over _topic_name_index (the for loop
that checks "if name and name in normalized") still runs when
self._index_truncated, causing existing topics beyond _TOPIC_INDEX_LIMIT to be
treated as new candidates; update this path to either (a) bypass substring-based
discovery when self._index_truncated (i.e. skip the name-in-normalized check) or
(b) perform a store-side resolution via self._resolve_store_topic(name or
normalized substring) before returning topic_id; ensure the change references
_index_truncated, _topic_name_index, _resolve_store_topic and prevents
upsert_candidate from being invoked for already-existing topics when the index
is truncated.
---
Nitpick comments:
In `@plugin/tests/unit/plugins/test_study_companion.py`:
- Around line 3951-3977: Add an assertion to
test_shutdown_clears_ocr_pipeline_when_close_fails that verifies the original
exception text is present in the logged warning: after calling plugin.shutdown()
and checking plugin._ocr_pipeline is None, scan ctx.logger.warnings for an entry
that includes the substring "ocr close failed" (thrown by
_FailingClosePipeline.close()) in addition to the existing "study shutdown OCR
pipeline cleanup failed" check so the test confirms the original exception
message is preserved in the logger output from plugin.shutdown().
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 70f5e1e7-c227-4860-a216-8d60b3ae2b10
📒 Files selected for processing (12)
plugin/plugins/study_companion/__init__.pyplugin/plugins/study_companion/_event_bus.pyplugin/plugins/study_companion/knowledge_tracker.pyplugin/plugins/study_companion/service.pyplugin/plugins/study_companion/store.pyplugin/plugins/study_companion/store_knowledge.pyplugin/plugins/study_companion/study_ocr_pipeline.pyplugin/tests/unit/plugins/test_study_companion.pyplugin/tests/unit/plugins/test_study_companion_knowledge_quality.pyplugin/tests/unit/plugins/test_study_companion_knowledge_tracker.pyplugin/tests/unit/plugins/test_study_companion_study_ocr_pipeline.pyplugin/tests/unit/plugins/test_study_event_bus.py
🚧 Files skipped from review as they are similar to previous changes (6)
- plugin/plugins/study_companion/service.py
- plugin/plugins/study_companion/init.py
- plugin/tests/unit/plugins/test_study_companion_study_ocr_pipeline.py
- plugin/tests/unit/plugins/test_study_event_bus.py
- plugin/plugins/study_companion/_event_bus.py
- plugin/plugins/study_companion/store.py
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 654bbfa31a
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| self.open() | ||
| return self._require_conn() | ||
| conn = getattr(self._read_local, "conn", None) | ||
| if conn is None: | ||
| with self._lock: | ||
| if self._conn is None: |
There was a problem hiding this comment.
Serialize fallback reads on the writer connection
When WAL is unavailable, this branch returns the shared write connection, but the read call sites then execute queries after this method has released _lock (for example list_topics() and get_raw() call _require_read_conn().execute(...) directly). In DELETE-journal fallback environments, a background status/read can therefore use the same sqlite connection concurrently with a locked batch write, which defeats the previous serialization and can surface intermittent sqlite transaction/connection errors. Keep the lock held for fallback reads or provide a read helper that executes under _lock when _read_connections_enabled is false.
Useful? React with 👍 / 👎.
这个 PR 做了什么本 PR 完成伴学性能优化,并补齐审查中发现的稳定性问题。 核心改动包括:
为什么要这样做本pr的目标不是新增大功能,而是先把伴学链路里的高频路径稳定下来:事件派发、轻量截图、状态构建、答题写入、知识追踪这些路径都会在伴学运行时反复触发。 这批改动优先处理三类风险:
Batch B 执行前提Batch B 不应该直接叠在不稳定的 Batch A 上执行。建议满足以下前提后再开始:
|
|
Caution Failed to replace (edit) comment. This is likely due to insufficient permissions or the comment being deleted. Error details |
主要变更
验证
uv run pytest -q相关 6 个伴学测试文件:197 passeduv run pytest -q plugin/tests/unit/plugins/test_study_companion_knowledge_quality.py plugin/tests/unit/plugins/test_study_companion_knowledge_contribution.py:15 passeduv run ruff check ...:All checks passeduv run python -m compileall ...:通过git diff --check:通过detect_changes:risk low,无 affected processesSummary by CodeRabbit
发行说明
新功能
性能改进
Bug 修复
测试