
Study companion plugin: support image paste in text boxes and vision input #1603

Open
MomiJiSan wants to merge 10 commits into Project-N-E-K-O:main from MomiJiSan:feat/study-companion-image-paste


@MomiJiSan MomiJiSan commented Jun 2, 2026

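Changes

This branch keeps the image-paste capability inside the study_companion plugin: an image pasted into a study panel text box can serve as vision input for explanation, question generation, and answer evaluation.

Main changes:

  • The study panel supports pasting JPEG/PNG images, with compression, preview, removal, and busy-state protection.
  • study_generate_question and study_evaluate_answer accept a vision_image_base64 parameter.
  • A failed image paste shows an inline error in the panel instead of failing silently.
  • Image loading gains a 30 s timeout and abort cleanup, so a corrupted image cannot hang the flow.
  • When the canvas is unavailable, falling back to uploading the uncompressed original is prohibited.
  • A shared vision-payload validation helper is extracted to reduce duplicated logic across entry points.
  • study_generate_question gains an outer exception fallback that preserves the operation context.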
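
For readers who want a concrete picture of the shared validation helper, here is a minimal sketch of what an optional-image validator could look like. It is not the plugin's code: the function name `validate_optional_vision_image`, the `VisionImageError` type, the returned dict shape, and the 10 MB cap are all assumptions made for illustration. It treats an empty value as "no image", rejects input when vision is disabled, and accepts only JPEG/PNG by checking magic bytes.

```python
import base64
import binascii

# Assumption: mirrors a 10 MB backend cap; the real limit lives in the plugin.
MAX_IMAGE_BYTES = 10 * 1024 * 1024

JPEG_MAGIC = b"\xff\xd8\xff"
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


class VisionImageError(ValueError):
    """Hypothetical error type for a rejected image payload."""


def validate_optional_vision_image(raw, vision_enabled):
    """Return a normalized payload dict, or None when no image was supplied."""
    if raw is None or not str(raw).strip():
        return None  # the image is optional, so empty input is not an error
    if not vision_enabled:
        raise VisionImageError("vision input is disabled")
    text = str(raw).strip()
    # Accept bare base64 as well as a data URL ("data:image/png;base64,...").
    if text.startswith("data:"):
        _, _, text = text.partition(",")
    try:
        data = base64.b64decode(text, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise VisionImageError("image is not valid base64") from exc
    if len(data) > MAX_IMAGE_BYTES:
        raise VisionImageError("image exceeds the size limit")
    if data.startswith(JPEG_MAGIC):
        mime = "image/jpeg"
    elif data.startswith(PNG_MAGIC):
        mime = "image/png"
    else:
        raise VisionImageError("only JPEG and PNG images are supported")
    # Re-encode so callers always receive canonical, prefix-free base64.
    return {"mime": mime, "base64": base64.b64encode(data).decode("ascii")}
```

Keeping this in one helper means every entry point makes the same decision for the same payload, which is the stated motivation for extracting it.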

Verification

  • uv run pytest plugin/tests/unit/plugins/test_study_companion_vision.py -q
  • uv run pytest plugin/tests/unit/plugins/test_study_companion.py -q
  • uv run ruff check ...
  • esbuild plugin/plugins/study_companion/surfaces/study_panel.tsx
  • git diff --check
  • GitNexus detect_changes: LOW, 0 affected flows

Risk notes

The change is confined to the study_companion plugin and its plugin test files; it does not touch the main chat Composer or global paste handling.

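Summary by CodeRabbit

  • New features

    • The study panel supports pasting images/text: front-end compression, timeout and cancellation, preview and removal, error messages, and disabling while busy. A pasted image can serve as the text/answer input and is sent with the request; OCR autofill interacts with manual editing; evaluation returns an error directly when there is neither an answer nor an image.
    • The plugin interfaces add an optional image input and shared optional-image validation/normalization; multiple endpoints support the "image-only" case and return prompts in Simplified Chinese, English, or Traditional Chinese depending on the language.
  • Style

    • Adds styles for the image preview, the remove button, and paste errors, plus visual feedback for the busy state.
  • Tests

    • Extends the unit tests to cover the paste contract, image validation, multilingual image prompts, and the structured study-entry flows.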


coderabbitai Bot commented Jun 2, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 4f479cf4-7d01-423e-8da5-218d4f86a8e8

📥 Commits

Reviewing files that changed from the base of the PR and between a4883e4 and de7491a.

📒 Files selected for processing (2)
  • plugin/plugins/study_companion/static/style.css
  • plugin/plugins/study_companion/surfaces/study_panel.tsx
🚧 Files skipped from review as they are similar to previous changes (2)
  • plugin/plugins/study_companion/static/style.css
  • plugin/plugins/study_companion/surfaces/study_panel.tsx

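Walkthrough

The PR adds vision image input support to the study companion plugin: a new shared validator is wired into three entry points, the front end implements paste compression and UI binding, and the styles and unit-test coverage are extended.

Changes

End-to-end support for vision image input

| Layer | File(s) | Summary |
|---|---|---|
| Shared vision image validation function | plugin/plugins/study_companion/entry_common.py | Adds _validate_optional_vision_image_payload, which handles empty values, the LLM vision switch check, JPEG/PNG normalization, and exception conversion, and exports it via __all__. |
| Vision input integration for structured entries | plugin/plugins/study_companion/entry_tutor_answer_entries.py, plugin/plugins/study_companion/entry_tutor_explain_entries.py, plugin/plugins/study_companion/entry_tutor_question_entries.py | Wires vision_image_base64 into study_evaluate_answer, study_explain_text, and study_generate_question: updates schema/parameter reading, calls the shared validator, conditionally injects the vision payload into the study context's extra or returns an error, and includes multilingual handling of the image-only prompt plus exception wrapping. |
| Front-end paste handling and image compression | plugin/plugins/study_companion/surfaces/study_panel.tsx | Implements image loading with abort/timeout, Canvas compression with iterative resampling, and supporting constants and helper functions; returns a length-bounded JPEG base64 string or null for the paste handler to use. |
| Paste handler, state, and UI binding | plugin/plugins/study_companion/surfaces/study_panel.tsx | Implements createPasteHandler, beginPasteSignal, pasteControllerRef, and mountedRef; adds pastePending, textImage/answerImage, error display, OCR autofill interaction, image preview/removal, and post-request cleanup; injects vision_image_base64 into the explain/generateQuestion/evaluateAnswer arguments. |
| Front-end styles and paste contract verification | plugin/plugins/study_companion/static/style.css, plugin/tests/unit/plugins/test_study_companion.py | Adds styles for the image preview, the remove button, paste errors, and the busy state; adds contract tests that use source-code assertions to cover key front-end points such as paste, compression, timeout, abort, and injection. |
| Vision input unit-test coverage | plugin/tests/unit/plugins/test_study_companion_vision.py | Extends the test infrastructure (fake Agent, knowledge tracker, plugin factory) and adds several sync/async tests covering schema validation, MIME validation, vision switch behavior, exception wrapping, and the image-only prompt scenario. |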

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

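A pasted image lightly lands in the panel, meow,
Canvas compression sets the pixels hopping, meow,
Validated and normalized, into the context it goes, meow,
Front-end and back-end tests keep a steady guard, meow,
It's live, so let's all cheer together, meow.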

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and accurately summarizes the PR's core change: adding text-box image paste and vision input to the study companion plugin, consistent with the extensive code changes. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

if not source_text and not vision_image_payload:

P2 Badge Preserve text for image-only explanations

When the UI sends a pasted image with an empty text box, this new condition lets the request proceed, but study_explain_text then calls concept_explain(source_text, ...) with source_text == ""; tutor_llm_agent_concept_explain.concept_explain immediately returns the empty_input degraded reply before it attaches vision_image_base64. In the image-only paste path, the model never sees the image and the user gets an empty-input response, so pass a small prompt as source_text or teach the agent to invoke vision when only the image is present.
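
The short-circuit described above can be reproduced with a small sketch. Everything here is illustrative: `concept_explain` is a stand-in stub rather than the real agent method, and `explain` together with the `IMAGE_ONLY_PROMPT` constant are hypothetical names. The point is only that substituting a small placeholder prompt lets an image-only request get past the empty-input check.

```python
# Assumption: placeholder wording; the plugin may use a localized prompt instead.
IMAGE_ONLY_PROMPT = "Please explain the pasted image."


def concept_explain(source_text, vision_image_base64=None):
    """Stand-in agent: degrades on empty text before it ever looks at the image."""
    if not source_text.strip():
        return {"status": "empty_input"}
    return {"status": "ok", "used_image": vision_image_base64 is not None}


def explain(source_text, vision_image_base64=None):
    """Entry-point sketch that keeps image-only requests alive."""
    text = source_text.strip()
    if not text and not vision_image_base64:
        return {"status": "error", "reason": "no text and no image"}
    if not text:
        # Substitute a small prompt so the agent does not short-circuit.
        text = IMAGE_ONLY_PROMPT
    return concept_explain(text, vision_image_base64)
```

Calling the stub directly with empty text and an image yields the degraded `empty_input` reply, while going through `explain` reaches the image path.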

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

   used_ocr_fallback = bool(source_text.strip())
   source_text = source_text.strip()
-  if not source_text:
+  if not source_text and not vision_image_payload:

P2 Badge Invoke vision for image-only question generation

This now accepts vision_image_base64 without any text/OCR, but in that scenario source_text remains empty and the later call to self._agent.question_generate(source_text, ...) short-circuits in tutor_llm_agent_question_generate.question_generate to the empty_input fallback before _invoke_structured_operation can attach the image. Users pasting only a diagram/photo and clicking Generate Question therefore get the generic empty-input fallback instead of a vision-generated question.

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
plugin/plugins/study_companion/surfaces/study_panel.tsx (1)

110-110: ⚡ Quick win

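Internationalize the timeout error message, meow~

At Line 110, the timeout error message "图片加载超时" ("image load timed out") is a hard-coded Chinese string, meow. This error only appears in the developer console and is never shown to users directly, but for i18n consistency across the codebase it should also go through the translation system, meow. Other error messages are translated via t() (for example Lines 722-723), so this one should match, meow~

A developer who does not read Chinese could be confused by this message while debugging, meow.

🌸 Suggested change, meow

Consider switching the timeout message to English, since it is a developer-console message:

   const timeoutPromise = new Promise<never>((_, reject) => {
-    timeoutId = window.setTimeout(() => reject(new Error('图片加载超时')), timeoutMs);
+    timeoutId = window.setTimeout(() => reject(new Error('Image load timeout')), timeoutMs);
   });

Alternatively, if the team wants every message to support i18n, pass in an already translated message as a parameter, meow~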

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugin/plugins/study_companion/surfaces/study_panel.tsx` at line 110, Replace
the hard-coded Chinese timeout message used in the setTimeout rejection with a
translated message via the existing i18n helper (use t(...)); specifically
update the timeout callback that calls reject(new Error('图片加载超时')) so it calls
reject(new Error(t('image.load_timeout'))) or another appropriate i18n
key/value, keeping the same symbols (timeoutId, reject, timeoutMs) and ensuring
the t() function is imported/available in study_panel.tsx before use.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 52c29d29-628b-4890-b38e-34ec568bb7bf

📥 Commits

Reviewing files that changed from the base of the PR and between c3e3084 and c1468db.

📒 Files selected for processing (8)
  • plugin/plugins/study_companion/entry_common.py
  • plugin/plugins/study_companion/entry_tutor_answer_entries.py
  • plugin/plugins/study_companion/entry_tutor_explain_entries.py
  • plugin/plugins/study_companion/entry_tutor_question_entries.py
  • plugin/plugins/study_companion/static/style.css
  • plugin/plugins/study_companion/surfaces/study_panel.tsx
  • plugin/tests/unit/plugins/test_study_companion.py
  • plugin/tests/unit/plugins/test_study_companion_vision.py

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
plugin/plugins/study_companion/entry_tutor_explain_entries.py (1)

21-21: 💤 Low value

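The image-only prompt is hard-coded English, which is inconsistent with the Chinese fallback elsewhere in the same plugin, meow.

study_submit_image at L54 uses the Chinese fallback "请查看这张图片的内容" ("Please look at the content of this image"), but the newly added IMAGE_ONLY_EXPLAIN_PROMPT is fixed English: "Please explain the pasted image.". The plugin already has self._cfg.language, and for Chinese users this prompt enters the history/context as source_text, so the experience feels disjointed, meow~ Hmph, how about localizing it by language, silly author~ (Nothing breaks if you leave it; this is purely a perfectionist's reminder, meow.)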
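
One way to make the image-only prompt follow the configured language is a small lookup with a fallback, sketched below. The prompt strings are the ones quoted in this conversation; the function name `image_only_explain_prompt`, the language codes, and the rule for telling Traditional from Simplified Chinese are assumptions, since the thread does not show which values `self._cfg.language` can take.

```python
# Prompt strings as quoted in this thread; the dictionary keys are assumed codes.
IMAGE_ONLY_EXPLAIN_PROMPTS = {
    "en": "Please explain the pasted image.",
    "zh-cn": "请查看这张图片的内容",
    "zh-tw": "請查看這張圖片的內容",
}


def image_only_explain_prompt(language):
    """Pick the image-only prompt for a language code, defaulting to English."""
    code = (language or "").strip().lower().replace("_", "-")
    if code in IMAGE_ONLY_EXPLAIN_PROMPTS:
        return IMAGE_ONLY_EXPLAIN_PROMPTS[code]
    if code.startswith("zh"):
        # Assumption: Hant/TW/HK variants map to Traditional, the rest to Simplified.
        traditional = any(tag in code for tag in ("hant", "tw", "hk"))
        return IMAGE_ONLY_EXPLAIN_PROMPTS["zh-tw" if traditional else "zh-cn"]
    return IMAGE_ONLY_EXPLAIN_PROMPTS["en"]
```

Falling back to English for unknown codes keeps the helper total, so a misconfigured language never turns into an empty `source_text`.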

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugin/plugins/study_companion/entry_tutor_explain_entries.py` at line 21,
IMAGE_ONLY_EXPLAIN_PROMPT is hardcoded in English while study_submit_image uses
a Chinese fallback; make the prompt respect the plugin language setting by
replacing the constant with language-aware selection (use self._cfg.language
inside the same module/class) or derive it from the same Chinese fallback
("请查看这张图片的内容") when language is 'zh' (and a suitable English string otherwise),
and ensure IMAGE_ONLY_EXPLAIN_PROMPT (or its replacement) is used consistently
by study_submit_image and any other callers so the source_text matches the
user's language.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 28b64454-a4f0-4016-bebf-be5bb62683f2

📥 Commits

Reviewing files that changed from the base of the PR and between c1468db and 5e1e75c.

📒 Files selected for processing (5)
  • plugin/plugins/study_companion/entry_tutor_explain_entries.py
  • plugin/plugins/study_companion/entry_tutor_question_entries.py
  • plugin/plugins/study_companion/surfaces/study_panel.tsx
  • plugin/tests/unit/plugins/test_study_companion.py
  • plugin/tests/unit/plugins/test_study_companion_vision.py
✅ Files skipped from review due to trivial changes (1)
  • plugin/plugins/study_companion/surfaces/study_panel.tsx
🚧 Files skipped from review as they are similar to previous changes (2)
  • plugin/tests/unit/plugins/test_study_companion.py
  • plugin/plugins/study_companion/entry_tutor_question_entries.py

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8851217876

Comment on lines +87 to +89
if not source_text and vision_image_payload:
source_text = _image_only_question_prompt(self._cfg.language)
image_only_source = True

P2 Badge Prefer pasted image over stale OCR for question generation

When the user pastes only an image while last_ocr_text is still populated from an earlier snapshot, the earlier OCR fallback fills source_text before this image-only branch runs, so the branch is skipped and the request is labeled ocr_snapshot rather than vision_image. In that scenario, Generate Question sends stale OCR text alongside the new image instead of using the image-only prompt, which can produce questions about the previous screen; skip the OCR fallback whenever vision_image_base64 is supplied without explicit text.

Comment on lines +165 to +167
if not source_text:
source_text = _image_only_explain_prompt(self._cfg.language)
image_only_source = True

P2 Badge Prefer pasted image over stale OCR for image-only explanations

For an image-only Explain action with a previous OCR snapshot in state, source_text has already been filled from last_ocr_text before this check, so the image-only prompt is never selected and the context source remains ocr_snapshot. This makes a pasted image get explained with stale OCR text from a prior screen; the OCR fallback should only run when no pasted vision image was provided.
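
The precedence both comments ask for can be written as one small resolver, sketched here under stated assumptions. The source labels `vision_image` and `ocr_snapshot` come from the review text; the function name `resolve_source_text`, its parameters, and the `manual` and `empty` labels are hypothetical. Explicit text wins, a pasted image comes next, and the stored OCR snapshot is used only when neither is present.

```python
def resolve_source_text(manual_text, last_ocr_text, has_image, image_only_prompt):
    """Return (source_text, source_label) for an Explain or Generate request."""
    text = (manual_text or "").strip()
    if text:
        # Text the user typed always takes priority.
        return text, "manual"
    if has_image:
        # A pasted image without text must not fall back to stale OCR.
        return image_only_prompt, "vision_image"
    ocr = (last_ocr_text or "").strip()
    if ocr:
        return ocr, "ocr_snapshot"
    return "", "empty"
```

Checking the image before the OCR snapshot is what keeps a freshly pasted image from being paired with text captured from an earlier screen.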

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
plugin/plugins/study_companion/entry_tutor_explain_entries.py (1)

21-32: ⚡ Quick win

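The EN and ZH prompts differ in instruction semantics, meow~

The English version asks the model to "explain" the image, but the Chinese versions (请查看这张图片的内容 / 請查看這張圖片的內容) only ask it to "view the content". As the injected prompt for the explain feature, this points the vision model in a different direction, meow. Output style and quality can then diverge across languages, so I suggest aligning the Chinese prompts with the "explain" wording as well. No slacking, small-fry developer, meow~

♻️ Suggested: align the wording of the Chinese prompts
-IMAGE_ONLY_EXPLAIN_PROMPT_ZH_CN = "请查看这张图片的内容"
-IMAGE_ONLY_EXPLAIN_PROMPT_ZH_TW = "請查看這張圖片的內容"
+IMAGE_ONLY_EXPLAIN_PROMPT_ZH_CN = "请解释这张图片的内容"
+IMAGE_ONLY_EXPLAIN_PROMPT_ZH_TW = "請解釋這張圖片的內容"

Note: the context test test_study_explain_text_uses_prompt_for_image_only (test_study_companion_vision.py:942-952) hard-codes an assertion on 请查看这张图片的内容; if this change is adopted, update that assertion as well, meow.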

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugin/plugins/study_companion/entry_tutor_explain_entries.py` around lines
21 - 32, The Chinese prompt strings are phrased as "please view" rather than
"please explain," causing inconsistent instruction semantics; update
IMAGE_ONLY_EXPLAIN_PROMPT_ZH_CN and IMAGE_ONLY_EXPLAIN_PROMPT_ZH_TW to match the
English "explain" intent (e.g., use a phrasing like "请解释这张图片的内容" / "請解釋這張圖片的內容")
and ensure the helper _image_only_explain_prompt still returns those constants
unchanged; if you change these values, also update any tests that hard-code the
old Chinese string (the test referencing
test_study_explain_text_uses_prompt_for_image_only).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 036e655b-4638-4ffd-8d0c-69609bdca2fd

📥 Commits

Reviewing files that changed from the base of the PR and between 5e1e75c and 8851217.

📒 Files selected for processing (3)
  • plugin/plugins/study_companion/entry_tutor_explain_entries.py
  • plugin/plugins/study_companion/entry_tutor_question_entries.py
  • plugin/tests/unit/plugins/test_study_companion_vision.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • plugin/tests/unit/plugins/test_study_companion_vision.py
  • plugin/plugins/study_companion/entry_tutor_question_entries.py

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 93ca0c8c90

if (image === null) {
setters.setPasteError(setters.pasteErrorMessage);
} else {
setters.setImage(image);

P2 Badge Clear stale OCR when accepting image-only paste

When the textarea has been auto-filled from last_ocr_text by refresh() and the user then pastes only a new image, this accepts the image but leaves the old OCR text in text; the Generate/Explain handlers send both { text, vision_image_base64 }, so the backend treats the stale OCR as manual context instead of taking its image-only path. This still reproduces the stale-OCR-with-new-image behavior even though the entry methods now skip OCR fallback when the text is empty.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3df8ca2ca7

const controller = beginStudyRequest();
setBusy(true);
const evalArgs: Record<string, unknown> = { answer, question };
if (answerImage) evalArgs.vision_image_base64 = answerImage;

P2 Badge Allow image-only answers to reach evaluation

When the learner pastes an answer image but leaves the answer textarea empty, the earlier if (!answer.trim()) guard returns before this image payload is added, so the new answer-image path is unreachable for handwritten/diagram answers unless the user types filler text. Since the backend now accepts vision_image_base64 without requiring a non-empty answer, the UI should treat answerImage as satisfying the answer requirement.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5f298f1095

Comment on lines +510 to +515
setText((prev) => {
if (prev.trim() || !data.last_ocr_text) {
return prev;
}
textAutoFilledFromOcrRef.current = true;
return data.last_ocr_text;

P2 Badge Prevent OCR autofill after image submissions

When an image-only Explain/Generate request completes while last_ocr_text still contains an older snapshot, the subsequent refresh(..., { updateReply: false }) runs before the pasted image is cleared and this block repopulates the empty textarea with that stale OCR. The next action is then sent as manual text instead of the pasted-image context, reintroducing stale-OCR behavior after a successful image-only submission; skip this autofill when a text image is present or when the request just consumed an image.

Comment on lines +266 to +275
try {
const image = await compressImageForStudy(blob, signal);
if (signal.aborted || !isMounted()) {
return;
}
if (image === null) {
setters.setPasteError(setters.pasteErrorMessage);
} else {
setters.onImageAccepted?.();
setters.setImage(image);

P2 Badge Block actions until paste compression finishes

For large pasted JPEG/PNG images, compression is asynchronous and the action buttons remain enabled until setImage runs. If the user pastes an image and immediately clicks Explain/Generate/Evaluate, those handlers read textImage/answerImage while still empty and submit the request without the pasted image, so image-only pastes can fail or use stale text. Track a paste-pending state (or disable/abort actions) until this await completes and the image state is committed.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6319e2bc82

Comment on lines +531 to +532
if (textImage || prev.trim() || !data.last_ocr_text) {
return prev;

P2 Badge Use a ref when suppressing OCR autofill

If a refresh() call that was started before the user pastes an image resolves after setTextImage(image), this closure still sees the old textImage === '', so the guard lets last_ocr_text populate an otherwise empty textarea. That leaves a newly pasted image paired with stale OCR as manual text for the next Explain/Generate call; fresh evidence here is that the new autofill guard reads captured React state instead of a synchronously updated image ref.

@MomiJiSan

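This PR implements "pasting an image into a text box as vision input for the study companion" and keeps the capability inside the study_companion plugin.

What it does:

  • The study panel supports pasting JPEG/PNG images into text boxes, with compression, preview, removal, and busy-state protection.
  • The pasted image is passed as vision_image_base64 to the explain, question-generation, and answer-evaluation entry points.
  • The backend adds vision image parameter support to study_generate_question and study_evaluate_answer.
  • Image processing failures now show an inline error in the panel instead of only writing console.warn and failing silently.
  • Image loading gains timeout and abort handling, so a corrupted or oversized image cannot stall the paste flow.
  • Falling back to uploading the uncompressed original when the canvas is unavailable is prohibited, so large images do not reach the backend directly and strain memory and bandwidth.
  • A shared vision-payload validation helper is extracted, reducing duplicated validation across the explain, question, and evaluation entry points.
  • study_generate_question gains an outer exception fallback, so error responses preserve the operation context.

Why it is done this way:

  • The goal of Option C is for image paste to serve only the study companion plugin without affecting the main chat Composer or the global paste logic, so this PR does not modify the main chat input box or the global image upload flow.
  • A failed image paste must be visible to the user; otherwise the feature appears unresponsive. Inline panel messages therefore replace silent drops.
  • Compressing on the front end and restricting input to JPEG/PNG reduces wasted transfer and memory use before the backend's 10 MB limit applies.
  • The load timeout, abort cleanup, and the ban on the uncompressed fallback prevent abnormal images from hanging the UI flow or sending large images straight to the backend.
  • Shared validation keeps the three study entry points consistent in how they handle vision input and makes later maintenance safer.

Verification:

  • Plugin-related unit tests pass.
  • ruff, the TSX compile check, and git diff --check pass.
  • GitNexus change detection reports LOW, 0 affected flows.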

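Resolve the study_panel.tsx conflict: keep the PR's image-paste ref group and the data-busy busy-state attribute, while merging in the panelRef and the accessibility role/aria-label attributes newly added on main (Project-N-E-K-O#1606 Phase 9).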

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>