VideoCaptioner 2.0: workbench UI, dubbing, unified naming, core download engine #1150

Open
WEIFENG2333 wants to merge 63 commits into master from feature/dubbing-tts-settings

Conversation

@WEIFENG2333

Owner

Summary

The 2.0 branch: first-party workbench UI across all pages, a full dubbing workflow (Edge / Gemini / SiliconFlow CosyVoice with voice clone), and the supporting architecture work below.

Architecture

  • Shared config: TOML store → AppConfig dataclass → TaskBuilder; CLI and GUI consume the same canonical config through thin adapters.
  • Output naming: one grammar {stem}.{tag}.{ext} (tags: lang code / optimized / subtitled / dubbed) defined once in core/application/output_paths.py; intermediates live in per-run task directories and are cleaned on success; raw TTS segments become a content-addressed cache.
  • Download engine in core: core/download/media.py (no Qt) shared by GUI thread, CLI download, and diagnostics; one browser-cookie fallback ladder so "diagnostics says OK" matches real download behavior.
  • Thread layer: unified WorkerThread base with cooperative cancellation.
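A minimal sketch of the `{stem}.{tag}.{ext}` grammar described in the Architecture list. The function name `tagged_output_path` and its parameters are assumptions made for illustration; the real helper lives in core/application/output_paths.py and may look different.

```python
from pathlib import Path
from typing import Optional


def tagged_output_path(source: Path, tag: str, ext: str,
                       out_dir: Optional[Path] = None) -> Path:
    """Return `{stem}.{tag}.{ext}` beside `source`, or inside `out_dir` if given.

    Hypothetical helper: the name and signature are not taken from the PR.
    """
    # A tag must be a single path-safe token (a language code, "dubbed", ...).
    if not tag or any(ch in tag for ch in "./\\"):
        raise ValueError(f"invalid tag: {tag!r}")
    target_dir = out_dir if out_dir is not None else source.parent
    return target_dir / f"{source.stem}.{tag}.{ext.lstrip('.')}"


print(tagged_output_path(Path("talk.mp4"), "en", "srt"))       # talk.en.srt
print(tagged_output_path(Path("talk.mp4"), "dubbed", ".mp4"))  # talk.dubbed.mp4
```

Keeping the grammar in one function is what lets the CLI and the GUI agree on file names without either side formatting strings itself.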

Packaging

  • Wheel ships resource/fonts (default style font LXGW WenKai), declares httpx, drops unused aiohttp; verified by real wheel install into a clean venv + windowed GUI launch.
  • PyInstaller spec: console=False, stale hidden import removed; verified by a real desktop build, frozen CLI synthesis, and GUI launch.

Assets & docs

  • Voice previews re-encoded wav→mp3 (8.4MB → 2.3MB, 55 files).
  • README preview images cropped with transparent rounded corners.
  • docs/dev pruned to adopted design mocks (design-<page>.html, one per page); CONTRIBUTING added; core/realtime removed.

Test plan

  • ruff clean (repo-wide), pyright clean on videocaptioner/cli/
  • 217 tests green locally (CI scope test_cli + test_dubbing: 110)
  • uv build + wheel install smoke (CLI + windowed GUI) + PyInstaller bundle smoke
  • Real e2e: transcription, subtitle pipeline, hard/soft synthesis, Edge dubbing, YouTube download with subtitle sidecar

🤖 Generated with Claude Code

WEIFENG2333 and others added 14 commits June 5, 2026 23:07
- Output naming: one grammar ({stem}.{tag}.{ext}) shared by CLI and GUI via
  core/application/output_paths.py; intermediates live in per-run task dirs
  under work_dir/tasks/ and are cleaned on success (app.keep_intermediates
  opt-out); raw TTS segments become a content-addressed cache in CACHE_PATH.
- Download: yt-dlp engine moved to core/download/media.py (no Qt), shared by
  the GUI thread, CLI download, and diagnostics; browser-cookie fallback
  ladder lives in net.py so the source check only reports unavailable after
  the same ladder real downloads use; error messages strip ANSI, keep the
  original site error as the headline, and separate TCC permission failures
  from real rejections.
- Packaging: wheel now ships resource/fonts (LXGW WenKai default style font),
  declares httpx, drops unused aiohttp; PyInstaller spec console=False and
  stale modelscope hidden import removed; verified by real wheel install,
  frozen build, and windowed GUI launches.
- Assets/docs: voice previews re-encoded wav->mp3 (8.4MB -> 2.3MB); README
  preview images cropped with transparent rounded corners; docs/dev pruned to
  adopted design mocks renamed design-<page>.html; CONTRIBUTING added;
  core/realtime removed.
- UI: settings folder rows use a shared read-only path + open/change control.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a56600940c

Comment thread videocaptioner/ui/common/app_icons.py Outdated
@@ -0,0 +1,135 @@
from __future__ import annotations

from enum import StrEnum

P1: Replace StrEnum for Python 3.10 support

This project still declares requires-python = ">=3.10,<3.13", but enum.StrEnum is only available starting in Python 3.11. On a supported Python 3.10 install, importing videocaptioner.ui.common.app_icons raises ImportError, which prevents UI modules that use AppIcon from loading. Use a (str, Enum) subclass or raise the package's minimum Python version.

"auto",
ChoiceValidator(["auto", "first", "second"]),
)
dubbing_timing = ChoiceSettingField(

P2: Preserve the supported no-timing mode in settings

The core/CLI path accepts dubbing.timing = "none" (resolve_timing() and the CLI --timing none both support it), but the settings state only allows these three values. If a user sets none from the CLI/TOML and later opens the GUI settings, ChoiceValidator coerces it to the first option (natural), and the next settings save overwrites the config, unexpectedly re-enabling time fitting. Include the none option or avoid correcting supported config values here.

liangweifeng and others added 15 commits June 13, 2026 04:24
A bare exception escaping QThread.run aborts the whole process (qFatal).
RoundedBgPreviewThread already had the try/except guard with a comment
claiming parity with AssPreviewThread — which was in fact unguarded.
Verified by forcing the renderer to raise: thread exits cleanly, no emit.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- Rebuild the subtitle-style page's library dock and style cards:
  always-on actions (copy/rename/delete with icons), a divider and
  status pill, no description blurb; built-in cards get a duplicate
  action so they no longer look empty. Cards are exact-fit so the
  active card's border is never clipped and the dock has no blank band.
- Preview now fits the stage edge-to-edge, adapting to wide/short/
  fullscreen and capping at native resolution to avoid upscaling blur.
- Add ASS/rounded style params end-to-end: alignment, max width and
  primary/secondary line gap (style_manager, renderers, asr_data,
  entities, task_builder, video synthesis thread + page).
- New first-party color picker (subtitle preset palette + recent
  colors) wired into the inspector and settings color controls.
- New first-party AppLineEdit (single focus ring, no stuck border);
  migrate BoundLineEdit onto it.
- workbench: StepperControl, CompactButton pad_h option, AppLineEdit.
- Rename subtitle_style_controls.py -> inspector_controls.py.
- Fix sidebar subtitle icon via AppFluentIcon(AppIcon.SUBTITLE).
- config: recent_colors + subtitle preview source/target fields.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
get_prompt() fills ${target_language} via Template.safe_substitute, which
str()-ifies the value. llm_translator passed the TargetLanguage enum
member, so prompts rendered
"...specializing in TargetLanguage.SIMPLIFIED_CHINESE..." instead of the
language name, silently degrading every LLM translation
(standard/reflect/single). Pass .value so the prompt reads
"...specializing in 简体中文...".

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
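The `safe_substitute` behaviour described in the commit above can be reproduced with the standard library alone. The `TargetLanguage` class below is a one-member stand-in and the prompt text is invented for the example; only `string.Template.safe_substitute` and its `str()` conversion are real.

```python
from enum import Enum
from string import Template


class TargetLanguage(Enum):  # stand-in for the project's enum
    SIMPLIFIED_CHINESE = "简体中文"


# Illustrative prompt; the project's real prompt text is not shown in the PR.
PROMPT = Template("You are a translator specializing in ${target_language}.")

# safe_substitute calls str() on the value, and str() of a plain Enum member
# is "ClassName.MEMBER", not the member's value.
buggy = PROMPT.safe_substitute(target_language=TargetLanguage.SIMPLIFIED_CHINESE)
fixed = PROMPT.safe_substitute(target_language=TargetLanguage.SIMPLIFIED_CHINESE.value)

print(buggy)  # You are a translator specializing in TargetLanguage.SIMPLIFIED_CHINESE.
print(fixed)  # You are a translator specializing in 简体中文.
```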
- app_icons.AppIcon used enum.StrEnum (3.11+) while the project declares
  requires-python ">=3.10"; on 3.10 the import raises ImportError and every
  UI module importing AppIcon fails to load. Switch to (str, Enum) with a
  __str__ returning the value, preserving StrEnum semantics that path/cache
  lookups rely on (str(icon) -> "subtitle").
- dubbing_timing ChoiceValidator omitted "none" though core/CLI fully
  support it (resolve_timing, --timing none); a CLI/TOML-set "none" was
  coerced to "natural" on opening GUI settings and silently re-enabled time
  fitting on save. Add "none" to the validator and the settings dropdown.
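As a rough illustration of the first fix, a `(str, Enum)` class with an explicit `__str__` behaves like `StrEnum` on Python 3.10. The single `SUBTITLE` member mirrors the example in the commit message; the rest of the real `AppIcon` enum is not reproduced here.

```python
from enum import Enum


class AppIcon(str, Enum):
    """3.10-compatible replacement for enum.StrEnum (sketch, one member only)."""

    SUBTITLE = "subtitle"

    def __str__(self) -> str:
        # Without this override, str(AppIcon.SUBTITLE) is "AppIcon.SUBTITLE".
        return self.value


print(str(AppIcon.SUBTITLE))           # subtitle
print(AppIcon.SUBTITLE == "subtitle")  # True
```

The `str` mixin keeps equality and dictionary lookups working against plain strings, while the `__str__` override covers call sites that build paths or cache keys from `str(icon)`.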

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Settings dialog (open/close jank):
- Fade only the cheap window mask, not the heavy 940x680 card+shadow.
  MaskDialogBase ramped opacity on the whole dialog, re-rasterizing the
  card every frame (~9ms each) — content appeared to "load late". Card now
  pops at full opacity (timely), shadow is static so it can be softer.

Subtitle style duplication-on-launch:
- Built-in styles were force-forked with an incrementing id every time the
  selection reset to a built-in (mode round-trips dropped the per-mode
  selection), piling up default-custom-2..N each session.
- Remember selection per renderer mode (restored on ass<->rounded round
  trips) and fork built-ins to a single deterministic {builtin}-custom id
  (reused, not incremented). Adds tests/test_ui/test_subtitle_style_persistence.

Config persistence:
- Bind subtitle_preview_source/target and recent_colors (were SettingField
  + cfg.set(save=True) no-ops) so custom preview text and recent colors
  survive restart; add matching config_store DEFAULTS.

Review-found fixes:
- settings_controls/form_cards/home/llm_logs/subtitle: replace broken QSS
  numeric font-weight (Qt5 maps 700-800 to weight 88-99, over-bold) with the
  valid `bold` keyword.
- download/media: route subtitle re-download through the per-site proxy
  (YouTube needs proxy, Bilibili forced direct); fix sanitize_filename
  control-char regex ([\0-\31] octal bug missed 0x1a-0x1f).
- config: normalize llm.service aliases (siliconcloud/lmstudio) before the
  generic api_key backfill.
- dubbing_interface: remove dead metaLabel + orphan QSS rules.
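The control-character regex bug mentioned under download/media can be shown in isolation. This is a sketch of the failure mode, not the project's `sanitize_filename`; `strip_controls` is a hypothetical helper.

```python
import re

# Inside a character class, \31 is an octal escape: 0o31 == 25 == 0x19.
# The range therefore stops at 0x19 and lets 0x1a-0x1f through.
BUGGY = re.compile(r"[\0-\31]")
# Hex bounds cover the whole C0 control range 0x00-0x1f.
FIXED = re.compile(r"[\x00-\x1f]")


def strip_controls(name: str, pattern: "re.Pattern[str]") -> str:
    """Remove every character matched by `pattern` (hypothetical helper)."""
    return pattern.sub("", name)


print(repr(strip_controls("a\x1ab", BUGGY)))  # 'a\x1ab' (0x1a survives)
print(repr(strip_controls("a\x1ab", FIXED)))  # 'ab'
```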

Verified: ruff clean, compileall, 336 passed/3 skipped (isolated config),
dark/light smoke on touched pages.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…view

Transcribe source-language per interface:
- Add TRANSCRIBE_MODEL_LANGUAGES / transcribe_languages_for in core as the
  single source of truth: the B interface (必剪, Bijian) and J interface
  (剪映, Jianying) are zh/en-only cloud ASR (they ignore the language param
  entirely); the others support all languages.
- The Settings source-language (源语言) dropdown now narrows to
  {Auto-detect / Chinese / English} for B/J and resets an unsupported
  selection to AUTO on provider switch.

Independent bold for primary/secondary ASS lines:
- One `bold` flag was applied to BOTH lines while the toggle lived only under
  主字幕 (primary subtitle), which was confusing. Add AssSecondaryStyle.bold
  and a 副字幕 加粗 (secondary subtitle bold) toggle so each line is
  controlled separately. Legacy styles (no secondary bold) inherit the
  primary bold on load, so existing renders are unchanged.

Faster style preview:
- Content-addressed preview cache (core/subtitle/preview_cache.py): identical
  style+text+bg renders once; ASS<->rounded switching / repeat edits hit the
  cached PNG (~0.1ms) instead of re-running ffmpeg(~250ms)/PIL(~67ms). Bounded
  to 24 files. Also removes the rounded renderer's per-call tempfile leak.
- Leading-edge debounce in update_preview: first render after idle is immediate
  (no blank on open/switch), rapid bursts coalesce to one trailing render so
  ASS(ffmpeg) renders don't pile up.
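A content-addressed cache of this kind can be sketched as a hash over the render inputs plus a size-bounded prune. The function names and the JSON-based key are assumptions; the actual core/subtitle/preview_cache.py is not shown in the PR. The 24-file bound is the figure quoted above.

```python
import hashlib
import json
from pathlib import Path


def preview_cache_key(style: dict, text: str, background: bytes) -> str:
    """Derive a stable key from everything that affects the rendered image."""
    # sort_keys makes the key independent of dict insertion order.
    payload = json.dumps({"style": style, "text": text},
                         sort_keys=True, ensure_ascii=False)
    digest = hashlib.sha256()
    digest.update(payload.encode("utf-8"))
    digest.update(b"\0")  # separator between the JSON payload and the image bytes
    digest.update(background)
    return digest.hexdigest()


def prune(cache_dir: Path, max_files: int = 24) -> None:
    """Delete the oldest PNGs so that at most `max_files` remain."""
    files = sorted(cache_dir.glob("*.png"), key=lambda p: p.stat().st_mtime)
    for stale in files[: max(0, len(files) - max_files)]:
        stale.unlink()
```

A caller would look up `cache_dir / f"{key}.png"` first and only run the renderer on a miss.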

Preview default text changed to a neutral science/math line matching the
bundled demo background; PREVIEW_TEXT fallback now derives from the config
default (was a 3rd hardcoded copy).

Tests: transcribe language support, primary/secondary bold + legacy inherit +
json round-trip, preview cache (determinism/prune/hit). ruff + 158 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
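Complete Live Caption (实时字幕) feature, plus a batch of robustness and code-quality polish.

Live Caption:
- core/realtime: backend-agnostic protocol (TranscriptSegment/CaptionEntry),
  three backends (voxgate / fun-asr / qwen-asr), CaptionAssembler (upsert by
  seg_id + two-color rendering + async translation), session orchestration,
  recording/history/playback, macOS system audio (ScreenCaptureKit)
- UI: desktop floating overlay (standard/tall + in-window settings),
  session/history/detail views, control page
- doctor: Live Caption connectivity check (CLI + diagnostics page)

Dependencies and build:
- core/download/dependencies.py: ffmpeg/voxgate download registry + download dialog
- native/macsysaudio: ScreenCaptureKit helper (arm64+x86_64 universal);
  source and built artifact are checked in
- build_desktop.py: bundle macsysaudio

Robustness and quality:
- WS backends share base._reconnect_with_backoff (backoff + rate limit +
  failure cap); a soak test against the real API confirmed that qwen
  "Connection lost" is caused by the network, not the code
- General debug switch VC_DEBUG (core/utils/debug.py) replaces the dedicated VC_LIVE_DEBUG
- Trim comments (drop debugging anecdotes and defensive justifications);
  remove every design-mock HTML reference from the code
- AGENTS.md: update design-mock and comment conventions, backend list, file paths

Other: LLM client singleton rebuilt by fingerprint, translate factory, settings/dubbing changes, etc.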

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
work_dir reorganization:
- Task dirs now group by function under work_dir: transcribe / synthesis /
  batch / dubbing (output_paths.new_task_dir validates task_type; cleanup
  only rmtrees dirs whose parent is a known task_type). Routed by entry page.
- Live caption history moved from APPDATA into {work_dir}/live-caption;
  default_root() helper; migrate_legacy_root made recoverable and called
  once on GUI startup (main.py, not the widget ctor) so tests can't move
  real user data.

WS backend refactor:
- New WebSocketTranscriber base (ws_base.py) collapses ~50 lines duplicated
  across fun_asr / qwen_asr: WS connection state, recv loop, locked send,
  failure teardown. voxgate keeps inheriting LiveTranscriber directly so it
  never pulls websocket into its startup path (backends stay lazy).
- Unify scattered `_closed = True` teardown into _fail_close() (close ws +
  emit STOPPED), matching stop()'s terminal state; the old path leaked the
  ws fd and never told the UI it stopped.

Review fixes:
- base: accept-then-drop reconnect storms (attempt succeeds but connection
  instantly drops, so the fail counter never trips) now terminate via a
  cross-call _reconnect_streak; previously looped every 0.5s forever, never
  emitting text or an error. Covered by a new test.
- live_caption_thread.stop(): longer default wait so voxgate can flush the
  last sentence before the receive thread is joined.
- caption_overlay: parent the settings popover to the overlay so it is
  reclaimed with it instead of leaking an orphan top-level window.
- system_mac: cap helper stderr buffer with a bounded deque.
- recorder: guard against a zero-length cue when the last sentence first
  appears during stop (clicking it used to jump to the end).
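The accept-then-drop case from the review fixes can be modelled as a streak counter that survives across reconnect calls and only resets after a connection has stayed up for a while. This is a sketch under assumptions: the class name, thresholds, and method names are invented, and the real `_reconnect_streak` logic may differ.

```python
from typing import Optional


class ReconnectGuard:
    """Counts consecutive short-lived connections across reconnect calls."""

    def __init__(self, max_streak: int = 5, healthy_after: float = 10.0) -> None:
        self.max_streak = max_streak        # hypothetical limit
        self.healthy_after = healthy_after  # seconds a connection must survive
        self._streak = 0
        self._connected_at: Optional[float] = None

    def on_connected(self, now: float) -> None:
        self._connected_at = now

    def on_dropped(self, now: float) -> bool:
        """Record a drop; return True when the caller should stop reconnecting."""
        lived = now - self._connected_at if self._connected_at is not None else 0.0
        if lived >= self.healthy_after:
            self._streak = 0  # the connection was genuinely usable
        self._streak += 1
        self._connected_at = None
        return self._streak >= self.max_streak
```

Because every successful connect that drops immediately still increments the streak, a server that accepts and then closes the socket can no longer keep the loop alive indefinitely.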

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
core/hardsub: frame sampling -> change-point OCR -> dedup -> region
auto-detect -> timeline -> ASRData. Region detection and per-frame line
filtering share one "centered-segment + dominant-font" model; junk
(cards/counts/durations/translations/credits) is dropped by centering trim,
global font gate, frame-relative font gate, and a stricter single-line
centering gate. Manual ROI (GUI drag / CLI --roi) is WYSIWYG (roi_is_manual).

core/ocr: RapidOCR(onnxruntime) engine, lazy-loaded; behind optional `ocr`
extra (rapidocr/onnxruntime/rapidfuzz) so base install/startup is unaffected.

GUI: hard-subtitle extraction (硬字幕提取) page (ROI selector, result table, redo, send-to-optimize),
nav icon, smoke wiring; closeEvent stops its threads to avoid exit abort.
CLI: extract-hardsub command. Output named {stem}.hardsub.srt.

Speed (~2x): region sample 36->20, reuse the font size learned during region
detection, det_limit 768 for the extraction engine (960 kept for detection).

Incidental: shared config.find_binary (voxgate/macsysaudio/ffmpeg/deps now
share one discovery path); yt-dlp postprocessor hook for merge progress.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ant)

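Replace the defunct Qt .ts/.qm translation mechanism with key-based gettext.
Only the UI (PyQt) is translated; core and the CLI keep their current
Chinese/English strings.

- Runtime in videocaptioner/ui/i18n/ (tr/N_/init/set_language, standard-library
  gettext); main.py installs the catalog + FluentTranslator; for non-base
  languages a missing translation falls back to the zh_Hans Chinese text (the
  key is never shown).
- ~1429 keys cover 38 UI files (tr("domain.meaning")); enum dropdown labels go
  through ui/common/enum_labels.py (programmatic enum_key +
  TRANSLATABLE_ENUMS); dynamic keys (dubbing provider/voice/tag, recognition
  languages) are injected by a registry; keys passed as tr(constant) are
  marked with N_.
- Language switch = automatic restart after confirmation (no per-page retranslate).
- Toolchain scripts/i18n.py (extract/update/fill-base/translate/compile/check) +
  babel.cfg + Babel dev dependency; CI runs i18n check + tests/test_i18n.
- zh_Hans is the Chinese source of truth; en/zh_Hant are machine-translated
  (OpenAI-compatible endpoint, VC_TRANSLATE_* config).
- Delete the old resource/translations/*.ts|qm + scripts/trans-*.sh +
  translate_llm.py; also delete the unreferenced SQUARE_BUTTON_SIZE /
  apply_button_icon.
- Docs: docs/dev/i18n-workflow.md (maintenance manual), docs/dev/i18n-plan.md
  (architecture), AGENTS.md i18n section.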

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
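The desktop onedir bundle was missing the dependencies of new features
(measured: CI uv sync ran without --extra ocr, so hard-subtitle OCR was 100%
unavailable).

- build-desktop.yml: add --extra ocr to uv sync/run so
  rapidocr/onnxruntime/rapidfuzz are installed into the build environment.
- VideoCaptioner.spec: collect_data_files('rapidocr') (includes the bundled
  PP-OCR models, works offline) + collect_dynamic_libs('onnxruntime') +
  collect_submodules('yt_dlp') (~900 extractors) +
  sounddevice/_sounddevice_data/cffi/websocket/numpy hiddenimports; a _safe()
  guard keeps the base package buildable.
- smoke_desktop.py: verify that onnxruntime, the rapidocr models, and PortAudio
  are really in the bundle, closing the "not bundled but CI is green" hole.
- pyproject: fix the stale rapidocr comment (3.x ships its models; they are
  not downloaded at runtime).
- Docs docs/dev/packaging-and-update-plan.md: packaging status / gaps /
  single-file conclusion / directory design / update refactor plan.

Real packaging verification (mac onedir, 487M): all dependencies bundled;
smoke passes; the packaged exe runs synthesize hard → extract-hardsub
end-to-end OCR (burned-in Chinese subtitles recognized correctly), with no
cv2×Qt conflict. voxgate is downloaded at runtime by design (not bundled).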

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…art-install)

Rebuild the update mechanism end to end, replacing the old vc.bkfeng.top
version poll. UI is a thin shell over PyQt-free core logic.

core/update:
- manifest.py: fetch GitHub Release latest.json, pick current-platform asset
  (macOS arm64 → x64 fallback), compare versions; ghproxy mirror fallback.
- installer.py: download via core/download (sha256 verify) + extract +
  platform helper that waits for the running PID to exit, swaps the install
  dir/.app, clears macOS quarantine, and relaunches. onedir can't overwrite
  a running self, so the caller quits right after apply_update().
- downloader.py: add sha256 param (generalize _verify_sha1 → _verify_digest).
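Generalizing a SHA-1 check into a digest check, as the downloader.py bullet describes, might look roughly like this. `verify_digest` here is a hypothetical free function; the project's `_verify_digest` signature is not shown in the PR.

```python
import hashlib
from pathlib import Path


def verify_digest(path: Path, expected_hex: str, algorithm: str = "sha256",
                  chunk_size: int = 1 << 20) -> bool:
    """Hash `path` in chunks and compare against `expected_hex` (case-insensitive)."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        # Read in fixed-size chunks so large archives are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()
```

Passing the algorithm name to `hashlib.new` is what lets one code path serve both the legacy SHA-1 callers and the new SHA-256 update manifest.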

UI:
- update_banner.py: state machine 可用 (available) → 下载中 NN% (downloading) →
  重启并安装 (restart and install); failure retry; dev/non-writable installs
  degrade to "前往下载" ("Go to download", which opens the Release page).
- update_thread.py: UpdateCheckThread (startup background check) +
  UpdateDownloadThread (progress + cancel).
- main_window: startup check + mandatory-update gating + closeEvent teardown.
- setting_interface "检查更新" ("Check for updates") now drives the same flow (was: open browser).
- delete version_checker_thread.py + app.announcement/old update i18n keys.

CI / release:
- gen_update_manifest.py + build-desktop.yml manifest job: generate latest.json
  with per-platform url/sha256/size and upload to the same Release.
- i18n: add app.update.* keys (zh_Hans/zh_Hant/en), --ignore-obsolete on update.

Verified: 47 unit tests (manifest/installer/downloader sha256); real jsDelivr
download + sha256 accept/reject; real helper disk swap on dead PID; banner
state machine on live Qt; manifest↔client round-trip; i18n 3-lang render; full
ruff/compileall green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Bring back the "实时公告" (live announcement) capability lost in the update
rebuild, without reviving the dead vc.bkfeng.top server: fold it into the same latest.json
(editable anytime via `gh release upload --clobber`). Version control
(mandatory / min_supported) was already retained and is confirmed.

Announcements (core/update/manifest.py):
- fetch_manifest() → RemoteManifest{update, announcement} in one fetch;
  fetch_update() is now a thin wrapper.
- select_announcement(): enabled + content + start_date~end_date window
  (inclusive, blank side = unbounded, bad date = silent skip) + NEW
  min_version~max_version targeting (push a notice to a version range, e.g.
  nag old builds). id-keyed show-once (falls back to content hash).
- UpdateCheckThread emits announcementAvailable; main_window shows a
  ConfirmDialog once per id (dedup via version_state_cache). Announcement is
  independent of update — up-to-date users still receive it.
- gen_update_manifest.py: optional --announcement <json> to embed.

Review hardening (adversarial multi-agent review, 5 findings confirmed):
- update_banner: single-thread invariant — _teardown_dl() cancels+waits+
  disconnects the old download thread before cancel/retry/re-download, so a
  cancelled thread can't orphan and trigger "QThread destroyed while running"
  on exit or overwrite the new thread's progress. [medium]
- manifest: zero-pad version tuples before compare so (1,5)==(1,5,0) — fixes
  is_newer false-positive and hand-written 2-segment announcement targeting. [low]
- main_window: a manual "检查更新" (Check for updates) during an in-flight
  startup check now shows "正在检查更新…" (Checking for updates…) instead of a
  dead click. [low]
- main_window: mandatory update disables home/batch only when
  can_self_update() is true; non-self-updatable installs keep working
  (closable "前往下载" (Go to download) banner) instead of hitting a permanent
  dead-end. [low]
- (kept) announcement dedup is id-keyed once-ever: intentional, documented.
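The zero-padding fix in the manifest bullet addresses a property of Python tuple comparison: `(1, 5, 0) > (1, 5)` is true because the longer tuple wins on a shared prefix. A minimal sketch, with `parse_version` and `is_newer` written here as stand-ins rather than copied from core/update/manifest.py:

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse '1.5' or 'v1.5.0' into a tuple of ints (numeric segments only)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def is_newer(remote: str, local: str) -> bool:
    """True when `remote` is strictly newer than `local`."""
    r, l = parse_version(remote), parse_version(local)
    width = max(len(r), len(l))
    # Pad the shorter tuple with zeros so 1.5 and 1.5.0 compare equal.
    r += (0,) * (width - len(r))
    l += (0,) * (width - len(l))
    return r > l


print((1, 5, 0) > (1, 5))       # True: the unpadded false positive
print(is_newer("1.5.0", "1.5"))  # False
print(is_newer("1.5.1", "1.5"))  # True
```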

i18n: re-add app.announcement.title/got_it + app.update.checking (zh/zh_Hant/en).

Verified: 59 tests (announcement window/targeting/dedup/length-safe versions +
fetch_manifest); end-to-end gen_update_manifest --announcement → client parses
update+announcement, up-to-date user still gets it; banner teardown keeps state
machine intact; offscreen dialog render; full ruff/compile/i18n-check green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add proper install artifacts on top of the portable onedir zips. The zips
remain the auto-update payload; setup.exe / dmg are for first-time human
download only.

Windows (scripts/build_windows_installer.py + packaging/windows/VideoCaptioner.iss):
- Inno Setup 6, per-user install to %LOCALAPPDATA%\Programs\VideoCaptioner
  (PrivilegesRequired=lowest) — no admin prompt AND the dir stays writable, so
  the in-app rm+mv auto-update keeps working on an installed copy.
- Start Menu + optional desktop shortcut + uninstaller; stable AppId for
  in-place upgrades. Version/source/output injected via ISCC /D defines.

macOS (scripts/build_macos_dmg.py):
- ad-hoc codesign (codesign -s -) then hdiutil UDZO dmg with an /Applications
  symlink (drag-to-install). No Apple cert: ad-hoc signing only avoids the
  Apple-Silicon "app is damaged" HARD block, degrading to the recoverable
  "unidentified developer" (right-click → Open). It does NOT remove the
  Gatekeeper prompt — that needs a paid cert + notarization, out of scope.
  Auto-updated .apps clear quarantine in the helper, so only the first manual
  download needs the one-time right-click.

CI (build-desktop.yml): build installer (choco innosetup) / dmg per-OS after
smoke; upload artifacts/* (zip+exe+dmg) to the release. The manifest job still
only rglobs VideoCaptioner-*.zip, so exe/dmg never enter the update manifest.

Verified: macOS dmg built+mounted on real hardware (codesign --verify "valid on
disk", Signature=adhoc, .app + Applications symlink present); ruff/compileall
green; workflow YAML parses; adversarial multi-agent review of the .iss / build
scripts / CI wiring found 0 confirmed issues. Windows ISCC compile itself runs
only in CI (no local Windows).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
liangweifeng and others added 7 commits June 24, 2026 19:19
Surface 意见反馈 (Feedback) from the home page, not just 设置→关于 (Settings → About). A HeaderLinkButton
(message icon) sits at the top-right of the home pivot row and opens the same
FeedbackDialog. Reuses the existing feedback.title string (no new i18n key).

Verified: home smoke renders the entry aligned with the tab row; ruff/compile green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…changed

Revert the top-right pivot-row link added in 0b48a5e (top bar stays as-is) and
put the feedback entry in the home footer instead, next to 查看日志 (View logs) /
捐助 (Donate): 意见反馈 | 查看日志 | 捐助 (a FooterAction opening the same FeedbackDialog).
Reuses feedback.title (no new i18n key).

Verified: home smoke shows the footer entry with divider; top bar restored;
ruff/compile green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…secret scrubbing

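Recent logs are uploaded with feedback by default (separate multipart logs
field, silent, not mentioned in the UI); the backend Feishu table's "最近日志"
(Recent logs) column is live (verified on a real machine, VC-0030).
core/feedback/logs.py takes the tail of app.log (≤256KB) and scrubs it before
sending.

Scrubbing and contract consistency hardened after an adversarial audit (27 agents):
- scrubber gains three rule classes: base URL / endpoint, query ?key=&key=, and
  bare Google AIza keys; evidence from real logs: provider API base URL/key
  residue dropped from 3+ to 0 (public site URLs are kept to aid troubleshooting).
- platform_tag narrowed to the contract enum windows-x64/macos-x64/macos-arm64
  (dev/linux are not leaked).
- The 12MB total in models.validate now counts text fields + multipart framing
  overhead, making it a true superset of the backend limit.
- Rewrite docs/dev/feedback-api.md as a client implementation note; backend
  behavior points to vc-backend/api.md (removes the outdated early
  implementer-contract content: env endpoints / idempotency / 429 / issue, etc.).

Tests: all 25 in tests/test_feedback pass (new: logs scrubbing/collection, total counts text, platform enum).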

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…rrency≤10)

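Add a free community translation provider, "沉浸式翻译(免费)" (Immersive
Translate, free), reusing the Immersive Translate free-model gateway
(OpenAI-compatible, THUDM/GLM-4-9B-0414); users need no key:
- Each install randomly generates and persists a deviceId (mimicking the
  browser extension; the free quota is split per device, not shared); the
  deviceId is exchanged for a 30-minute JWT (cached in-process, renewed
  automatically on expiry/401), which is then used to call the chat endpoint.
- Batched JSON (dict-in/dict-out) + retry on missing keys + per-item fallback;
  concurrency capped at 10 by the factory.
- Minimal requests: the token exchange needs only the deviceId query
  parameter; chat needs only authorization: Bearer (the
  api-key/validtoken/x-imt-product-line/origin/cookie/UA headers from the
  original curl are all redundant).

Wiring: same-named TranslatorServiceEnum/TranslatorType members, factory
branch, subtitle_thread (_SERVICE_TO_TYPE + added to _NON_LLM_TRANSLATORS so no
key is required), translator_from_cli, TRANSLATOR_KEYS, CLI choices
(subtitle/process/config-init), realtime labels, i18n enum keys
(zh_Hans/en/zh_Hant translated and compiled). The Live Caption dropdown also
picks it up automatically.

Failure semantics hardened after adversarial verification (high severity):
when a whole chunk fails to translate, **raise** instead of returning the
source text disguised as a translation. Otherwise BaseTranslator's ≥50%
failure guard is defeated and the all-source chunk is cached for 7 days,
poisoning retries (429s from the free endpoint take exactly this path).
Partial failures are still retried per item, with occasional fallback to the
source text. _post_chat raises an explicit error for a 200 response with no
choices instead of a bare KeyError. Privacy check: deviceId/JWT never enter
diagnostics/logs (confirmed).

Tests: tests/test_translate/test_immersive_translator.py (JWT/token cache
renewal, batch key-guard retry, whole-chunk failure raises, partial fallback to
source, concurrency cap; real-network integration gated by env).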

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…split/optimize/translate)

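The free community model moves from a standalone translation service to an LLM
provider (selectable without a key in the LLM config), so sentence splitting,
subtitle optimization, and LLM translation can all be used for free; the
original translate-service version is deleted.

- core/llm/free_model.py: deviceId → 30-minute JWT exchange (renewed
  automatically on expiry/401); the gateway sits behind Cloudflare, so all
  traffic goes direct through curl_cffi (browser TLS fingerprint), which also
  provides the httpx transport for the OpenAI SDK.
- get_llm_client ignores the placeholder key when the free base URL matches and
  fetches a token on demand (the fingerprint includes the token, so renewal
  rebuilds the client automatically).
- When it is selected, the LLM config page hides the key/base/model rows and
  shows a "free, no API key needed" notice with a stability caveat.
- CLI --translator/--llm wired accordingly; curl_cffi becomes a regular
  dependency, and the spec collects its native libraries and certificates.

Real-machine verification: curl_cffi still returns 200 on a network where
requests/httpx get a Cloudflare 403; end-to-end subtitle translation passes.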

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…update/announcement)

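Drop the GitHub latest.json and call the self-hosted backend's
GET /api/update/check instead (backend.videocaptioner.cn, driven by a Feishu
Bitable): version blocking, latest-version selection, and the announcement
time window are all computed by the backend; the client only renders.

- core/update/client.py: one request returns block/update/announcement;
  headers carry X-App-Version/Platform/Channel + X-Client-Id (shared with
  feedback) for statistics; any failure returns None and is silently ignored.
  app_channel()=desktop/dev/pip (dev also checks, for debugging; pip is never
  served updates).
- Non-empty block = version blocked: when self-update is possible the whole app
  is locked (stackedWidget disabled) and the banner cannot be closed; dev/pip
  degrade to "前往下载" (Go to download) without locking. The old client-side
  mandatory/min_supported logic is deleted.
- CI release now registers via register_release.py POST /api/admin/release
  (needs the CI_RELEASE_TOKEN secret); gen_update_manifest.py/latest.json are
  deleted.
- Contract doc docs/dev/update-api.md; AGENTS.md updated to match.

Real-machine verification: a check against the real backend returned
block+update+announcement, which were parsed and rendered correctly.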

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
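- FolderPickerControl: wide text buttons replaced by 34px round icon buttons
  (open/change, with tooltips); the path field is narrower, so the title and
  description are no longer squeezed into wrapping by the control area; path
  text uses regular weight.
- feedback module comments trimmed, keeping only the anti-regression WHY.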

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@cursor

cursor Bot commented Jul 4, 2026

Bugbot is not enabled for this team, so this pull request was not reviewed.

Enable Bugbot in the Cursor dashboard to get automatic reviews on future PRs.

…e doesn't crash on Chinese labels

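The Windows runner console defaults to cp1252, so _check_bundled_payload
crashed with UnicodeEncodeError when printing Chinese dependency names, which
blocked the exe artifact upload. Force UTF-8 output at the top of the smoke script.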
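One way to force UTF-8 output at the top of a script is `TextIOWrapper.reconfigure`, available on the standard streams since Python 3.7. The helper name `force_utf8` is an assumption; the smoke script's actual code is not shown in the PR.

```python
import sys
from typing import Iterable, Optional, TextIO


def force_utf8(streams: Optional[Iterable[TextIO]] = None) -> None:
    """Switch text streams to UTF-8 so non-ASCII labels cannot raise on write."""
    for stream in streams if streams is not None else (sys.stdout, sys.stderr):
        # Replaced or captured streams may lack reconfigure(); skip those.
        reconfigure = getattr(stream, "reconfigure", None)
        if callable(reconfigure):
            reconfigure(encoding="utf-8", errors="replace")
```

Calling `force_utf8()` before the first `print` is enough; `errors="replace"` additionally guarantees that an unencodable character degrades to a placeholder instead of an exception.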

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@cursor

cursor Bot commented Jul 4, 2026

Bugbot is not enabled for this team, so this pull request was not reviewed.

Enable Bugbot in the Cursor dashboard to get automatic reviews on future PRs.

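onnxruntime no longer publishes Intel-mac wheels (1.24.3/1.27.0 are arm64-only
on mac), so uv sync --extra ocr always fails on macos-15-intel. Intel Macs are
discontinued, so build on macos-15 (arm64) instead: the onnxruntime arm64 wheel
is available, and _platform_tag() automatically produces the macos-arm64 name,
matching the update backend's platform key.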

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@cursor

cursor Bot commented Jul 4, 2026

Bugbot is not enabled for this team, so this pull request was not reviewed.

Enable Bugbot in the Cursor dashboard to get automatic reviews on future PRs.

WEIFENG2333 and others added 17 commits July 5, 2026 20:56
…age imports

Windows Qt default font has no CJK glyphs (falls back to SimSun); load the
bundled LXGW WenKai for every platform and replace qfluent's getFont family
table. Move init_i18n above the page-module imports: module-level constants
evaluating tr()/N_() at import time froze raw keys otherwise.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
BATCH_MODES/STAGE_SPECS evaluated tr() at import time, before init_i18n,
so the page showed raw keys (batch.mode.*). Store N_()-marked keys and
translate at use sites via stage_title(); literal keys never sit inside
tr() call args (babel's token extractor would harvest them into the .pot).
Also gate the dubbing stage behind ffmpeg+ffprobe preflight.
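The import-time freeze can be reproduced with a toy catalog. Everything here is a stand-in: `init_i18n`, `tr`, `N_`, and `stage_title` only imitate the names used in the commit, and the key `batch.mode.full` is hypothetical (the commit mentions only the `batch.mode.*` prefix).

```python
from typing import Dict

_catalog: Dict[str, str] = {}


def init_i18n(messages: Dict[str, str]) -> None:
    """Install the translation catalog (toy version)."""
    _catalog.clear()
    _catalog.update(messages)


def tr(key: str) -> str:
    """Look the key up now; an empty catalog returns the raw key."""
    return _catalog.get(key, key)


def N_(key: str) -> str:
    """Mark a key for extraction without translating it."""
    return key


# Evaluated at import time, before init_i18n: the raw key is frozen in.
EAGER_TITLE = tr("batch.mode.full")
# Stores only the key; translation is deferred to the use site.
LAZY_TITLE_KEY = N_("batch.mode.full")


def stage_title(key: str) -> str:
    return tr(key)


init_i18n({"batch.mode.full": "Full pipeline"})
print(EAGER_TITLE)                  # batch.mode.full
print(stage_title(LAZY_TITLE_KEY))  # Full pipeline
```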

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Force UTF-8 for pybabel subprocesses (GBK default cannot read babel.cfg
with Chinese comments) and for console output. Guard extract behind
Python >= 3.12: older tokenizers silently miss tr() inside f-strings
and would drop keys from the .pot.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
text=True without encoding decodes with the locale codec (GBK on Chinese
Windows); UTF-8 bytes in ffmpeg/whisper output killed the reader thread,
leaving stderr=None and crashing downstream parsing. errors=replace keeps
stray bad bytes from ever raising again.
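The decoding fix amounts to passing an explicit `encoding` and `errors` to `subprocess`. The sketch below uses `subprocess.run` for brevity, whereas the commit describes a reader thread; `run_capture` is a hypothetical wrapper.

```python
import subprocess
import sys
from typing import List


def run_capture(cmd: List[str]) -> "subprocess.CompletedProcess[str]":
    """Run `cmd` and decode its output as UTF-8, never raising on bad bytes."""
    return subprocess.run(
        cmd,
        capture_output=True,
        text=True,
        encoding="utf-8",   # do not fall back to the locale codec (e.g. GBK)
        errors="replace",   # undecodable bytes become U+FFFD instead of raising
        check=False,
    )


if __name__ == "__main__":
    # The child writes one invalid byte followed by valid UTF-8.
    child = "import sys; sys.stdout.buffer.write(b'ok \\xff\\xe4\\xb8\\xad')"
    print(repr(run_capture([sys.executable, "-c", child]).stdout))
```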

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…_info

One line-based parser (probe_media -> MediaInfo) replaces three duplicated
regex parsers (get_video_info, both renderers) and the ffprobe JSON probe in
dubbing. Fixes en route: attached-pic cover art no longer misreads audio
files as video; fps falls back to tbr (never tbn); probe failures degrade
to None (OSError + 30s timeout) instead of raising into task threads;
thumbnail extraction split out of info probing.

ffprobe stays bundled solely for pydub (dubbing reads mp3 through
AudioSegment.from_file -> mediainfo_json); app code must not call it.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…th pinned sha256

ghproxy.com died serving HTTP 200 HTML pages that got saved as .zip and
failed extraction. Replace with tested-alive mirrors (gh-proxy.com,
ghfast.top), add a validate hook to download_file so non-archive bodies
fail over to the next mirror, and verify archives before install.
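A validate hook with mirror failover could be sketched as follows. The `fetch` callable is injected so the example stays offline; `download_with_failover` and `looks_like_zip` are hypothetical names, and the project's `download_file` signature is not shown in the PR.

```python
import zipfile
from pathlib import Path
from typing import Callable, Iterable, Optional


def looks_like_zip(path: Path) -> bool:
    """Reject HTML error pages and other non-archive bodies saved as .zip."""
    return zipfile.is_zipfile(path)


def download_with_failover(
    urls: Iterable[str],
    dest: Path,
    fetch: Callable[[str, Path], None],
    validate: Callable[[Path], bool] = looks_like_zip,
) -> str:
    """Try each mirror in order; return the URL whose body passed validation."""
    last_error: Optional[Exception] = None
    for url in urls:
        try:
            fetch(url, dest)
        except OSError as exc:
            last_error = exc
            continue
        if validate(dest):
            return url
        # HTTP 200 with a bad body: discard it and move on to the next mirror.
        dest.unlink(missing_ok=True)
        last_error = ValueError(f"{url} returned a non-archive body")
    raise RuntimeError("all mirrors failed") from last_error
```

Validating the body rather than trusting the status code is the point: a dead proxy that answers 200 with an HTML page is treated the same as a connection error.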

Bump voxgate to v0.3.0 (protocol regressed against a real transcription),
pin per-platform sha256 from the release checksums, and treat an older
BIN_PATH install as not-ready so the doctor download button doubles as
the upgrade path (version probe cached by path+mtime to keep is_installed
off the GUI thread's critical path).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Classify browser-cookie read failures by real cause (database locked by a
running browser vs Chromium 127+ app-bound encryption vs macOS TCC) and
say so; macOS-only privacy guidance no longer shows on Windows. Every
final message ends with the reliable ways out (Firefox login or
cookies.txt). Keep messages to cause + one action; details go to logs.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…title bar

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
… min-height

ImageLabel sizes itself via setFixedSize, so keeping it in a layout either
pinned the page (and the whole window, via the stacked widget max) to the
preview size — an unshrinkable-window deadlock — or collapsed it to a bogus
sizeHint. Take the preview out of layout management: _fit_preview scales and
centers it manually, driven by the stage's own resized signal so internal
relayouts (not just window resizes) refit it.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
MediaThumb grows a fit="contain" mode (theme-colored letterbox, no crop)
for the result preview; small media cards keep the cover crop. Dubbing's
ffprobe preflight restored on the synthesis page (pydub needs it).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…etting

The error signal set the error page, then finished immediately reset the
'empty session' back to ready — the user saw nothing when voxgate was
missing. Latch the error state so finished leaves it alone, and point the
voxgate-missing message at the doctor page's one-click download.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The immersive provider hid the whole test row along with its config rows;
keep '测试连接' (Test connection) visible (only '加载模型' (Load models) hides, since the model is fixed) and
route the check through the real path: placeholder key swaps for a live
token and the request rides the curl_cffi transport past Cloudflare.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Diagnostic rows: fixed 78px clipped wrapped descriptions, but free-growing
rows let the container stretch them apart — min-height + vertical Maximum
policy makes height content-driven both ways. Drop the selectable flag on
row labels (with wordWrap it inflated line height under LXGW WenKai).
Download help becomes a stay-until-closed toast carrying the local
cookies.txt path; ffprobe rejoins the FFmpeg status roll-up; the hardsub
engine-missing card centers at natural size instead of filling the stage.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Sidebar/SidebarItem: expand/collapse with width animation, icons anchored
to the collapsed rail's centerline so nothing shifts between states, labels
fade with expand progress, active pill + hover from app_palette (light and
dark), tooltips only when collapsed. Pages register as selectable items,
footer actions (GitHub / settings) just fire callbacks.

MainWindow hides the qfluent NavigationInterface, mounts the sidebar in
its place, keeps stackedWidget switching in both directions (sidebar click
and programmatic switchTo), and auto-collapses under narrow windows —
manual collapse is remembered and not overridden. ui_smoke_check's
navigation check now exercises this component.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
… keys

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…otcha

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The fixed 46px inset was tuned for the old narrow rail; the expanded
sidebar (224px) covered the title text. Follow the sidebar's live width
(event filter keeps it glued during the expand/collapse animation).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@cursor

cursor Bot commented Jul 5, 2026

Bugbot is not enabled for this team, so this pull request was not reviewed.

Enable Bugbot in the Cursor dashboard to get automatic reviews on future PRs.
