VideoCaptioner 2.0: workbench UI, dubbing, unified naming, core download engine #1150
WEIFENG2333 wants to merge 63 commits into
- Output naming: one grammar ({stem}.{tag}.{ext}) shared by CLI and GUI via
core/application/output_paths.py; intermediates live in per-run task dirs
under work_dir/tasks/ and are cleaned on success (app.keep_intermediates
opt-out); raw TTS segments become a content-addressed cache in CACHE_PATH.
- Download: yt-dlp engine moved to core/download/media.py (no Qt), shared by
the GUI thread, CLI download, and diagnostics; browser-cookie fallback
ladder lives in net.py so the source check only reports unavailable after
the same ladder real downloads use; error messages strip ANSI, keep the
original site error as the headline, and separate TCC permission failures
from real rejections.
- Packaging: wheel now ships resource/fonts (LXGW WenKai default style font),
declares httpx, drops unused aiohttp; PyInstaller spec console=False and
stale modelscope hidden import removed; verified by real wheel install,
frozen build, and windowed GUI launches.
- Assets/docs: voice previews re-encoded wav->mp3 (8.4MB -> 2.3MB); README
preview images cropped with transparent rounded corners; docs/dev pruned to
adopted design mocks renamed design-<page>.html; CONTRIBUTING added;
core/realtime removed.
- UI: settings folder rows use a shared read-only path + open/change control.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
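The `{stem}.{tag}.{ext}` grammar described above can be illustrated with a minimal sketch. `build_output_path` and its parameters are hypothetical names chosen for illustration; the real helper in `core/application/output_paths.py` may look different.

```python
from __future__ import annotations

from pathlib import Path


def build_output_path(
    source: str | Path,
    tag: str,
    ext: str,
    out_dir: str | Path | None = None,
) -> Path:
    """Return <out_dir>/<stem>.<tag>.<ext> for a source media file (hypothetical helper)."""
    source = Path(source)
    tag = tag.strip(".")
    ext = ext.lstrip(".")
    if not tag or not ext:
        raise ValueError("tag and ext must be non-empty")
    # Default to writing next to the source file.
    directory = Path(out_dir) if out_dir is not None else source.parent
    return directory / f"{source.stem}.{tag}.{ext}"


print(build_output_path("/videos/talk.mp4", "dubbed", "mp4"))
print(build_output_path("talk.mp4", "en", ".srt"))
```

Keeping the grammar in one function is what lets the CLI and GUI agree on names without duplicating string formatting.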
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a56600940c
@@ -0,0 +1,135 @@
from __future__ import annotations

from enum import StrEnum
Replace StrEnum for Python 3.10 support
This project still declares requires-python = ">=3.10,<3.13", but enum.StrEnum is only available starting in Python 3.11. On a supported Python 3.10 install, importing videocaptioner.ui.common.app_icons raises ImportError, which prevents UI modules that use AppIcon from loading. Use a str, Enum subclass or raise the package's minimum Python version.
"auto",
ChoiceValidator(["auto", "first", "second"]),
)
dubbing_timing = ChoiceSettingField(
Preserve the supported no-timing mode in settings
The core/CLI path accepts dubbing.timing = "none" (resolve_timing() and the CLI --timing none both support it), but the settings state only allows these three values. If a user sets none from the CLI/TOML and later opens the GUI settings, ChoiceValidator coerces it to the first option (natural), and the next settings save overwrites the config, unexpectedly re-enabling time fitting. Include the none option or avoid correcting supported config values here.
A bare exception escaping QThread.run aborts the whole process (qFatal). RoundedBgPreviewThread already had the try/except guard with a comment claiming parity with AssPreviewThread — which was in fact unguarded. Verified by forcing the renderer to raise: thread exits cleanly, no emit. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- Rebuild the subtitle-style page's library dock and style cards: always-on actions (copy/rename/delete with icons), a divider and status pill, no description blurb; built-in cards get a duplicate action so they no longer look empty. Cards are exact-fit so the active card's border is never clipped and the dock has no blank band.
- Preview now fits the stage edge-to-edge, adapting to wide/short/fullscreen and capping at native resolution to avoid upscaling blur.
- Add ASS/rounded style params end-to-end: alignment, max width and primary/secondary line gap (style_manager, renderers, asr_data, entities, task_builder, video synthesis thread + page).
- New first-party color picker (subtitle preset palette + recent colors) wired into the inspector and settings color controls.
- New first-party AppLineEdit (single focus ring, no stuck border); migrate BoundLineEdit onto it.
- workbench: StepperControl, CompactButton pad_h option, AppLineEdit.
- Rename subtitle_style_controls.py -> inspector_controls.py.
- Fix sidebar subtitle icon via AppFluentIcon(AppIcon.SUBTITLE).
- config: recent_colors + subtitle preview source/target fields.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
get_prompt() fills ${target_language} via Template.safe_substitute, which
str()-ifies the value. llm_translator passed the TargetLanguage enum
member, so prompts rendered "...specializing in TargetLanguage.SIMPLIFIED_
CHINESE..." instead of the language name — silently degrading every LLM
translation (standard/reflect/single). Pass .value so the prompt reads
"...specializing in 简体中文...".
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
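The failure mode described in this commit is easy to reproduce with the standard library alone. The enum and template below are simplified stand-ins (assumptions for illustration), not the project's actual `TargetLanguage` definition or prompt text.

```python
from enum import Enum
from string import Template


class TargetLanguage(Enum):
    # Stand-in member; the real enum has more languages.
    SIMPLIFIED_CHINESE = "简体中文"


# Stand-in prompt; the real prompt text differs.
PROMPT = Template("You are a translator specializing in ${target_language}.")

# safe_substitute calls str() on the value, so a plain Enum member leaks its name.
buggy = PROMPT.safe_substitute(target_language=TargetLanguage.SIMPLIFIED_CHINESE)
# Passing .value yields the human-readable language name.
fixed = PROMPT.safe_substitute(target_language=TargetLanguage.SIMPLIFIED_CHINESE.value)

print(buggy)
print(fixed)
```

Because `safe_substitute` never raises on odd values, the only visible symptom is a degraded prompt, which is why the bug stayed silent.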
- app_icons.AppIcon used enum.StrEnum (3.11+) while the project declares requires-python ">=3.10"; on 3.10 the import raises ImportError and every UI module importing AppIcon fails to load. Switch to (str, Enum) with a __str__ returning the value, preserving StrEnum semantics that path/cache lookups rely on (str(icon) -> "subtitle"). - dubbing_timing ChoiceValidator omitted "none" though core/CLI fully support it (resolve_timing, --timing none); a CLI/TOML-set "none" was coerced to "natural" on opening GUI settings and silently re-enabled time fitting on save. Add "none" to the validator and the settings dropdown. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
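A minimal sketch of the `(str, Enum)` replacement for `StrEnum` on Python 3.10. The member names and the `icon_filename` helper are assumptions for illustration; only the pattern (a `str` mixin plus a `__str__` that returns the value) comes from the commit message.

```python
from enum import Enum


class AppIcon(str, Enum):
    """Python 3.10-compatible stand-in for enum.StrEnum (illustrative members)."""

    SUBTITLE = "subtitle"
    SETTINGS = "settings"

    def __str__(self) -> str:
        # Without this override, str(AppIcon.SUBTITLE) would be "AppIcon.SUBTITLE".
        return self.value


def icon_filename(icon: AppIcon) -> str:
    # Hypothetical lookup helper that relies on str(icon) being the bare value.
    return str(icon) + ".svg"


print(icon_filename(AppIcon.SUBTITLE))
```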
Settings dialog (open/close jank):
- Fade only the cheap window mask, not the heavy 940x680 card+shadow.
MaskDialogBase ramped opacity on the whole dialog, re-rasterizing the
card every frame (~9ms each) — content appeared to "load late". Card now
pops at full opacity (timely), shadow is static so it can be softer.
Subtitle style duplication-on-launch:
- Built-in styles were force-forked with an incrementing id every time the
selection reset to a built-in (mode round-trips dropped the per-mode
selection), piling up default-custom-2..N each session.
- Remember selection per renderer mode (restored on ass<->rounded round
trips) and fork built-ins to a single deterministic {builtin}-custom id
(reused, not incremented). Adds tests/test_ui/test_subtitle_style_persistence.
Config persistence:
- Bind subtitle_preview_source/target and recent_colors (were SettingField
+ cfg.set(save=True) no-ops) so custom preview text and recent colors
survive restart; add matching config_store DEFAULTS.
Review-found fixes:
- settings_controls/form_cards/home/llm_logs/subtitle: replace broken QSS
numeric font-weight (Qt5 maps 700-800 to weight 88-99, over-bold) with the
valid `bold` keyword.
- download/media: route subtitle re-download through the per-site proxy
(YouTube needs proxy, Bilibili forced direct); fix sanitize_filename
control-char regex ([\0-\31] octal bug missed 0x1a-0x1f).
- config: normalize llm.service aliases (siliconcloud/lmstudio) before the
generic api_key backfill.
- dubbing_interface: remove dead metaLabel + orphan QSS rules.
Verified: ruff clean, compileall, 336 passed/3 skipped (isolated config),
dark/light smoke on touched pages.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
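The `sanitize_filename` regex bug mentioned above comes from `\31` being read as an octal escape (octal 31 is 0x19), so the class stops short of 0x1f. The sketch below reproduces that with Python's `re`; `strip_control_chars` is a hypothetical helper name, not the project's function.

```python
import re

# "\31" inside a character class is an octal escape: octal 31 == 0x19 == 25.
BUGGY = re.compile(r"[\0-\31]")
# Hex escapes state the intended range (all C0 control characters) unambiguously.
FIXED = re.compile(r"[\x00-\x1f]")


def strip_control_chars(name: str) -> str:
    """Remove ASCII control characters 0x00-0x1f from a filename (illustrative)."""
    return FIXED.sub("", name)


sample = "clip\x1a\x1fname\x00.mp4"
print(repr(BUGGY.sub("", sample)))  # 0x1a and 0x1f survive the buggy pattern
print(repr(strip_control_chars(sample)))
```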
…view
Transcribe source-language per interface:
- Add TRANSCRIBE_MODEL_LANGUAGES / transcribe_languages_for in core as the
single source of truth: the B interface (必剪) and J interface (剪映) are zh/en-only cloud ASR
(they ignore the language param entirely); the others support all languages.
- The settings source-language (源语言) dropdown now narrows to {auto-detect / Chinese / English} for B/J and
resets an unsupported selection to AUTO on provider switch.
Independent bold for primary/secondary ASS lines:
- One `bold` flag was applied to BOTH lines while the toggle lived only under
the primary-subtitle (主字幕) section, which was confusing. Add AssSecondaryStyle.bold + a secondary-subtitle (副字幕) bold toggle so each
line is controlled separately. Legacy styles (no secondary bold) inherit the
primary bold on load, so existing renders are unchanged.
Faster style preview:
- Content-addressed preview cache (core/subtitle/preview_cache.py): identical
style+text+bg renders once; ASS<->rounded switching / repeat edits hit the
cached PNG (~0.1ms) instead of re-running ffmpeg(~250ms)/PIL(~67ms). Bounded
to 24 files. Also removes the rounded renderer's per-call tempfile leak.
- Leading-edge debounce in update_preview: first render after idle is immediate
(no blank on open/switch), rapid bursts coalesce to one trailing render so
ASS(ffmpeg) renders don't pile up.
Preview default text changed to a neutral science/math line matching the
bundled demo background; PREVIEW_TEXT fallback now derives from the config
default (was a 3rd hardcoded copy).
Tests: transcribe language support, primary/secondary bold + legacy inherit +
json round-trip, preview cache (determinism/prune/hit). ruff + 158 passed.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
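The content-addressed preview cache can be sketched as follows. This is a simplified illustration under stated assumptions: the function names, the JSON-based key, and the mtime-based pruning are all hypothetical; only the idea (identical style + text + background renders once, bounded to 24 files) comes from the commit message.

```python
from __future__ import annotations

import hashlib
import json
from pathlib import Path
from typing import Callable

MAX_ENTRIES = 24  # bound taken from the commit message


def cache_key(style: dict, text: str, background: str) -> str:
    # Canonical JSON (sorted keys) so equal inputs always hash to the same key.
    payload = json.dumps(
        {"style": style, "text": text, "bg": background},
        sort_keys=True,
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def prune(cache_dir: Path, limit: int = MAX_ENTRIES) -> None:
    # Drop the oldest files (by mtime) once the cache exceeds the limit.
    files = sorted(cache_dir.glob("*.png"), key=lambda p: p.stat().st_mtime)
    for stale in files[: max(0, len(files) - limit)]:
        stale.unlink(missing_ok=True)


def cached_preview(
    cache_dir: Path,
    style: dict,
    text: str,
    background: str,
    render: Callable[[], bytes],
) -> Path:
    """Return the cached PNG path, calling render() only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / f"{cache_key(style, text, background)}.png"
    if not target.exists():
        target.write_bytes(render())
        prune(cache_dir)
    return target
```

A caller would pass the expensive renderer (ffmpeg or PIL in the project) as `render`, so switching back to a previously seen style costs only a file lookup.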
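Full Live Caption feature, plus a round of robustness and code-quality polish.
Live Caption:
- core/realtime: backend-agnostic protocol (TranscriptSegment/CaptionEntry); three backends (voxgate / fun-asr / qwen-asr); CaptionAssembler (upsert by seg_id + two-color rendering + async translation); session orchestration; recording/history/playback; macOS system audio (ScreenCaptureKit).
- UI: desktop floating overlay (standard/tall + in-window settings); session/history/detail views; control page.
- doctor: Live Caption connectivity check (CLI + diagnostics page).
Dependencies and build:
- core/download/dependencies.py: ffmpeg/voxgate download registry + download dialog.
- native/macsysaudio: ScreenCaptureKit helper (arm64+x86_64 universal); source and built artifact are committed to the repo.
- build_desktop.py: bundles macsysaudio.
Robustness and quality:
- WS backends share base._reconnect_with_backoff (backoff + rate limit + failure cap); a soak test against the real API confirmed that the qwen "Connection lost" error is a network issue, not a code bug.
- The generic debug switch VC_DEBUG (core/utils/debug.py) replaces the bespoke VC_LIVE_DEBUG.
- Comments trimmed (debugging anecdotes and defensive justifications removed); all design-mock HTML references removed from the code.
- AGENTS.md: updated design-mock and comment conventions, backend list, and file paths.
Other: LLM client singleton rebuilt by fingerprint, translate factory, settings/dubbing and other changes.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>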
work_dir reorganization:
- Task dirs now group by function under work_dir: transcribe / synthesis /
batch / dubbing (output_paths.new_task_dir validates task_type; cleanup
only rmtrees dirs whose parent is a known task_type). Routed by entry page.
- Live caption history moved from APPDATA into {work_dir}/live-caption;
default_root() helper; migrate_legacy_root made recoverable and called
once on GUI startup (main.py, not the widget ctor) so tests can't move
real user data.
WS backend refactor:
- New WebSocketTranscriber base (ws_base.py) collapses ~50 lines duplicated
across fun_asr / qwen_asr: WS connection state, recv loop, locked send,
failure teardown. voxgate keeps inheriting LiveTranscriber directly so it
never pulls websocket into its startup path (backends stay lazy).
- Unify scattered `_closed = True` teardown into _fail_close() (close ws +
emit STOPPED), matching stop()'s terminal state; the old path leaked the
ws fd and never told the UI it stopped.
Review fixes:
- base: accept-then-drop reconnect storms (attempt succeeds but connection
instantly drops, so the fail counter never trips) now terminate via a
cross-call _reconnect_streak; previously looped every 0.5s forever, never
emitting text or an error. Covered by a new test.
- live_caption_thread.stop(): longer default wait so voxgate can flush the
last sentence before the receive thread is joined.
- caption_overlay: parent the settings popover to the overlay so it is
reclaimed with it instead of leaking an orphan top-level window.
- system_mac: cap helper stderr buffer with a bounded deque.
- recorder: guard against a zero-length cue when the last sentence first
appears during stop (clicking it used to jump to the end).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
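The accept-then-drop fix relies on a streak counter that survives across reconnect calls. A minimal sketch of that idea follows; `ReconnectGuard`, its thresholds, and its method names are hypothetical, and the real `_reconnect_streak` logic in the WS base class may differ.

```python
class ReconnectGuard:
    """Count consecutive short-lived connections and signal when to give up."""

    def __init__(self, max_streak: int = 5, stable_after: float = 10.0) -> None:
        self.max_streak = max_streak      # assumed cap on consecutive quick drops
        self.stable_after = stable_after  # assumed seconds that count as "stable"
        self._streak = 0
        self._connected_at = 0.0

    def on_connected(self, now: float) -> None:
        self._connected_at = now

    def on_dropped(self, now: float) -> bool:
        """Return True when the caller should stop reconnecting."""
        if now - self._connected_at >= self.stable_after:
            # The connection was healthy for a while, so this is a fresh failure.
            self._streak = 0
        else:
            # Accepted and then dropped almost immediately: extend the streak.
            self._streak += 1
        return self._streak >= self.max_streak
```

A per-attempt failure counter never trips in this scenario because every attempt technically succeeds; tracking how long each connection lived is what breaks the loop.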
core/hardsub: frame sampling -> change-point OCR -> dedup -> region
auto-detect -> timeline -> ASRData. Region detection and per-frame line
filtering share one "centered-segment + dominant-font" model; junk
(cards/counts/durations/translations/credits) is dropped by centering trim,
global font gate, frame-relative font gate, and a stricter single-line
centering gate. Manual ROI (GUI drag / CLI --roi) is WYSIWYG (roi_is_manual).
core/ocr: RapidOCR(onnxruntime) engine, lazy-loaded; behind optional `ocr`
extra (rapidocr/onnxruntime/rapidfuzz) so base install/startup is unaffected.
GUI: hard-subtitle extraction (硬字幕提取) page (ROI selector, result table, redo, send-to-optimize),
nav icon, smoke wiring; closeEvent stops its threads to avoid exit abort.
CLI: extract-hardsub command. Output named {stem}.hardsub.srt.
Speed (~2x): region sample 36->20, reuse the font size learned during region
detection, det_limit 768 for the extraction engine (960 kept for detection).
Incidental: shared config.find_binary (voxgate/macsysaudio/ffmpeg/deps now
share one discovery path); yt-dlp postprocessor hook for merge progress.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ant)
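Replace the defunct Qt .ts/.qm translation mechanism with key-based gettext. Only the UI (PyQt) is
translated; core and CLI keep their current Chinese/English strings.
- Runtime lives in videocaptioner/ui/i18n/ (tr/N_/init/set_language, standard-library gettext);
main.py installs the catalog + FluentTranslator; a missing translation in a non-base language falls back to the zh_Hans Chinese text (the key is never shown).
- ~1429 keys across 38 UI files (tr("domain.meaning")); enum dropdown labels go through
ui/common/enum_labels.py (programmatic enum_key + TRANSLATABLE_ENUMS); dynamic keys
(dubbing provider/voice/tag, transcription languages) are injected by a registry; keys passed as tr(constant) are marked with N_.
- Switching language = automatic restart after confirmation (no per-page retranslate).
- Toolchain: scripts/i18n.py (extract/update/fill-base/translate/compile/check) +
babel.cfg + Babel as a dev dependency; CI runs i18n check + tests/test_i18n.
- zh_Hans is the source of truth; en/zh_Hant are machine-translated (OpenAI-compatible endpoint, configured via VC_TRANSLATE_*).
- Delete the old resource/translations/*.ts|qm + scripts/trans-*.sh + translate_llm.py;
also delete the unreferenced SQUARE_BUTTON_SIZE / apply_button_icon.
- Docs: docs/dev/i18n-workflow.md (maintenance guide), docs/dev/i18n-plan.md (architecture),
and the AGENTS.md i18n section.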
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
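The "fall back to the base language, never show the raw key" behavior maps onto the fallback chain that the standard-library `gettext` module already provides. The sketch below uses an in-memory `DictTranslations` class (a hypothetical stand-in for compiled `.mo` catalogs) and made-up keys, so it shows the mechanism rather than the project's actual `tr()` implementation.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """In-memory catalog for illustration; real catalogs would be compiled .mo files."""

    def __init__(self, messages: dict) -> None:
        super().__init__()
        self._messages = dict(messages)

    def gettext(self, message: str) -> str:
        if message in self._messages:
            return self._messages[message]
        # NullTranslations.gettext delegates to the fallback catalog if one is set.
        return super().gettext(message)


# Base catalog (zh_Hans) is assumed to contain every key.
base = DictTranslations({"home.title": "首页", "home.start": "开始"})
# The English catalog is missing "home.start".
english = DictTranslations({"home.title": "Home"})
english.add_fallback(base)

tr = english.gettext
print(tr("home.title"))
print(tr("home.start"))
```

With a real setup, `gettext.translation(...)` would return `GNUTranslations` objects that support the same `add_fallback` call.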
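The desktop onedir bundle was missing dependencies for the new features (measured: CI ran uv sync without --extra ocr, so hard-subtitle OCR was 100% unavailable).
- build-desktop.yml: add --extra ocr to uv sync/run so rapidocr/onnxruntime/rapidfuzz are installed in the build environment.
- VideoCaptioner.spec: collect_data_files('rapidocr') (includes the bundled PP-OCR models, usable offline) +
collect_dynamic_libs('onnxruntime') + collect_submodules('yt_dlp') (~900 extractors) +
sounddevice/_sounddevice_data/cffi/websocket/numpy hiddenimports; a _safe() guard keeps the base bundle buildable.
- smoke_desktop.py: verify that onnxruntime, the rapidocr models, and PortAudio are actually in the bundle, closing the "CI is green even though it was never bundled" gap.
- pyproject: fix the stale rapidocr comment (3.x ships its own models; they are not downloaded at runtime).
- Docs: docs/dev/packaging-and-update-plan.md covers the current packaging state, gaps, the single-file conclusion, directory design, and the update-rebuild plan.
Verified with a real build (mac onedir, 487M): all dependencies are bundled; smoke checks all pass; the bundled executable runs synthesize hard →
extract-hardsub end to end with working OCR (burned-in Chinese subtitles recognized correctly), with no cv2×Qt conflict.
voxgate is downloaded at runtime by design (not bundled).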
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…art-install)
Rebuild the update mechanism end to end, replacing the old vc.bkfeng.top version poll. UI is a thin shell over PyQt-free core logic.
core/update:
- manifest.py: fetch GitHub Release latest.json, pick the current-platform asset (macOS arm64 → x64 fallback), compare versions; ghproxy mirror fallback.
- installer.py: download via core/download (sha256 verify) + extract + platform helper that waits for the running PID to exit, swaps the install dir/.app, clears macOS quarantine, and relaunches. onedir can't overwrite a running self, so the caller quits right after apply_update().
- downloader.py: add sha256 param (generalize _verify_sha1 → _verify_digest).
UI:
- update_banner.py: state machine 可用 (available) → 下载中 NN% (downloading NN%) → 重启并安装 (restart and install); failure retry; dev/non-writable installs degrade to "前往下载" (go to download, which opens the Release page).
- update_thread.py: UpdateCheckThread (startup background check) + UpdateDownloadThread (progress + cancel).
- main_window: startup check + mandatory-update gating + closeEvent teardown.
- setting_interface "检查更新" (check for updates) now drives the same flow (was: open browser).
- delete version_checker_thread.py + app.announcement/old update i18n keys.
CI / release:
- gen_update_manifest.py + build-desktop.yml manifest job: generate latest.json with per-platform url/sha256/size and upload to the same Release.
- i18n: add app.update.* keys (zh_Hans/zh_Hant/en), --ignore-obsolete on update.
Verified: 47 unit tests (manifest/installer/downloader sha256); real jsDelivr download + sha256 accept/reject; real helper disk swap on dead PID; banner state machine on live Qt; manifest↔client round-trip; i18n 3-lang render; full ruff/compileall green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
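Generalizing a SHA-1 check into a digest check, as the downloader change describes, might look like the sketch below. `verify_digest` and its signature are assumptions for illustration; the project's `_verify_digest` may differ.

```python
import hashlib
from pathlib import Path


def verify_digest(
    path: Path,
    expected_hex: str,
    algorithm: str = "sha256",
    chunk_size: int = 1 << 20,
) -> bool:
    """Hash a file in chunks and compare against an expected hex digest (illustrative)."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as fh:
        # Stream the file so large update archives never load fully into memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()
```

Passing the algorithm by name lets one code path serve both the legacy `sha1` checks and the new `sha256` pins from the manifest.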
Bring back the "实时公告" (live announcement) capability lost in the update rebuild, without
reviving the dead vc.bkfeng.top server — fold it into the same latest.json
(editable anytime via `gh release upload --clobber`). Version control
(mandatory / min_supported) was already retained and is confirmed.
Announcements (core/update/manifest.py):
- fetch_manifest() → RemoteManifest{update, announcement} in one fetch;
fetch_update() is now a thin wrapper.
- select_announcement(): enabled + content + start_date~end_date window
(inclusive, blank side = unbounded, bad date = silent skip) + NEW
min_version~max_version targeting (push a notice to a version range, e.g.
nag old builds). id-keyed show-once (falls back to content hash).
- UpdateCheckThread emits announcementAvailable; main_window shows a
ConfirmDialog once per id (dedup via version_state_cache). Announcement is
independent of update — up-to-date users still receive it.
- gen_update_manifest.py: optional --announcement <json> to embed.
Review hardening (adversarial multi-agent review, 5 findings confirmed):
- update_banner: single-thread invariant — _teardown_dl() cancels+waits+
disconnects the old download thread before cancel/retry/re-download, so a
cancelled thread can't orphan and trigger "QThread destroyed while running"
on exit or overwrite the new thread's progress. [medium]
- manifest: zero-pad version tuples before compare so (1,5)==(1,5,0) — fixes
is_newer false-positive and hand-written 2-segment announcement targeting. [low]
- main_window: manual "检查更新" (check for updates) during an in-flight startup check now shows
"正在检查更新…" (checking for updates…) instead of a dead click. [low]
- main_window: mandatory update disables home/batch only when can_self_update();
non-self-updatable installs keep working (closable "前往下载" (go to download) banner)
instead of a permanent dead-end. [low]
- (kept) announcement dedup is id-keyed once-ever: intentional, documented.
i18n: re-add app.announcement.title/got_it + app.update.checking (zh/zh_Hant/en).
Verified: 59 tests (announcement window/targeting/dedup/length-safe versions +
fetch_manifest); end-to-end gen_update_manifest --announcement → client parses
update+announcement, up-to-date user still gets it; banner teardown keeps state
machine intact; offscreen dialog render; full ruff/compile/i18n-check green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
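The zero-padding fix for version comparison and the min/max targeting can be sketched as follows. The function names are hypothetical, and the parser assumes purely numeric dotted versions (pre-release suffixes are out of scope for this sketch).

```python
from __future__ import annotations


def parse_version(text: str) -> tuple[int, ...]:
    # Assumes purely numeric dotted versions such as "1.5" or "v2.0.1".
    return tuple(int(part) for part in text.strip().lstrip("vV").split("."))


def _padded(a: tuple[int, ...], b: tuple[int, ...]):
    # Zero-pad the shorter tuple so (1, 5) and (1, 5, 0) compare as equal.
    width = max(len(a), len(b))
    return a + (0,) * (width - len(a)), b + (0,) * (width - len(b))


def is_newer(remote: str, local: str) -> bool:
    r, l = _padded(parse_version(remote), parse_version(local))
    return r > l


def in_version_range(version: str, min_version: str = "", max_version: str = "") -> bool:
    """Inclusive range check; a blank bound means unbounded on that side."""
    v = parse_version(version)
    if min_version:
        a, b = _padded(v, parse_version(min_version))
        if a < b:
            return False
    if max_version:
        a, b = _padded(v, parse_version(max_version))
        if a > b:
            return False
    return True


# Raw tuple comparison treats (1, 5) as older than (1, 5, 0): the false positive.
print((1, 5) < (1, 5, 0))
print(is_newer("1.5.0", "1.5"))
```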
Add proper install artifacts on top of the portable onedir zips. The zips remain the auto-update payload; setup.exe / dmg are for first-time human download only.
Windows (scripts/build_windows_installer.py + packaging/windows/VideoCaptioner.iss):
- Inno Setup 6, per-user install to %LOCALAPPDATA%\Programs\VideoCaptioner (PrivilegesRequired=lowest): no admin prompt AND the dir stays writable, so the in-app rm+mv auto-update keeps working on an installed copy.
- Start Menu + optional desktop shortcut + uninstaller; stable AppId for in-place upgrades. Version/source/output injected via ISCC /D defines.
macOS (scripts/build_macos_dmg.py):
- ad-hoc codesign (codesign -s -) then hdiutil UDZO dmg with an /Applications symlink (drag-to-install). No Apple cert: ad-hoc signing only avoids the Apple-Silicon "app is damaged" HARD block, degrading to the recoverable "unidentified developer" (right-click → Open). It does NOT remove the Gatekeeper prompt; that needs a paid cert + notarization, which is out of scope. Auto-updated .apps clear quarantine in the helper, so only the first manual download needs the one-time right-click.
CI (build-desktop.yml): build installer (choco innosetup) / dmg per-OS after smoke; upload artifacts/* (zip+exe+dmg) to the release. The manifest job still only rglobs VideoCaptioner-*.zip, so exe/dmg never enter the update manifest.
Verified: macOS dmg built+mounted on real hardware (codesign --verify "valid on disk", Signature=adhoc, .app + Applications symlink present); ruff/compileall green; workflow YAML parses; adversarial multi-agent review of the .iss / build scripts / CI wiring found 0 confirmed issues. Windows ISCC compile itself runs only in CI (no local Windows).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Surface 意见反馈 (feedback) on the home page, not just under 设置 → 关于 (Settings → About). A HeaderLinkButton (message icon) sits at the top-right of the home pivot row and opens the same FeedbackDialog. Reuses the existing feedback.title string (no new i18n key). Verified: home smoke renders the entry aligned with the tab row; ruff/compile green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…changed
Revert the top-right pivot-row link added in 0b48a5e (top bar stays as-is) and put the feedback entry in the home footer instead, next to 查看日志 (view logs) / 捐助 (donate): 意见反馈 | 查看日志 | 捐助 (a FooterAction opening the same FeedbackDialog). Reuses feedback.title (no new i18n key). Verified: home smoke shows the footer entry with divider; top bar restored; ruff/compile green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
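…secret scrubbing
Recent logs are now uploaded with feedback by default (a separate multipart logs field, silent, not mentioned in the UI); the "最近日志" (recent logs) column in the backend Feishu table is live (verified on a real machine, VC-0030). core/feedback/logs.py takes the tail of app.log (≤256KB) and scrubs it before sending.
Scrubbing and contract consistency hardened after an adversarial audit (27 agents):
- The scrubber adds three rule classes: base URL / endpoint, query ?key=&key=, and bare Google AIza keys; on real logs, leftover provider API base URLs/keys dropped from 3+ to 0 (public site URLs are kept to aid troubleshooting).
- platform_tag is narrowed to the contract enum windows-x64/macos-x64/macos-arm64 (dev/linux are not leaked).
- The 12MB total in models.validate now counts text fields + multipart framing overhead, making it a true superset of the backend limit.
- docs/dev/feedback-api.md is rewritten as a client implementation note, pointing to vc-backend/api.md for backend behavior (the no-longer-accurate early implementer-contract content on the env endpoint / idempotency / 429 / issue is removed).
Tests: all 25 in tests/test_feedback pass (new: log scrubbing/collection, total counting text fields, platform enum).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>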
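Two of the scrubbing rules (query-string keys and bare Google `AIza` keys) plus the ≤256KB tail can be sketched as below. The regexes, replacement markers, and function names are assumptions for illustration; the `AIza` + 35 characters shape reflects the commonly seen Google API key format, and the real scrubber (including its base-URL rule) may differ.

```python
import re
from pathlib import Path

TAIL_LIMIT = 256 * 1024  # bytes, per the commit message

# Illustrative rules only; each pair is (pattern, replacement).
_RULES = [
    # ?key=... or &key=... in a URL query string: keep the name, mask the value.
    (re.compile(r"([?&]key=)[^&\s]+", re.IGNORECASE), r"\1***"),
    # Bare Google-style API keys (assumed shape: "AIza" + 35 URL-safe characters).
    (re.compile(r"AIza[0-9A-Za-z_\-]{35}"), "AIza***"),
]


def scrub(text: str) -> str:
    for pattern, replacement in _RULES:
        text = pattern.sub(replacement, text)
    return text


def scrubbed_log_tail(path: Path, limit: int = TAIL_LIMIT) -> str:
    """Read at most `limit` bytes from the end of a log file and scrub them."""
    data = path.read_bytes()[-limit:]
    # errors="replace" tolerates a multi-byte character cut at the tail boundary.
    return scrub(data.decode("utf-8", errors="replace"))
```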
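…rrency≤10)
Add a free community translation provider, "沉浸式翻译(免费)" (Immersive Translate, free), reusing Immersive Translate's free model gateway (OpenAI-compatible, THUDM/GLM-4-9B-0414); users need no key:
- Each install generates and persists a random deviceId (mimicking the browser extension, so the free quota is split per device rather than shared); the deviceId is exchanged for a 30-minute JWT (cached in-process, renewed automatically on expiry/401), which is then used to call the chat endpoint.
- Batched JSON (dict-in/dict-out) + retry on missing keys + per-item fallback; concurrency is capped at 10 by the factory.
- Minimal requests: the token exchange needs only the deviceId query parameter; chat needs only authorization: Bearer (the api-key/validtoken/x-imt-product-line/origin/cookie/UA from the original curl are all redundant).
Wiring: same-named members in TranslatorServiceEnum/TranslatorType, a factory branch, subtitle_thread (_SERVICE_TO_TYPE + added to _NON_LLM_TRANSLATORS so no key is required), translator_from_cli, TRANSLATOR_KEYS, CLI choices (in subtitle/process/config-init), the realtime label, and i18n enum keys (zh_Hans/en/zh_Hant filled and compiled). The provider also becomes selectable in the Live Caption dropdown automatically.
Failure semantics hardened after adversarial verification (high severity): when a whole chunk fails to translate, **raise** instead of returning the source text as if it were the translation. Otherwise BaseTranslator's ≥50% failure guard is defeated, and a chunk of untranslated source text would be cached for 7 days, poisoning retries (the free endpoint's 429 takes exactly this path). Partial failures are still re-translated item by item, with occasional fallback to the source text. _post_chat raises an explicit error for a 200 response without choices instead of a bare KeyError. Privacy check: deviceId/JWT never enter diagnostics/logs (confirmed).
Tests: tests/test_translate/test_immersive_translator.py (JWT/token cache renewal, batched key-guard retry, raise on whole-chunk failure, partial fallback to source text, concurrency cap; real-endpoint integration gated by env).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>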
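…split/optimize/translate)
The free community LLM moves from a standalone translation service to an LLM provider (selectable in the LLM settings without a key), so sentence splitting, subtitle optimization, and LLM translation can all run for free; the original translate-service version is deleted.
- core/llm/free_model.py: deviceId → 30-minute JWT issuance (renewed automatically on expiry/401); the gateway sits behind Cloudflare, so all traffic goes through curl_cffi (browser TLS fingerprint), both for direct calls and as an httpx transport supplied to the OpenAI SDK.
- When get_llm_client hits the free-model base URL, it ignores the placeholder key and fetches a live token (the fingerprint includes the token, so reissuance rebuilds the client automatically).
- When this provider is selected, the LLM settings page hides the key/base/model rows and shows a "free, no API key required, plus a stability caveat" notice.
- CLI --translator/--llm wired accordingly; curl_cffi becomes a regular dependency, and the spec collects its native libraries and certificates.
Real-machine verification: curl_cffi still returns 200 on a network where requests/httpx get a Cloudflare 403; end-to-end subtitle translation passes.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>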
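The "30-minute JWT, renewed on expiry or 401" behavior can be sketched as a small in-process cache. `TokenCache`, the injected `fetch` callable, and the 60-second safety margin are all hypothetical; the real code in `core/llm/free_model.py` performs an HTTP exchange that is not shown here.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Cache a short-lived token in-process and refresh it shortly before expiry."""

    def __init__(
        self,
        fetch: Callable[[], str],          # performs the real exchange (not shown)
        ttl_seconds: float = 30 * 60,      # token lifetime from the commit message
        skew_seconds: float = 60,          # assumed safety margin before expiry
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._skew = skew_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._skew:
            self._token = self._fetch()
            self._expires_at = now + self._ttl
        return self._token

    def invalidate(self) -> None:
        # Call this after a 401 so the next get() fetches a fresh token.
        self._token = None
```

Injecting the clock keeps expiry behavior testable without sleeping.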
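…update/announcement)
Drop GitHub latest.json in favor of a self-hosted backend, GET /api/update/check (backend.videocaptioner.cn, driven by a Feishu Bitable): version blocking, latest-version selection, and announcement time windows are all computed by the backend; the client only renders.
- core/update/client.py: one request returns block/update/announcement; headers carry X-App-Version/Platform/Channel + X-Client-Id (shared with feedback) for statistics; any failure returns None and is silently ignored. app_channel() = desktop/dev/pip (dev also checks, to ease debugging; pip is never served updates).
- A non-empty block means the version is blocked: when self-update is possible, the whole app is locked (stackedWidget disabled) and the banner cannot be closed; dev/pip degrade to "前往下载" (go to download) without locking. The old client-side mandatory/min_supported logic is removed.
- CI releases are now registered via register_release.py POST /api/admin/release (requires the CI_RELEASE_TOKEN secret); gen_update_manifest.py/latest.json are deleted.
- Contract doc: docs/dev/update-api.md; AGENTS.md updated to match.
Real-machine verification: a check against the real backend returned block+update+announcement, all parsed and rendered correctly.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>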
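- FolderPickerControl: the wide text buttons become 34px round icon buttons (open/change, with tooltips) and the path bar is narrowed, so the title and description are no longer squeezed into wrapping by the control area; path text now uses regular font weight.
- feedback module comments trimmed, keeping only the regression-guarding WHY.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>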
Bugbot is not enabled for this team, so this pull request was not reviewed. Enable Bugbot in the Cursor dashboard to get automatic reviews on future PRs.
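…e doesn't crash on Chinese labels
The Windows runner console defaults to cp1252, so _check_bundled_payload crashed with UnicodeEncodeError when printing Chinese dependency names, which blocked the exe artifact upload. The smoke script now forces UTF-8 output at the top.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>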
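onnxruntime no longer publishes Intel-mac wheels (1.24.3/1.27.0 are arm64-only on mac), so uv sync --extra ocr always fails on macos-15-intel. Intel Macs are discontinued, so the build moves to macos-15 (arm64): the onnxruntime arm64 wheel is available, and _platform_tag() automatically produces the macos-arm64 name, matching the update backend's platform key. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>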
…age imports Windows Qt default font has no CJK glyphs (falls back to SimSun); load the bundled LXGW WenKai for every platform and replace qfluent's getFont family table. Move init_i18n above the page-module imports: module-level constants evaluating tr()/N_() at import time froze raw keys otherwise. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
BATCH_MODES/STAGE_SPECS evaluated tr() at import time, before init_i18n, so the page showed raw keys (batch.mode.*). Store N_()-marked keys and translate at use sites via stage_title(); literal keys never sit inside tr() call args (babel's token extractor would harvest them into the .pot). Also gate the dubbing stage behind ffmpeg+ffprobe preflight. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Force UTF-8 for pybabel subprocesses (GBK default cannot read babel.cfg with Chinese comments) and for console output. Guard extract behind Python >= 3.12: older tokenizers silently miss tr() inside f-strings and would drop keys from the .pot. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
text=True without encoding decodes with the locale codec (GBK on Chinese Windows); UTF-8 bytes in ffmpeg/whisper output killed the reader thread, leaving stderr=None and crashing downstream parsing. errors=replace keeps stray bad bytes from ever raising again. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
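The decoding fix can be illustrated with a small wrapper around `subprocess.run`. `run_text` is a hypothetical helper name; the point is only that an explicit `encoding` plus `errors="replace"` removes the dependence on the locale codec.

```python
import subprocess


def run_text(cmd: list) -> subprocess.CompletedProcess:
    """Run a command and decode its output as UTF-8, never raising on bad bytes."""
    return subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        encoding="utf-8",   # do not fall back to the locale codec (e.g. GBK)
        errors="replace",   # undecodable bytes become U+FFFD instead of raising
    )
```

The same `encoding`/`errors` arguments are accepted by `subprocess.Popen`, which matters when output is consumed by a reader thread rather than collected at exit.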
…_info One line-based parser (probe_media -> MediaInfo) replaces three duplicated regex parsers (get_video_info, both renderers) and the ffprobe JSON probe in dubbing. Fixes en route: attached-pic cover art no longer misreads audio files as video; fps falls back to tbr (never tbn); probe failures degrade to None (OSError + 30s timeout) instead of raising into task threads; thumbnail extraction split out of info probing. ffprobe stays bundled solely for pydub (dubbing reads mp3 through AudioSegment.from_file -> mediainfo_json); app code must not call it. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…th pinned sha256 ghproxy.com died serving HTTP 200 HTML pages that got saved as .zip and failed extraction. Replace with tested-alive mirrors (gh-proxy.com, ghfast.top), add a validate hook to download_file so non-archive bodies fail over to the next mirror, and verify archives before install. Bump voxgate to v0.3.0 (protocol regressed against a real transcription), pin per-platform sha256 from the release checksums, and treat an older BIN_PATH install as not-ready so the doctor download button doubles as the upgrade path (version probe cached by path+mtime to keep is_installed off the GUI thread's critical path). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
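The mirror failover with a validate hook can be sketched as follows. `fetch_first_valid` and `looks_like_zip` are hypothetical names, and the `fetch` callable is injected so the sketch stays network-free; the project's `download_file` hook may have a different shape.

```python
import io
import zipfile
from typing import Callable, Iterable, Tuple


def looks_like_zip(body: bytes) -> bool:
    # An HTML error page served with HTTP 200 fails this check.
    return zipfile.is_zipfile(io.BytesIO(body))


def fetch_first_valid(
    urls: Iterable[str],
    fetch: Callable[[str], bytes],
    validate: Callable[[bytes], bool] = looks_like_zip,
) -> Tuple[str, bytes]:
    """Try each mirror in order; return the first body that passes validation."""
    errors = []
    for url in urls:
        try:
            body = fetch(url)
        except OSError as exc:
            errors.append(f"{url}: {exc}")
            continue
        if validate(body):
            return url, body
        errors.append(f"{url}: body failed validation")
    raise RuntimeError("all mirrors failed: " + "; ".join(errors))
```

Validating the body rather than the status code is what catches a dead mirror that still answers 200.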
Classify browser-cookie read failures by real cause (database locked by a running browser vs Chromium 127+ app-bound encryption vs macOS TCC) and say so; macOS-only privacy guidance no longer shows on Windows. Every final message ends with the reliable ways out (Firefox login or cookies.txt). Keep messages to cause + one action; details go to logs. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…title bar Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
… min-height ImageLabel sizes itself via setFixedSize, so keeping it in a layout either pinned the page (and the whole window, via the stacked widget max) to the preview size — an unshrinkable-window deadlock — or collapsed it to a bogus sizeHint. Take the preview out of layout management: _fit_preview scales and centers it manually, driven by the stage's own resized signal so internal relayouts (not just window resizes) refit it. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
MediaThumb grows a fit="contain" mode (theme-colored letterbox, no crop) for the result preview; small media cards keep the cover crop. Dubbing's ffprobe preflight restored on the synthesis page (pydub needs it). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…etting The error signal set the error page, then finished immediately reset the 'empty session' back to ready — the user saw nothing when voxgate was missing. Latch the error state so finished leaves it alone, and point the voxgate-missing message at the doctor page's one-click download. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The immersive provider hid the whole test row along with its config rows; keep '测试连接' (test connection) visible (only '加载模型' (load models) hides, since the model is fixed) and route the check through the real path: placeholder key swaps for a live token and the request rides the curl_cffi transport past Cloudflare. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Diagnostic rows: fixed 78px clipped wrapped descriptions, but free-growing rows let the container stretch them apart — min-height + vertical Maximum policy makes height content-driven both ways. Drop the selectable flag on row labels (with wordWrap it inflated line height under LXGW WenKai). Download help becomes a stay-until-closed toast carrying the local cookies.txt path; ffprobe rejoins the FFmpeg status roll-up; the hardsub engine-missing card centers at natural size instead of filling the stage. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Sidebar/SidebarItem: expand/collapse with width animation, icons anchored to the collapsed rail's centerline so nothing shifts between states, labels fade with expand progress, active pill + hover from app_palette (light and dark), tooltips only when collapsed. Pages register as selectable items, footer actions (GitHub / settings) just fire callbacks. MainWindow hides the qfluent NavigationInterface, mounts the sidebar in its place, keeps stackedWidget switching in both directions (sidebar click and programmatic switchTo), and auto-collapses under narrow windows — manual collapse is remembered and not overridden. ui_smoke_check's navigation check now exercises this component. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
… keys Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…otcha Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The fixed 46px inset was tuned for the old narrow rail; the expanded sidebar (224px) covered the title text. Follow the sidebar's live width (event filter keeps it glued during the expand/collapse animation). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Summary
The 2.0 branch: first-party workbench UI across all pages, a full dubbing workflow (Edge / Gemini / SiliconFlow CosyVoice with voice clone), and the supporting architecture work below.
Architecture
- `AppConfig` dataclass → `TaskBuilder`; CLI and GUI consume the same canonical config through thin adapters.
- `{stem}.{tag}.{ext}` (tags: lang code / optimized / subtitled / dubbed) defined once in `core/application/output_paths.py`; intermediates live in per-run task directories and are cleaned on success; raw TTS segments become a content-addressed cache.
- `core/download/media.py` (no Qt) shared by GUI thread, CLI `download`, and diagnostics; one browser-cookie fallback ladder so "diagnostics says OK" matches real download behavior.
- `WorkerThread` base with cooperative cancellation.
Packaging
- `resource/fonts` (default style font LXGW WenKai), declares `httpx`, drops unused `aiohttp`; verified by real wheel install into a clean venv + windowed GUI launch.
- `console=False`, stale hidden import removed; verified by a real desktop build, frozen CLI synthesis, and GUI launch.
Assets & docs
- `docs/dev` pruned to adopted design mocks (`design-<page>.html`, one per page); CONTRIBUTING added; `core/realtime` removed.
Test plan
- videocaptioner/cli/test_cli + test_dubbing: 110)
- `uv build` + wheel install smoke (CLI + windowed GUI) + PyInstaller bundle smoke
🤖 Generated with Claude Code