
Make model endpoints configurable #4

Open
Nuyoahzhou wants to merge 1 commit into alibaba:main from Nuyoahzhou:codex/model-endpoint-config-refactor


@Nuyoahzhou

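Moves the hard-coded API settings, model_name, and similar values out of the code and into env, so that later deployment and installation are easier and the overseas endpoints can be used.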
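
A minimal sketch of what env-based endpoint configuration can look like. The variable names (`MODEL_BASE_URL`, `MODEL_NAME`, `MODEL_API_KEY`), the defaults, and the `ModelEndpointConfig` type are assumptions made for illustration; the PR description does not list the actual names it introduces.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Hypothetical defaults; placeholders, not the project's real endpoint or model.
DEFAULT_BASE_URL = "https://example.invalid/api/v1"
DEFAULT_MODEL_NAME = "example-model"


@dataclass(frozen=True)
class ModelEndpointConfig:
    base_url: str
    model_name: str
    api_key: Optional[str]


def load_model_endpoint_config(
    env: Optional[Mapping[str, str]] = None,
) -> ModelEndpointConfig:
    """Read endpoint settings from the environment, falling back to defaults."""
    env = os.environ if env is None else env
    return ModelEndpointConfig(
        # Strip a trailing slash so callers can join paths consistently.
        base_url=env.get("MODEL_BASE_URL", DEFAULT_BASE_URL).rstrip("/"),
        model_name=env.get("MODEL_NAME", DEFAULT_MODEL_NAME),
        api_key=env.get("MODEL_API_KEY"),
    )
```

Under these assumptions, switching to an overseas endpoint becomes a deployment-time change (setting `MODEL_BASE_URL`) rather than a code change.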

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Star-Lotus pushed a commit that referenced this pull request May 28, 2026
…erride for custom voices

Custom cloned voices have voice_ids that aren't in TTS_VOICE_REGISTRY, so
the existing _resolve_model_for_voice() fallback would dispatch to the wrong
model and produce wrong output. Wire explicit overrides through both backend
endpoints and TTSProcessor.

src/audio/tts.py:
- TTSProcessor.synthesize() gains model_override + family_override params
- _synthesize_cosyvoice() honors model_override
- _synthesize_qwen3() honors model_override
- Backward compatible: when overrides are None, falls back to registry lookup
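
A rough sketch of the override-then-registry resolution described above. The registry contents, the family names ("cosyvoice", "qwen3"), and the helper `resolve_model_and_family` are assumptions for illustration; the real TTS_VOICE_REGISTRY layout and TTSProcessor internals are not shown in this commit message.

```python
from typing import Dict, Optional, Tuple

# Illustrative registry; the real entries are not shown in the commit.
TTS_VOICE_REGISTRY: Dict[str, Dict[str, str]] = {
    "system_voice_a": {"model": "cosyvoice-v3.5-plus", "family": "cosyvoice"},
}


def resolve_model_and_family(
    voice_id: str,
    default_model: str,
    default_family: str,
    model_override: Optional[str] = None,
    family_override: Optional[str] = None,
) -> Tuple[str, str]:
    """Prefer explicit overrides, then the registry, then the defaults."""
    entry = TTS_VOICE_REGISTRY.get(voice_id, {})
    model = model_override or entry.get("model", default_model)
    family = family_override or entry.get("family", default_family)
    return model, family
```

With both overrides left as None the result matches a plain registry lookup, which is the backward-compatible path; a cloned voice_id that is missing from the registry only gets the right model when the caller passes overrides.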

src/apps/comic_gen/api.py:
- /voice/preview now looks up pipeline.find_custom_voice(voice_id) to
  resolve target_model + family for custom voices, passes through as
  overrides to synthesize()
- New POST /voice/clone {series_id, audio_url, label, target_model}
  - Delegates to pipeline.create_voice_clone
  - target_model defaults to cosyvoice-v3.5-plus (supports instructions →
    enables PR-3j emotion control on clones too)
  - 404 if series not found, 500 on dashscope error
- New GET /series/{id}/custom_voices
  - Returns full list (clones + designs) for the Voice picker's
    "My Clones" (我的复刻) / "My Designs" (我的设计) tabs to render
- New DELETE /series/{id}/custom_voices/{voice_id}
  - Removes from list (does NOT call dashscope delete — 24h retention)
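
The per-series list behaviour behind these endpoints could be sketched roughly as below. `SeriesVoiceStore`, its method names, and the `CustomVoice` fields are hypothetical stand-ins; the actual pipeline and models.py definitions are not part of this commit message. Only the default target_model and the "remove locally, no remote delete" rule come from the notes above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

DEFAULT_TARGET_MODEL = "cosyvoice-v3.5-plus"


@dataclass
class CustomVoice:
    voice_id: str
    label: str
    origin: str  # assumed values: "clone" or "design"
    target_model: str = DEFAULT_TARGET_MODEL


@dataclass
class SeriesVoiceStore:
    """Hypothetical in-memory stand-in for the per-series custom voice list."""

    voices_by_series: Dict[str, List[CustomVoice]] = field(default_factory=dict)

    def add(self, series_id: str, voice: CustomVoice) -> None:
        self.voices_by_series.setdefault(series_id, []).append(voice)

    def list_voices(self, series_id: str) -> List[CustomVoice]:
        # Return a copy so callers cannot mutate the stored list.
        return list(self.voices_by_series.get(series_id, []))

    def find(self, voice_id: str) -> Optional[CustomVoice]:
        for voices in self.voices_by_series.values():
            for voice in voices:
                if voice.voice_id == voice_id:
                    return voice
        return None

    def remove(self, series_id: str, voice_id: str) -> bool:
        # Local removal only; no remote delete call is made here.
        voices = self.voices_by_series.get(series_id, [])
        kept = [v for v in voices if v.voice_id != voice_id]
        if len(kept) == len(voices):
            return False
        self.voices_by_series[series_id] = kept
        return True
```

A preview handler could call `find(voice_id)` and, when it returns a voice, pass that voice's target_model through as the override to synthesize().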

Smoke-tested: unknown voice_id falls through to self.model as expected
(registry lookup graceful). model_override path wires correctly through
both family branches.

Next: #3 frontend api.ts wrappers + #4 VoiceCloneModal sub-modal +
VoicePickerModal Tab 2 integration.
Star-Lotus pushed a commit that referenced this pull request May 28, 2026
Voice cloning now fully functional from Cast picker UI.

frontend/src/lib/api.ts:
- New CustomVoice type (mirrors backend models.py CustomVoice)
- cloneVoice({series_id, audio_url, label, target_model?}) → CustomVoice
- listCustomVoices(seriesId) → CustomVoice[]
- deleteCustomVoice(seriesId, voiceId) → {removed}

New frontend/src/components/modules/cast/VoiceCloneModal.tsx (~250 LOC):
- Per Q16.2 recommendation B: dedicated sub-modal opened from VoicePickerModal's
  "+ Upload new audio" (+ 上传新音频) button. Keeps picker focused on selection; creation
  has its own UX (drop zone + label + submit)
- Phases: pick → uploading → cloning → done (with status banners) /
  error (with retry path via cancel + reopen)
- Frontend pre-validation: MP3/WAV/M4A, ≤10MB; backend hard-rejects
  otherwise (dashscope side enforces audio quality)
- Drop zone: drag-and-drop + click-to-pick; auto-suggests label from
  filename (minus extension, ≤30 chars)
- Two-step flow: POST /upload → URL → POST /voice/clone {url, label}
- Block-close during in-flight phases to prevent orphaned uploads
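
The pre-validation and label suggestion rules listed above can be sketched as follows. The real component is TypeScript; Python is used here only to keep one language across the examples. The function names are hypothetical, and treating 10MB as 10 × 1024 × 1024 bytes is an assumption, as is checking the type by file extension.

```python
from pathlib import PurePath
from typing import Optional

ALLOWED_EXTENSIONS = {".mp3", ".wav", ".m4a"}
MAX_SIZE_BYTES = 10 * 1024 * 1024  # assumed reading of "≤10MB"
MAX_LABEL_CHARS = 30


def validate_audio(filename: str, size_bytes: int) -> Optional[str]:
    """Return an i18n error key, or None when the file passes pre-validation."""
    if PurePath(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        return "errorBadType"
    if size_bytes > MAX_SIZE_BYTES:
        return "errorTooLarge"
    return None


def suggest_label(filename: str) -> str:
    """Suggest a label from the filename: drop the extension, cap the length."""
    return PurePath(filename).stem[:MAX_LABEL_CHARS]
```

This check is only a convenience for the user; as noted above, the backend still rejects invalid uploads on its own.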

VoicePickerModal extensions:
- New seriesId? prop (when null/missing, Tab 2/3 show a "Link to a series first"
  (需要先关联到系列) placeholder; orphan project handling per spec)
- Loads custom_voices on open via api.listCustomVoices(seriesId)
  in parallel with /voices fetch (Promise.all)
- handlePreviewById extracted from handlePreview; works for both
  VoiceMeta (system) and CustomVoice (clones) — unified preview path
- New CustomVoiceList sub-component: "+ Upload new audio" (+ 上传新音频) button at top +
  grid of cards (label + origin tag + target_model + ▶ + 🗑) +
  empty state when no clones yet
- Tab 2 ("My Clones", 我的复刻) now renders CustomVoiceList when seriesId is set; else
  NeedsSeriesPlaceholder
- Apply lookup unified: checks voices ∪ customVoices for selectedId,
  resolves name from VoiceMeta.name or CustomVoice.label fallback
- handleCloneCreated: refreshes list + auto-selects new clone +
  switches to clone tab so user sees their new voice
- handleDeleteCustom: confirms before delete, refreshes, unselects
  if was selected
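
The unified Apply lookup (system voices first, then custom voices, with the label as the name fallback) might look roughly like this. It is sketched in Python rather than the component's TypeScript, with plain dicts standing in for VoiceMeta and CustomVoice; the `voice_id` key and the function name are assumptions.

```python
from typing import Dict, List, Optional


def resolve_display_name(
    selected_id: str,
    voices: List[Dict[str, str]],
    custom_voices: List[Dict[str, str]],
) -> Optional[str]:
    """Look up selected_id in system voices, then in custom voices."""
    for voice in voices:
        if voice["voice_id"] == selected_id:
            return voice["name"]  # VoiceMeta.name
    for custom in custom_voices:
        if custom["voice_id"] == selected_id:
            return custom["label"]  # CustomVoice.label fallback
    return None  # selection matches neither list
```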

Cast.tsx: pass currentProject.series_id to VoicePickerModal.

i18n:
- voiceClone namespace (15 keys × 2 langs): title/dropHint/requirements/
  labelLabel/labelPlaceholder/submit/cancel/done/uploading/cloning/
  doneCloned/errorBadType/errorTooLarge
- voicePicker: cloneEmptyTitle (changed from "coming soon" to "no
  cloned voices"), cloneEmptyBody, cloneCreateBtn, cloneNeedsSeries,
  originClone, originDesign, confirmDelete

End-to-end working flow:
1. User hovers character card → clicks voice chip → picker opens
2. Switches to the "My Clones" (我的复刻) tab → sees existing clones + "+ Upload new audio" (+ 上传新音频)
3. Click + → VoiceCloneModal opens → drop audio + label → submit
4. Wait ~10-30s (uploading + cloning banners) → success → close
5. Picker refreshes → new clone selected + visible → preview ▶ to verify
6. Click "Apply" (应用) → character.voice_id bound to the new clone

PR-3h COMPLETE. Next: PR-3i (voice design with character.description
one-click → voice_prompt) reuses same sub-modal pattern.
Star-Lotus pushed a commit that referenced this pull request May 28, 2026
- VoiceDesignModal: write voice_prompt → preview → tweak → accept loop
- ✨ "Generate from character description" (由角色描述生成) button calls /voice/design/translate when character.description is present
- VoicePickerModal Tab 3 now mirrors Tab 2: list + delete + auto-select on create
- i18n keys: voiceDesign.* + voicePicker.{designCreateBtn,designNeedsSeries}
