Per-segment audio effects DSP preset selector (closes #67, rebased from #68)#109
Conversation
…duplicate sys.path
# Conflicts: # backend/api/routers/dub_generate.py # backend/api/routers/generation.py
📝 WalkthroughWalkthroughThis PR adds per-segment audio DSP effect presets to the generation pipelines. Segments can now select from available presets (defaulting to "broadcast"); the choice is applied during TTS synthesis via mastering, effect chain lookup, and normalization, with a "raw" bypass option. Effect preset changes trigger segment regeneration via updated fingerprints. Backend, frontend types, and state management are extended to support the feature, with comprehensive test coverage. ChangesEffect Preset DSP Pipeline
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
backend/services/incremental.py (1)
23-38:⚠️ Potential issue | 🟠 Major | ⚡ Quick winHash the canonical preset value, not the raw field.
The generation paths treat a missing
effect_presetas"broadcast", butsegment_fingerprint()hashesNoneas"". That makes omitted and explicit default presets produce identical audio while getting different fingerprints, so incremental regen will miss cache hits and regenerate unchanged segments.💡 Suggested fix
_GEN_INPUT_FIELDS = ("text", "target_lang", "profile_id", "instruct", "speed", "direction", "effect_preset") @@ def segment_fingerprint(seg: dict) -> str: @@ - payload = {k: (seg.get(k) if seg.get(k) is not None else "") for k in _GEN_INPUT_FIELDS} + payload = {} + for k in _GEN_INPUT_FIELDS: + if k == "effect_preset": + payload[k] = seg.get(k) or "broadcast" + else: + payload[k] = seg.get(k) if seg.get(k) is not None else "" blob = json.dumps(payload, sort_keys=True, ensure_ascii=False) return hashlib.sha1(blob.encode("utf-8")).hexdigest()[:16]🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/services/incremental.py` around lines 23 - 38, segment_fingerprint is hashing raw seg fields (from _GEN_INPUT_FIELDS) so a missing effect_preset (None) gets turned into "" and produces a different fingerprint than an explicit "broadcast" default; change segment_fingerprint to canonicalize the effect_preset before hashing (i.e., when k == "effect_preset" map None/empty to the generation-default "broadcast" or call the same canonicalization used in the generation path) so omitted and explicit default presets yield the same hash.backend/api/routers/dub_generate.py (1)
111-179:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPreview audio will drift from exported audio for non-default presets.
This wires
effect_presetthrough/dub/generate, butpreview_segment()in the same file still renders with the legacy mastering-only path. A segment set tocinematic,podcast,warm,bright, orrawcan sound different in preview than in the final dub. Thread the preset through the preview request and reuse the same DSP branch there.Also applies to: 268-272
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/api/routers/dub_generate.py` around lines 111 - 179, The preview path still uses the legacy mastering-only flow, so preview_segment() must accept and use the effect_preset and apply the exact DSP branch used in _gen: thread effect_preset into preview_segment (and any callers in this file), compute seg_effect_preset = effect_preset or "broadcast", return raw if seg_effect_preset == "raw", otherwise run apply_mastering -> get_effect_chain -> apply_effects_chain -> normalize_audio (same parameters/target_dBFS as _gen), and preserve the existing fallback/default behaviors for missing presets or None effect_preset.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@backend/api/routers/generation.py`:
- Around line 51-75: The post-processing (apply_mastering, get_effect_chain,
apply_effects_chain, normalize_audio) is currently executed inside the broad OOM
translation/except block so DSP bugs surface as fake "out of memory" errors;
refactor so only the call to model.generate(...) is wrapped by the OOM-specific
try/except and re-raise/translate OOMs there, then perform effect_preset
validation and all post-processing (apply_mastering, get_effect_chain,
apply_effects_chain, normalize_audio) after that try/except so their real
exceptions propagate normally.
---
Outside diff comments:
In `@backend/api/routers/dub_generate.py`:
- Around line 111-179: The preview path still uses the legacy mastering-only
flow, so preview_segment() must accept and use the effect_preset and apply the
exact DSP branch used in _gen: thread effect_preset into preview_segment (and
any callers in this file), compute seg_effect_preset = effect_preset or
"broadcast", return raw if seg_effect_preset == "raw", otherwise run
apply_mastering -> get_effect_chain -> apply_effects_chain -> normalize_audio
(same parameters/target_dBFS as _gen), and preserve the existing
fallback/default behaviors for missing presets or None effect_preset.
In `@backend/services/incremental.py`:
- Around line 23-38: segment_fingerprint is hashing raw seg fields (from
_GEN_INPUT_FIELDS) so a missing effect_preset (None) gets turned into "" and
produces a different fingerprint than an explicit "broadcast" default; change
segment_fingerprint to canonicalize the effect_preset before hashing (i.e., when
k == "effect_preset" map None/empty to the generation-default "broadcast" or
call the same canonicalization used in the generation path) so omitted and
explicit default presets yield the same hash.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: c1bcbe1a-1aa0-445d-97c1-b7b790debc62
📒 Files selected for processing (11)
backend/api/routers/dub_generate.pybackend/api/routers/engines.pybackend/api/routers/generation.pybackend/api/schemas.pybackend/schemas/requests.pybackend/services/batched_tts.pybackend/services/incremental.pyfrontend/src/api/engines.tsfrontend/src/api/types.tsfrontend/src/store/dubSlice.tstests/test_effects_chain.py
| sr = model.sampling_rate if hasattr(model, 'sampling_rate') else 24000 | ||
|
|
||
| # Apply DSP effect preset | ||
| _effect_preset = effect_preset or "broadcast" | ||
|
|
||
| # Validate preset ID | ||
| from services.audio_dsp import EFFECT_PRESETS | ||
| if _effect_preset not in EFFECT_PRESETS: | ||
| raise ValueError( | ||
| f"Unknown effect preset: {_effect_preset!r}. " | ||
| f"Valid: {list(EFFECT_PRESETS.keys())}" | ||
| ) | ||
|
|
||
| if _effect_preset == "raw": | ||
| # Raw: skip all DSP — return raw model output | ||
| return audio_out | ||
|
|
||
| mastered_audio = apply_mastering(audio_out, sample_rate=sr) | ||
| _chain = get_effect_chain(_effect_preset) | ||
| if _chain: | ||
| mastered_audio = apply_effects_chain( | ||
| mastered_audio, sample_rate=sr, chain=_chain, | ||
| ) | ||
|
|
||
| return normalize_audio(mastered_audio, target_dBFS=-2.0) |
There was a problem hiding this comment.
Keep DSP failures out of the OOM wrapper.
These new post-processing calls still run inside the broad except Exception below, so a bug in the effect chain will now come back as a bogus “out of memory” error. Scope the OOM translation to model.generate(...) only, and let mastering/effects/normalize failures propagate with their real cause.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@backend/api/routers/generation.py` around lines 51 - 75, The post-processing
(apply_mastering, get_effect_chain, apply_effects_chain, normalize_audio) is
currently executed inside the broad OOM translation/except block so DSP bugs
surface as fake "out of memory" errors; refactor so only the call to
model.generate(...) is wrapped by the OOM-specific try/except and
re-raise/translate OOMs there, then perform effect_preset validation and all
post-processing (apply_mastering, get_effect_chain, apply_effects_chain,
normalize_audio) after that try/except so their real exceptions propagate
normally.
Rebases PR #68 by @4shil onto current main and resolves merge conflicts in
backend/api/routers/{dub_generate,generation}.py(Phase 2'saudio_iohelpers + PR #68'saudio_dspchain helpers were both added to the same import lines).All original work is @4shil's. This PR exists because PR #68's branch lives on a fork I can't force-push to.
Summary (original)
Adds a per-segment audio effects DSP preset selector to the dub pipeline. Users can choose from 6 broadcast-grade DSP presets (broadcast, cinematic, podcast, warm, bright, raw) that apply to TTS-generated audio before mixing.
Changes (original)
GET /engines/effects/presetsendpoint returning the 6 preset definitionseffect_presetparameter on the dub generation pipeline (_genclosure indub_generate.py)rawpreset bypasses mastering entirely (returns the model output unchanged)apply_mastering+normalize_audiosteptests/test_effects_chain.py— 15 unit tests covering preset surface, chain composition, and audio invariantsConflict resolution
Both
dub_generate.pyandgeneration.pyhad identical-shaped conflicts: PR #68 added DSP helper imports while Phase 2 (PR #96) addedaudio_iosafe-helper imports on the adjacent line. Resolved both as a union — keep both sets of imports.Test plan
pytest tests/test_effects_chain.py— 15/15 pass post-rebasepytest tests/smoke/— greenbroadcast/cinematic/raw🤖 Rebased with Claude Code — original PR #68 should be closed once this merges
Summary by CodeRabbit
Release Notes
New Features
Tests