Per-segment audio effects DSP preset selector (closes #67, rebased from #68) by debpalash · Pull Request #109 · debpalash/OmniVoice-Studio

debpalash · 2026-05-20T07:59:10Z

Rebases PR #68 by @4shil onto current main and resolves merge conflicts in backend/api/routers/{dub_generate,generation}.py (Phase 2's audio_io helpers + PR #68's audio_dsp chain helpers were both added to the same import lines).

All original work is @4shil's. This PR exists because PR #68's branch lives on a fork I can't force-push to.

Summary (original)

Adds a per-segment audio effects DSP preset selector to the dub pipeline. Users can choose from 6 broadcast-grade DSP presets (broadcast, cinematic, podcast, warm, bright, raw) that apply to TTS-generated audio before mixing.

Changes (original)

New GET /engines/effects/presets endpoint returning the 6 preset definitions
Per-segment effect_preset parameter on the dub generation pipeline (_gen closure in dub_generate.py)
raw preset bypasses mastering entirely (returns the model output unchanged)
All other presets layer effect chains on top of the existing apply_mastering + normalize_audio step
tests/test_effects_chain.py — 15 unit tests covering preset surface, chain composition, and audio invariants
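
The preset flow listed above can be sketched in a few lines. This is a minimal illustration rather than the PR's code: audio is modelled as a plain list of floats, `apply_mastering` and `normalize_audio` are placeholder stubs standing in for the real helpers, and the contents of `EFFECT_CHAINS` are invented. Only the six preset names and the ordering (mastering, then effect chain, then normalization, with `raw` bypassing everything) come from the description.

```python
from typing import Callable, Dict, List

Audio = List[float]
Effect = Callable[[Audio], Audio]


# Placeholder DSP stages (assumptions): the real helpers work on audio arrays.
def apply_mastering(audio: Audio) -> Audio:
    return [s * 0.9 for s in audio]


def normalize_audio(audio: Audio, peak: float = 0.8) -> Audio:
    top = max((abs(s) for s in audio), default=0.0)
    return audio if top == 0 else [s * peak / top for s in audio]


# Hypothetical chains; only the preset names are taken from the PR.
EFFECT_CHAINS: Dict[str, List[Effect]] = {
    "broadcast": [],
    "cinematic": [lambda a: [s * 0.5 for s in a]],
    "podcast": [],
    "warm": [],
    "bright": [],
    "raw": [],
}


def process_segment(audio: Audio, effect_preset: str = "broadcast") -> Audio:
    if effect_preset not in EFFECT_CHAINS:
        raise ValueError(f"Unknown effect preset: {effect_preset!r}")
    if effect_preset == "raw":
        return audio  # bypass: the model output is returned unchanged
    out = apply_mastering(audio)
    for effect in EFFECT_CHAINS[effect_preset]:
        out = effect(out)
    return normalize_audio(out)
```

Keeping the `raw` check ahead of the mastering call is what makes the bypass total: no stage touches the samples at all.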

Conflict resolution

Both dub_generate.py and generation.py had identical-shaped conflicts: PR #68 added DSP helper imports while Phase 2 (PR #96) added audio_io safe-helper imports on the adjacent line. Resolved both as a union — keep both sets of imports.

Test plan

pytest tests/test_effects_chain.py — 15/15 pass post-rebase
pytest tests/smoke/ — green
Manual: open dub job, switch preset selector, verify audible difference between broadcast / cinematic / raw

🤖 Rebased with Claude Code — original PR #68 should be closed once this merges

Summary by CodeRabbit

Release Notes

New Features
- Audio effect presets now available for generated audio with multiple preset options to choose from
- New API endpoint to list available effect presets and their metadata
- Per-segment effect preset selection and configuration capability
- Effect presets consistently applied during both initial and retry generation attempts

Tests
- Added comprehensive test suite for effect presets validation and audio processing pipeline

coderabbitai · 2026-05-20T07:59:22Z

Walkthrough

This PR adds per-segment audio DSP effect presets to the generation pipelines. Segments can now select from available presets (defaulting to "broadcast"); the choice is applied during TTS synthesis via mastering, effect chain lookup, and normalization, with a "raw" bypass option. Effect preset changes trigger segment regeneration via updated fingerprints. Backend, frontend types, and state management are extended to support the feature, with comprehensive test coverage.

Changes

Effect Preset DSP Pipeline

| Layer / File(s) | Summary |
| --- | --- |
| Data contracts and effect preset validation: `backend/api/schemas.py`, `backend/schemas/requests.py` | `EffectPresetEntry` and `EffectPresetsResponse` schemas are defined for the presets API; `DubSegment` gains a validated `effect_preset` field (default "broadcast") with a validator that enforces membership in the available preset registry. |
| Effect presets listing endpoint: `backend/api/routers/engines.py` | `GET /engines/effects/presets` endpoint is added, importing `list_effect_presets` and returning the available presets via `EffectPresetsResponse`. |
| Generation endpoint effect preset integration: `backend/api/routers/generation.py` | `/generate` endpoint gains `effect_preset` form parameter (default "broadcast"); `_run_inference` accepts the preset, validates it, computes sample rate, bypasses DSP for "raw", otherwise applies mastering → effect chain → normalization, then passes the preset through to background execution. |
| Dub generation segment DSP pipeline: `backend/api/routers/dub_generate.py` | `_gen` worker accepts `effect_preset`; both first-attempt and OOM-retry generation paths apply preset-driven DSP (mastering, effect chain lookup, normalization) with "raw" bypass; segment loop derives `seg_effect_preset` and passes it to `_gen`; segment fingerprint is extended to include `effect_preset` for change tracking. |
| Batched TTS effect preset support: `backend/services/batched_tts.py` | `SegmentSpec` data container gains `effect_preset` slot; per-segment processing defaults preset to "broadcast", bypasses DSP for "raw", otherwise applies mastering → effect chain → normalization. |
| Incremental redubbing fingerprint update: `backend/services/incremental.py` | `effect_preset` is added to `_GEN_INPUT_FIELDS` so preset changes trigger segment regeneration; `segment_fingerprint` docstring is updated to document the new hash input. |
| Frontend API contracts and client: `frontend/src/api/engines.ts`, `frontend/src/api/types.ts` | TypeScript `EffectPreset` and `EffectPresetsResponse` types are defined; `fetchEffectPresets()` async function calls `/engines/effects/presets`; new `DubSegment` type includes `effect_preset` as optional string field. |
| Frontend effect preset state in dub slice: `frontend/src/store/dubSlice.ts` | `DubSlice` Zustand store is extended with `segmentEffectPresets` (per-segment preset map) and `availableEffectPresets` (loaded presets list); typed setters `setSegmentEffectPreset` and `setAvailableEffectPresets` are implemented; state is initialized with empty defaults. |
| Effects chain DSP unit test suite: `tests/test_effects_chain.py` | Comprehensive pytest module validates preset listing, chain retrieval for known/raw/unknown preset IDs, audio shape preservation, smoke coverage across all presets, clipping bounds for limiter-based presets, and output differentiation between raw and processed audio. |
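
The membership validation described for `DubSegment` can be illustrated with a plain dataclass. This is a sketch under stated assumptions, not the schema from `backend/schemas/requests.py`: `SegmentSketch` and `KNOWN_PRESETS` are hypothetical names, and the real code validates against the backend's preset registry rather than a hard-coded tuple.

```python
from dataclasses import dataclass

# Preset IDs named in the PR description; the real registry lives in the backend.
KNOWN_PRESETS = ("broadcast", "cinematic", "podcast", "warm", "bright", "raw")


@dataclass
class SegmentSketch:
    """Hypothetical stand-in for DubSegment, reduced to two fields."""

    text: str
    effect_preset: str = "broadcast"

    def __post_init__(self) -> None:
        # Reject unknown preset IDs at construction time, before any audio work.
        if self.effect_preset not in KNOWN_PRESETS:
            raise ValueError(
                f"Unknown effect preset: {self.effect_preset!r}. "
                f"Valid: {list(KNOWN_PRESETS)}"
            )
```

Validating at the request boundary means a mistyped preset fails fast with a clear message instead of surfacing later inside a generation worker.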

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related issues

debpalash/OmniVoice-Studio#67: This PR directly implements the per-segment effect_preset feature specified in the issue, including field additions to DubSegment, DSP chain integration in multiple generation paths, and test coverage.

Possibly related PRs

debpalash/OmniVoice-Studio#75: Both PRs modify the _gen worker and OOM-retry logic in backend/api/routers/dub_generate.py; this PR adds effect_preset DSP processing to both first-attempt and retry paths, while the other PR modifies OOM detection and retry behavior, creating a direct coupling.

Poem

🎧 A rabbit hops through audio streams,
Effect presets in generation dreams,
From broadcast glow to raw and clean,
Mastering chains in between,
Each segment sings with style supreme! 🐰✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 29.03% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title clearly and concisely summarizes the main change: adding a per-segment audio effects DSP preset selector to the dub pipeline, with appropriate issue references. |
| Description check | ✅ Passed | The PR description is comprehensive and well-structured. It includes a clear summary of the feature, detailed changes, conflict resolution explanation, test plan with results, and proper attribution to the original author. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

backend/services/incremental.py (1)
23-38: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Hash the canonical preset value, not the raw field.

The generation paths treat a missing effect_preset as "broadcast", but segment_fingerprint() hashes None as "". That makes omitted and explicit default presets produce identical audio while getting different fingerprints, so incremental regen will miss cache hits and regenerate unchanged segments.
💡 Suggested fix
 _GEN_INPUT_FIELDS = ("text", "target_lang", "profile_id", "instruct", "speed", "direction", "effect_preset")
@@
 def segment_fingerprint(seg: dict) -> str:
@@
-    payload = {k: (seg.get(k) if seg.get(k) is not None else "") for k in _GEN_INPUT_FIELDS}
+    payload = {}
+    for k in _GEN_INPUT_FIELDS:
+        if k == "effect_preset":
+            payload[k] = seg.get(k) or "broadcast"
+        else:
+            payload[k] = seg.get(k) if seg.get(k) is not None else ""
     blob = json.dumps(payload, sort_keys=True, ensure_ascii=False)
     return hashlib.sha1(blob.encode("utf-8")).hexdigest()[:16]
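
The mismatch can be reproduced in isolation. The following is a self-contained sketch that reuses the field tuple and hashing steps from the diff above; the function names `fingerprint_raw` and `fingerprint_canonical` and the `DEFAULT_PRESET` constant are illustrative, and the default is assumed to match the one used by the generation paths.

```python
import hashlib
import json

_GEN_INPUT_FIELDS = ("text", "target_lang", "profile_id", "instruct", "speed", "direction", "effect_preset")
DEFAULT_PRESET = "broadcast"  # assumed to match the generation-path default


def _digest(payload: dict) -> str:
    blob = json.dumps(payload, sort_keys=True, ensure_ascii=False)
    return hashlib.sha1(blob.encode("utf-8")).hexdigest()[:16]


def fingerprint_raw(seg: dict) -> str:
    """Behaviour flagged by the review: a missing preset is hashed as ""."""
    return _digest({k: (seg.get(k) if seg.get(k) is not None else "") for k in _GEN_INPUT_FIELDS})


def fingerprint_canonical(seg: dict) -> str:
    """Suggested behaviour: a missing or empty preset is hashed as the default."""
    payload = {}
    for k in _GEN_INPUT_FIELDS:
        if k == "effect_preset":
            payload[k] = seg.get(k) or DEFAULT_PRESET
        else:
            payload[k] = seg.get(k) if seg.get(k) is not None else ""
    return _digest(payload)


implicit = {"text": "Hello", "target_lang": "es"}
explicit = {**implicit, "effect_preset": "broadcast"}
```

With these two segments, `fingerprint_raw` returns different digests for `implicit` and `explicit`, while `fingerprint_canonical` returns the same digest for both and still changes when the preset is switched to a non-default value.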
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/services/incremental.py` around lines 23 - 38, segment_fingerprint is
hashing raw seg fields (from _GEN_INPUT_FIELDS) so a missing effect_preset
(None) gets turned into "" and produces a different fingerprint than an explicit
"broadcast" default; change segment_fingerprint to canonicalize the
effect_preset before hashing (i.e., when k == "effect_preset" map None/empty to
the generation-default "broadcast" or call the same canonicalization used in the
generation path) so omitted and explicit default presets yield the same hash.
backend/api/routers/dub_generate.py (1)
111-179: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preview audio will drift from exported audio for non-default presets.

This wires effect_preset through /dub/generate, but preview_segment() in the same file still renders with the legacy mastering-only path. A segment set to cinematic, podcast, warm, bright, or raw can sound different in preview than in the final dub. Thread the preset through the preview request and reuse the same DSP branch there.

Also applies to: 268-272
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/api/routers/dub_generate.py` around lines 111 - 179, The preview path
still uses the legacy mastering-only flow, so preview_segment() must accept and
use the effect_preset and apply the exact DSP branch used in _gen: thread
effect_preset into preview_segment (and any callers in this file), compute
seg_effect_preset = effect_preset or "broadcast", return raw if
seg_effect_preset == "raw", otherwise run apply_mastering -> get_effect_chain ->
apply_effects_chain -> normalize_audio (same parameters/target_dBFS as _gen),
and preserve the existing fallback/default behaviors for missing presets or None
effect_preset.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: c1bcbe1a-1aa0-445d-97c1-b7b790debc62

📥 Commits

Reviewing files that changed from the base of the PR and between 424ad76 and 4be4095.

📒 Files selected for processing (11)

backend/api/routers/dub_generate.py
backend/api/routers/engines.py
backend/api/routers/generation.py
backend/api/schemas.py
backend/schemas/requests.py
backend/services/batched_tts.py
backend/services/incremental.py
frontend/src/api/engines.ts
frontend/src/api/types.ts
frontend/src/store/dubSlice.ts
tests/test_effects_chain.py

coderabbitai · 2026-05-20T08:03:52Z

+        sr = model.sampling_rate if hasattr(model, 'sampling_rate') else 24000
+
+        # Apply DSP effect preset
+        _effect_preset = effect_preset or "broadcast"
+
+        # Validate preset ID
+        from services.audio_dsp import EFFECT_PRESETS
+        if _effect_preset not in EFFECT_PRESETS:
+            raise ValueError(
+                f"Unknown effect preset: {_effect_preset!r}. "
+                f"Valid: {list(EFFECT_PRESETS.keys())}"
+            )
+
+        if _effect_preset == "raw":
+            # Raw: skip all DSP — return raw model output
+            return audio_out
+
+        mastered_audio = apply_mastering(audio_out, sample_rate=sr)
+        _chain = get_effect_chain(_effect_preset)
+        if _chain:
+            mastered_audio = apply_effects_chain(
+                mastered_audio, sample_rate=sr, chain=_chain,
+            )
+
        return normalize_audio(mastered_audio, target_dBFS=-2.0)


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep DSP failures out of the OOM wrapper.

These new post-processing calls still run inside the broad except Exception below, so a bug in the effect chain will now come back as a bogus “out of memory” error. Scope the OOM translation to model.generate(...) only, and let mastering/effects/normalize failures propagate with their real cause.
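
A minimal sketch of the scoping this comment asks for is shown below. It is not the code in `generation.py`: `run_inference`, `GenerationOOMError`, and the use of `MemoryError` as the trigger are stand-ins chosen so the example runs without a model, and the backend's actual OOM detection is different.

```python
from typing import Callable, List

Audio = List[float]


class GenerationOOMError(RuntimeError):
    """Hypothetical error type reported to the caller as 'out of memory'."""


def run_inference(
    generate: Callable[[], Audio],
    post_process: Callable[[Audio], Audio],
) -> Audio:
    # Only the model call sits inside the try block.
    try:
        audio = generate()
    except MemoryError as exc:  # stand-in for the backend's real OOM detection
        raise GenerationOOMError("out of memory during generation") from exc
    # DSP runs outside the wrapper, so its failures keep their real type.
    return post_process(audio)
```

Because `post_process` is called after the `try`/`except`, an exception raised by mastering, the effect chain, or normalization reaches the caller unchanged, and `raise ... from exc` keeps the original OOM traceback attached when translation does happen.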

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/api/routers/generation.py` around lines 51 - 75, The post-processing (apply_mastering, get_effect_chain, apply_effects_chain, normalize_audio) is currently executed inside the broad OOM translation/except block so DSP bugs surface as fake "out of memory" errors; refactor so only the call to model.generate(...) is wrapped by the OOM-specific try/except and re-raise/translate OOMs there, then perform effect_preset validation and all post-processing (apply_mastering, get_effect_chain, apply_effects_chain, normalize_audio) after that try/except so their real exceptions propagate normally.

4shil and others added 9 commits May 16, 2026 14:57

Add per-segment audio effects DSP preset selector to dub pipeline

472d1e5

Add shape assertions to podcast, warm, and bright preset tests

0c257fe

Fix raw preset semantics, add preset validation, update docs, remove …

fd7b908

…duplicate sys.path

Narrow OOM catch to model.generate only in dub_generate

943c811

Preserve original OOM exception context in dub_generate

6b325b2

Bind effect_preset to _gen via explicit parameter to avoid loop capture

924f842

Catch RuntimeError instead of torch.mps.MPSError for MPS OOM

9ea6879

Merge upstream/main — resolve conflict in dub_generate, preserve DSP …

027f9b2

…effect presets

Merge branch 'main' into pr-68-fresh

4be4095

# Conflicts: # backend/api/routers/dub_generate.py # backend/api/routers/generation.py

coderabbitai Bot reviewed May 20, 2026

debpalash merged commit 1edd35c into main May 20, 2026
8 checks passed

debpalash deleted the pr-68-fresh branch May 20, 2026 08:10

debpalash mentioned this pull request May 20, 2026

Add per-segment audio effects DSP preset selector to dub pipeline #68

Closed

4 tasks

This was referenced Jun 11, 2026

fix(tts): /generate honors the selected TTS engine (#312) #324

Merged

fix(dub): re-dub honors transcript edits — fingerprints canonicalised, preview cache-busted, atomic mux (#281) #329

Merged
