Agc improvements and improve gain control stability by UnknownSuperficialNight · Pull Request #882 · RustAudio/rodio

UnknownSuperficialNight · 2026-05-11T13:19:33Z

This PR focusses mostly on adding stability to AGC through the slowdown_factor and miscellaneous improvements.

I've been experimenting with the AGC to find ways to stabilise it. This is the result.

The compute_slowdown_factor functions as a third control layer that measures proximity to the target gain alongside standard RMS and peak metrics. It acts as a dynamic throttle, adjusting the AGC rate of change based on how close the signal is to the desired level. The slowdown logic activates only when the current gain falls within the combined RMS+peak tolerance window relative to the target. When the input is loud, the tolerance window widens; with quieter signals, it contracts.

Inside this boundary, exponential scaling prevents the harsh jumps and oscillations that occurred with fixed-rate adjustments. As the signal approaches the target, the slowdown increases to reduce the AGC rate of change and produce smoother behaviour. Outside this zone, the AGC uses normal responsiveness, which allows for more rapid correction when needed. The tolerance window is bounded by the combined RMS+peak metric.

By managing these ranges, the system enables faster attack times without flattening audio dynamics. Previously, aggressive speeds would normalise all sounds to a flat line. Now the AGC can accelerate adjustments when far from the target but slows down exponentially as it approaches the goal. This preserves audio depth while maintaining stability: quick reactions when needed, with gradual stabilisation near the final level, preventing gain overshoot and sudden volume spikes that can occur with fixed-rate adjustments.
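The throttling idea described above could be sketched roughly as follows. This is not the PR's `compute_slowdown_factor`: the names `slowdown_multiplier` and `step_gain`, the `steepness` parameter, and the exact exponential curve are assumptions chosen only to illustrate "full speed outside the tolerance window, exponentially slower near the target".

```rust
/// Hypothetical sketch of a proximity-based throttle (not the PR's code).
/// Returns a multiplier in (0, 1] for the gain step: 1.0 outside the
/// tolerance window, shrinking exponentially as `current_gain`
/// approaches `target_gain` inside the window.
fn slowdown_multiplier(current_gain: f32, target_gain: f32, tolerance: f32, steepness: f32) -> f32 {
    let distance = (target_gain - current_gain).abs();
    // Outside the window, or with a degenerate window: normal responsiveness.
    if tolerance.is_nan() || tolerance <= 0.0 || distance >= tolerance {
        return 1.0;
    }
    // 0.0 at the window edge, 1.0 exactly on target.
    let closeness = 1.0 - distance / tolerance;
    // e^0 = 1 at the edge, e^(-steepness) on target.
    (-steepness * closeness).exp()
}

/// Moves the gain toward the target by a throttled fraction of the error.
/// The steepness of 4.0 is an arbitrary illustrative value.
fn step_gain(current_gain: f32, target_gain: f32, base_rate: f32, tolerance: f32) -> f32 {
    let throttle = slowdown_multiplier(current_gain, target_gain, tolerance, 4.0);
    current_gain + (target_gain - current_gain) * base_rate * throttle
}
```

In this sketch the tolerance is passed in as a plain number; in the PR it is derived from the combined RMS+peak metric.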

`update_peak_level` Optimisation

This function was a performance hotspot due to per-sample allocation and branching. Previously, we computed a conditional coefficient for each sample: a fast attack coefficient (0.0) when the sample exceeded the peak, and a slow release coefficient otherwise.

I've replaced this with a branchless implementation that uses a fixed release_coefficient (which is always cached), eliminating the per-sample if branch and allocation.

Before (Slow, Branching + Allocation):

// This was allocating each sample
let coeff = if sample_value > self.peak_level {
    // Fast attack for rising peaks
    0.0
} else {
    // Slow release for falling peaks
    release_coeff
};
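For comparison, here is a minimal sketch of how the same selection can be expressed without an `if`. It assumes the peak is updated with a one-pole blend `coeff * peak + (1 - coeff) * sample`, which the snippet above does not show, so treat the update formula and both function names as assumptions. Under that assumption, taking the maximum of the sample and the release blend gives the same result as picking the coefficient per sample, up to floating-point rounding.

```rust
/// Branching reference: the coefficient is chosen per sample.
fn update_peak_branching(peak: f32, sample: f32, release_coeff: f32) -> f32 {
    let coeff = if sample > peak { 0.0 } else { release_coeff };
    coeff * peak + (1.0 - coeff) * sample
}

/// Variant without an explicit `if`: always compute the release blend,
/// then let `max` select the instant attack when the sample is higher.
/// Assumes 0.0 <= release_coeff <= 1.0.
fn update_peak_branchless(peak: f32, sample: f32, release_coeff: f32) -> f32 {
    let released = release_coeff * peak + (1.0 - release_coeff) * sample;
    // `f32::max` returns the non-NaN operand if exactly one side is NaN.
    sample.max(released)
}
```

Whether `max` actually compiles to a branch-free instruction depends on the target and optimisation level.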

Other changes in this PR

CircularBufferRMS now uses sum-of-squares internally and is cleaned up.
Attack and release times are now raw floats instead of coefficients.
Added div_or_fallback helper to safely divide by non-NaN, non-infinite, positive values.
NaN guards added to RMS and peak logic to prevent either from getting corrupted.
Added fast_exp helper using Horner's method for exp(x) approximation in compute_slowdown_factor.
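As an illustration of the two helpers named in this list, the following is a sketch only. The signatures are assumptions (the PR's `fast_exp` takes and returns rodio's `Float` alias, and the real `div_or_fallback` signature is not shown here). The polynomial is the third-order Taylor expansion of exp(x) around 0, evaluated with Horner's method, so it is only accurate for small |x|.

```rust
/// Divides only when the denominator is finite and strictly positive;
/// otherwise returns the caller-supplied fallback. (Assumed signature.)
fn div_or_fallback(numerator: f32, denominator: f32, fallback: f32) -> f32 {
    if denominator.is_finite() && denominator > 0.0 {
        numerator / denominator
    } else {
        fallback
    }
}

/// exp(x) ~= 1 + x + x^2/2 + x^3/6, written in Horner form so that it
/// needs three multiplications and three additions.
fn fast_exp(x: f32) -> f32 {
    1.0 + x * (1.0 + x * (0.5 + x * (1.0 / 6.0)))
}
```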

Benchmarks

Benchmarks before:

Timer precision: 20 ns
effects         fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ agc_enabled  12.38 ms      │ 13.49 ms      │ 12.48 ms      │ 12.54 ms      │ 100     │ 100

Benchmarks after the changes and redesign:

Timer precision: 20 ns
effects         fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ agc_enabled  9.145 ms      │ 12.98 ms      │ 9.209 ms      │ 9.408 ms      │ 100     │ 100

Concerns

The Libopus decoder can output samples above 1.0, such as 1.1, 1.064, and similar values, for both RMS and peak readings depending on the track. This behaviour is not observed with the FLAC decoder.

These out-of-range samples cause errors downstream, particularly when offsetting the current gain below 1.0 while targeting 1.0. I've added .min(1.0) to ensure the gain never exceeds the cap/limit for RMS and peak.

The root cause is with the Libopus decoder, as far as I can tell, which should not output values above 1.0 in the first place.

This is probably worth investigating: is this behaviour by design in Libopus, or is there something wrong upstream of the effect?

Potential Improvements

Lookahead Buffer: Rodio does not natively support this, but adding a buffer would allow the gain to be adjusted gradually before a spike/kick occurs.
Dynamic Buffer Size: Adjust the size based on the sample rate to maintain a consistent ~20ms window (e.g., 2048 for 96kHz). This keeps the window duration roughly the same across sample rates.
Pseudocode Example:

fn buffer_size(sample_rate: u32) -> usize {
    match sample_rate {
        96_000 => 2048,
        192_000 => 4096,
        _ => 1024,
    }
}
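A possible generalisation of the lookup above, as a sketch: compute the number of samples in 20ms and round up to the next power of two. The function name and the rounding choice are assumptions, not part of the PR. It reproduces 1024 at 44.1kHz and 48kHz, 2048 at 96kHz, and 4096 at 192kHz, but unlike the `match` it also shrinks the window for lower rates (for example 512 at 22.05kHz).

```rust
/// Hypothetical helper: window size covering roughly 20 ms of audio,
/// rounded up to a power of two.
fn window_size_for(sample_rate: u32) -> usize {
    const WINDOW_MS: u64 = 20;
    // Integer arithmetic in u64 to avoid overflow for large sample rates.
    let samples = (sample_rate as u64 * WINDOW_MS / 1000).max(1) as usize;
    samples.next_power_of_two()
}
```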

Speech Profile: Adding a profile tune for dedicated speech to AutomaticGainControlSettings might be a good idea.

Video Comparison

Before:

before_normal.mp4

After:

after_normal.mp4

Before near the loudness limit

near_limit_before.mp4

After near the loudness limit

near_limit_after.mp4

Additional notes

This can be tuned back to how it worked originally if users prefer the more normalised sound.
It might even be worth adding a toggle for the slowdown so that it can be disabled.

…bility
- Replace coefficient-based `attack/release` with direct `Duration` types
- Reduce `RMS_WINDOW_SIZE` from `8192` to `512` samples to lower latency
- Switch RMS calculation from mean-based buffer (`CircularBuffer`) to sum-of-squares approach in `CircularBufferRMS` for accurate root-mean-square values
- Introduce `SlowDownState` struct that manages timing and caching: counts samples in 2ms blocks, computes adaptive `slowdown_factor` using `compute_slowdown_factor` and caches the result for reuse
- Implement `fast_exp` using Horner's method for efficient exponential approximation of release coefficients (third-order Taylor polynomial)
- Add `NaN` handling in RMS calculation to prevent invalid values
- Add rate limiting to gain changes: clamp gain change per sample based on dynamic attack/release duration to prevent overshooting
- Add new `peak_tracking_window` setting to control peak level smoothing
- Tune default timing parameters: 500ms attack, 0.5ms release, 10ms peak tracking window for balanced behaviour
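To make the sum-of-squares approach concrete, here is a minimal sliding-window RMS sketch. It is not the PR's `CircularBufferRMS`: the type name `SlidingRms`, the choice to treat non-finite samples as silence, and the clamp against negative rounding drift are all assumptions for illustration.

```rust
/// Hypothetical sliding-window RMS that keeps a running sum of squares.
struct SlidingRms {
    squares: Vec<f32>,
    index: usize,
    sum_of_squares: f32,
}

impl SlidingRms {
    fn new(window: usize) -> Self {
        Self { squares: vec![0.0; window.max(1)], index: 0, sum_of_squares: 0.0 }
    }

    /// Pushes one sample and returns the RMS over the window.
    fn push(&mut self, sample: f32) -> f32 {
        // NaN/infinity guard (assumed policy): treat bad samples as silence.
        let sample = if sample.is_finite() { sample } else { 0.0 };
        let squared = sample * sample;
        // Swap the oldest squared value out and the new one in.
        let oldest = std::mem::replace(&mut self.squares[self.index], squared);
        // Clamp at zero so rounding drift can never produce sqrt of a negative.
        self.sum_of_squares = (self.sum_of_squares - oldest + squared).max(0.0);
        self.index = (self.index + 1) % self.squares.len();
        (self.sum_of_squares / self.squares.len() as f32).sqrt()
    }
}
```

Each push costs a constant amount of work regardless of the window size, because the sum is updated incrementally instead of being recomputed.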

UnknownSuperficialNight · 2026-05-11T13:20:59Z

-            release_time: Duration::from_secs(0), // Recommended release time
-            absolute_max_gain: 7.0,               // Recommended max gain
+            target_level: 1.0,                               // Default to original level
+            attack_time: Duration::from_millis(500),         // Recommended attack time


This might be too low. I found 500ms or 800ms to be quite nice. I would like some feedback on this if possible.

UnknownSuperficialNight · 2026-05-11T13:24:37Z

    }
 }

 impl<I> Iterator for AutomaticGainControl<I>


I don't know if I implemented the changes for this part correctly for the new algorithm. I think I did, but it would be nice to have someone who is more familiar with it check this.

UnknownSuperficialNight · 2026-05-11T13:25:16Z

+/// It provides a good balance between speed and accuracy, resulting in
+/// faster benchmark times compared to the standard `exp` function.
+#[inline]
+fn fast_exp(x: Float) -> Float {


Might be worth moving this to math.rs?

yara-blue

I also like the idea for multiple profiles. Ideally we also give the "current" default a name, maybe "Music" and "Speech"?

yara-blue · 2026-05-11T18:44:27Z

    }
 }

 impl<I> Iterator for AutomaticGainControl<I>


looks good to me!

yara-blue · 2026-05-11T18:47:19Z

-            absolute_max_gain: 7.0,               // Recommended max gain
+            target_level: 1.0,                               // Default to original level
+            attack_time: Duration::from_millis(500),         // Recommended attack time
+            release_time: Duration::from_nanos(500000),      // Recommended release time


I'd use from_micros here :)
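For reference, a small sketch showing that the two constructors describe the same 0.5ms duration, so the suggestion changes readability only. The function names are illustrative.

```rust
use std::time::Duration;

/// Release time as written in the diff.
fn release_time_from_nanos() -> Duration {
    Duration::from_nanos(500_000)
}

/// The same release time expressed in microseconds.
fn release_time_from_micros() -> Duration {
    Duration::from_micros(500)
}
```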

yara-blue · 2026-05-11T18:53:54Z

    }
 }

 impl<I> Iterator for AutomaticGainControl<I>


Looks good!

yara-blue · 2026-05-11T18:54:24Z

-            release_time: Duration::from_secs(0), // Recommended release time
-            absolute_max_gain: 7.0,               // Recommended max gain
+            target_level: 1.0,                               // Default to original level
+            attack_time: Duration::from_millis(500),         // Recommended attack time


Sorry, I have no idea what works best for the new algorithm. For speech, quite fast was useful.

yara-blue · 2026-05-11T18:55:30Z

+/// It provides a good balance between speed and accuracy, resulting in
+/// faster benchmark times compared to the standard `exp` function.
+#[inline]
+fn fast_exp(x: Float) -> Float {


go ahead seems like a good addition!

roderickvd · 2026-05-11T19:15:59Z

> The Libopus decoder can output samples above 1.0, such as 1.1, 1.064, and similar values, for both RMS and peak readings depending on the track. This behaviour is not observed with the FLAC decoder.

> These out-of-range samples cause errors downstream, particularly when offsetting the current gain below 1.0 while targeting 1.0. I've added .min(1.0) to ensure the gain never exceeds the cap/limit for RMS and peak.

Values outside of -1.0..=1.0 aren't out-of-range for a DSP pipeline. Strange as it may be from libopus itself, the beauty of working in normalized floating point is that it never clips until it's finally converted to integer. A chain of Rodio filters itself could also return values > 1.0 even if the decoder wouldn't.

Long story short, we should deal with such values without clipping them.

UnknownSuperficialNight · 2026-05-12T02:29:23Z

> The Libopus decoder can output samples above 1.0, such as 1.1, 1.064, and similar values, for both RMS and peak readings depending on the track. This behaviour is not observed with the FLAC decoder.
> These out-of-range samples cause errors downstream, particularly when offsetting the current gain below 1.0 while targeting 1.0. I've added .min(1.0) to ensure the gain never exceeds the cap/limit for RMS and peak.

> Values outside of -1.0..=1.0 aren't out-of-range for a DSP pipeline. Strange as it may be from libopus itself, the beauty of working in normalized floating point is that it never clips until it's finally converted to integer. A chain of Rodio filters itself could also return values > 1.0 even if the decoder wouldn't.

> Long story short, we should deal with such values without clipping them.

Any ideas on this?

The first thing that comes to mind, though I could be wrong, is something like self.peak_level.max(1.0), computed and stored per sample, basically like this:

let full_scale = self.peak_level.max(1.0);

// Calculate max gain change per sample based on dynamic attack/release times
let max_attack_gain_change_per_sample = full_scale / (dynamic_attack_time * sample_rate);
let max_release_gain_change_per_sample = full_scale / (release_duration * sample_rate);

Basically go through and compute a new max per sample and scale for that.

Just throwing ideas out there.

We would probably have to go through it all again and remove the 1.0 assumption.
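A self-contained sketch of the per-sample rate limit discussed in this comment. It assumes that the attack time limits gain increases and the release time limits gain decreases, and that times are given in seconds. The function name, parameter list, and the guards against degenerate inputs are assumptions rather than the PR's code.

```rust
/// Hypothetical rate limiter: moves `current_gain` toward `desired_gain`
/// by at most `full_scale / (time * sample_rate)` per sample.
fn rate_limited_gain(
    current_gain: f32,
    desired_gain: f32,
    full_scale: f32,
    attack_secs: f32,
    release_secs: f32,
    sample_rate: f32,
) -> f32 {
    // Fall back to 1.0 if the scale is unusable, so `clamp` never sees NaN.
    let scale = if full_scale.is_finite() && full_scale > 0.0 { full_scale } else { 1.0 };
    // `.max(1.0)` keeps the divisor at one sample or more.
    let max_up = scale / (attack_secs * sample_rate).max(1.0);
    let max_down = scale / (release_secs * sample_rate).max(1.0);
    let delta = (desired_gain - current_gain).clamp(-max_down, max_up);
    current_gain + delta
}
```

With the defaults mentioned in this PR (500ms attack, 0.5ms release) at 48kHz, this allows the gain to rise by about 0.00004 per sample and fall by about 0.04 per sample when `full_scale` is 1.0.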

yara-blue · 2026-05-12T19:43:54Z

> Any ideas on this?

> Long story short, we should deal with such values without clipping them.

AGC maps input to the range [-1.0, 1.0]. To do so without clipping, it needs the width of the input range. It can't look ahead to see what other samples will be emitted and thus what the peak is. All I can think of is to assume the input range to be unrealistically big, say: [-1.5, 1.5]? Is that unrealistically big?

> Basically go through and compute a new max per sample and scale for that.

Let's look at an extreme: what would happen if, halfway through playback, one single sample peaks really high, let's say 10.0? Would everything get quieter after that sample?

roderickvd · 2026-05-12T20:25:29Z

Please excuse me responding a bit theoretically without recent study of the current implementation:

The gain calculation should fundamentally be the ratio target / measured, regardless of whether the peak is 0.8, 1.0, or 1.1. No fixed ceiling should be needed. Instead of clipping with .min(1.0) we should track the true peak. If the peak is 1.1, the AGC should respond with a gain below 1.0.

If the root cause is that the code assumes 1.0 as ceiling, then ideally we should remove those assumptions.
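The ratio described above might look like this in isolation. It is a sketch, not the current implementation: the function name is made up, the `max_gain` cap stands in for a setting such as `absolute_max_gain`, and the silence fallback follows the `current_gain` fallback from this PR's commits.

```rust
/// Hypothetical gain computation with no fixed 1.0 ceiling on the
/// measured level: a measured peak of 1.1 simply yields a gain below 1.0.
fn desired_gain(target_level: f32, measured_level: f32, current_gain: f32, max_gain: f32) -> f32 {
    if measured_level.is_finite() && measured_level > 0.0 {
        // Only the upper bound is capped, to avoid boosting near-silence.
        (target_level / measured_level).min(max_gain)
    } else {
        // Silence or invalid measurement: hold the current gain.
        current_gain
    }
}
```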

UnknownSuperficialNight · 2026-05-13T10:46:50Z

> Any ideas on this?

> Long story short, we should deal with such values without clipping them.

> AGC maps input to the range [-1.0, 1.0]. To do so without clipping it needs the width of the input range. It can't look ahead to see what other samples will be emitted and thus what the peak is. All I can think of is to assume the input range to be unrealistically big, say: [-1.5, 1.5]? Is that unrealistically big?

I was thinking about a running maximum where we track each sample, and if we get a sample that exceeds it, we replace the old running maximum with the new one. That was my original idea anyway.

> Basically go through and compute a new max per sample and scale for that.

> Lets look at some extreme, what would happen if halfway through playback one single sample peaks really high, lets say 10.0? Would everything get quieter after that sample?

Possibly. I guess that would depend on the implementation; maybe there is a peak decay after n samples, etc.

Though neither of these seems like a proper solution, I must admit.
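To make the trade-off concrete, here is a sketch of a running maximum with an exponential decay. The type name `DecayingPeak` and the per-sample decay factor are hypothetical. With a decay of 1.0 it behaves as a pure running maximum, so a single 10.0 sample would keep everything quieter for the rest of playback; with a decay below 1.0 the tracked peak gradually falls back to the actual signal level.

```rust
/// Hypothetical running maximum that decays by a fixed factor per sample.
struct DecayingPeak {
    peak: f32,
    decay_per_sample: f32,
}

impl DecayingPeak {
    /// `decay_per_sample` is expected to be in (0.0, 1.0]; 1.0 disables decay.
    fn new(decay_per_sample: f32) -> Self {
        Self { peak: 0.0, decay_per_sample }
    }

    /// Feeds one sample and returns the current tracked peak.
    fn update(&mut self, sample: f32) -> f32 {
        let magnitude = sample.abs();
        let decayed = self.peak * self.decay_per_sample;
        // Ignore non-finite samples so one bad value cannot poison the state.
        self.peak = if magnitude.is_finite() { decayed.max(magnitude) } else { decayed };
        self.peak
    }
}
```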

UnknownSuperficialNight · 2026-05-13T10:57:17Z

> Please excuse me responding a bit theoretically without recent study of the current implementation:

> The gain calculation should fundamentally be the ratio target / measured, regardless of whether the peak is 0.8, 1.0, or 1.1. No fixed ceiling should be needed. Instead of clipping with .min(1.0) we should track the true peak. If the peak is 1.1, the AGC should respond with a gain below 1.0.

This is what happens without the min(1.0): when the RMS or peak goes over 1.0, the AGC gain dips below 1.0 and/or there are sudden drops in volume.

This could be an issue, though, as we could get, for example, dips to 0.7 gain and dropping volume.

My approach was to limit the gain to 1.0/target by default. In other words, if the level is at or very close to the maximum, the gain should stay around the source level so that the sound is the same as the original. Then, if the input clips, the AGC output clips too (it stays at the source gain), and if the input does not clip, the AGC does not clip. In other words, it defaults to what the original sound was.

One thing we could do here is remove the min(1.0) and let the gain fall below 1.0 if the calculated RMS or peak goes above 1.0, but by default use the limiter to limit the gain to 1.0. That way, people who want the gain to drop below 1.0 can set the limiter to 0.0, while by default it stops at the source level, i.e. 1.0.

> If the root cause is that the code assumes 1.0 as ceiling, then ideally we should remove those assumptions.

That would be ideal. However, how would we scale the peak and RMS then? There needs to be a scale somewhere. I guess a true peak would be it, but that would only really be possible with a running maximum, as we cannot pre-process the file to find a true peak, nor can we look ahead with a look-ahead buffer.

UnknownSuperficialNight added 5 commits May 1, 2026 13:46

fix: Use current_gain as fallback during silence in AGC rms_gain …

5e1a869

…calculation
- Replace hardcoded `1.0` fallback with `self.current_gain` when `RMS` equals `0.0`
- Add comment explaining this keeps gain stable or allows gradual decay instead of sudden drops

fix: clamp peak_level to 1.0 to prevent decoder artifacts

233ba47

- Cap peak tracking at 1.0 to handle out-of-bounds decoder samples
- Ensure samples from decoders that are not normalised like `libopus` do not track out-of-bounds values

fix: clamp rms to 1.0 to prevent decoder artifacts

c502095

- Cap rms tracking at 1.0 to handle out-of-bounds decoder samples
- Ensure samples from decoders that are not normalised like `libopus` do not track out-of-bounds values

chore: Increase RMS_WINDOW_SIZE to 1024 samples

5b256a1

- Change `RMS_WINDOW_SIZE` constant from `512` to `1024`
- 1024 samples provides ~23ms window at 44.1kHz / ~21ms at 48kHz for stable RMS estimation
