AudioBridge audio limiter by m08pvv · Pull Request #3593 · meetecho/janus-gateway

m08pvv · 2025-10-31T07:42:16Z

I've been working with the AudioBridge for a while, and it was painful to hear the crackling noises when multiple streams are summed up, so I decided to implement an audio limiter. I took AGC2 from Google's WebRTC (BSD 3-clause license, which allows modification and use as long as the license file stays in the repo), translated it to C (I haven't written C since school, so it may not be clean enough), adapted it for the AudioBridge, and added SSE 4.2 and AVX2 versions (the best one available is selected at runtime).

I also added a parameter to the AudioBridge: use_limiter (boolean, default=false), which enables the limiter. If the limiter is disabled, only clamping (saturation to int16 boundaries) is applied.

Here are some results with many participants speaking (5 participants speaking; the recording is taken from the incoming AB track):

Top - current Janus release with AudioBridge
Bottom - version with limiter and use_limiter = true
Full-spectrum lines - artifacts (crackling noise and clipping)

As you can see, the current Janus AB implementation (which just sums all tracks) produces artifacts (values outside the int16 range), whereas the limiter dynamically reduces the volume to a normal level.

How it works:

On AB initialization we set function references to the most efficient implementation available (scalar, SSE 4.2, AVX2).
For each segment of audio:
- If we mix 2 or more tracks (recording 2+ or talking 3+), calculate scaling factors
- If we record a mix of 2+ tracks, scale and clamp; otherwise just clamp
- For each participant:
  - If there are 2 or more tracks besides this participant's, scale and clamp; otherwise just clamp
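
The runtime selection in the first step could look roughly like the sketch below. It is illustrative only and not the code from this PR: `limiter_scale_fn`, `scale_scalar`, `scale_sse42`, `scale_avx2` and `select_scale_impl` are hypothetical names, the two SIMD variants are placeholders that simply call the scalar loop, and `__builtin_cpu_supports` is a GCC/Clang builtin that is only available on x86 targets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical signature: multiply each 32-bit mix sample by its own factor
 * and store the saturated result as int16 */
typedef void (*limiter_scale_fn)(const int32_t *in, const float *factors, int16_t *out, size_t n);

/* Scalar reference implementation */
static void scale_scalar(const int32_t *in, const float *factors, int16_t *out, size_t n) {
	assert(in != NULL && factors != NULL && out != NULL);
	for(size_t i = 0; i < n; i++) {
		float v = (float)in[i] * factors[i];
		if(v > 32767.0f)
			v = 32767.0f;
		else if(v < -32768.0f)
			v = -32768.0f;
		out[i] = (int16_t)v;
	}
}

#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
#define LIMITER_X86_DISPATCH 1
/* Placeholders: real variants would use SSE4.2/AVX2 intrinsics */
static void scale_sse42(const int32_t *in, const float *factors, int16_t *out, size_t n) {
	scale_scalar(in, factors, out, n);
}
static void scale_avx2(const int32_t *in, const float *factors, int16_t *out, size_t n) {
	scale_scalar(in, factors, out, n);
}
#endif

/* Pick the best available implementation once, at initialization time */
static limiter_scale_fn select_scale_impl(void) {
#ifdef LIMITER_X86_DISPATCH
	__builtin_cpu_init();
	if(__builtin_cpu_supports("avx2"))
		return scale_avx2;
	if(__builtin_cpu_supports("sse4.2"))
		return scale_sse42;
#endif
	return scale_scalar;
}
```

Resolving the pointer once keeps the per-frame hot path free of feature checks; on non-x86 builds the sketch falls back to the scalar loop.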

Usage:
By default the limiter is disabled and only clamping is applied.
Set the use_limiter property to true in the create AudioBridge request to enable the limiter.

Any suggestions are welcome!

lminiero · 2025-11-13T09:56:00Z

I'm trying to better understand the scope for this effort, and after giving it some thought, I think that this patch may be trying to do the right thing but in the wrong place.

If what we want is to prevent saturation on the mix itself, then the patch in #3601 is probably enough, but what we want a limiter to do is "limit" the contribution of each individual participant, especially noisy ones, which probably means the limiter code should be applied to the audio we decode from participants (so before the mix) rather than on the full mix itself. This would be in line with how the denoiser works, in fact: for denoising too, it's individual participants we denoise before mixing, rather than denoising a mix after the fact, so that we end up with a mix that is as clean as possible. By limiting individual participants, I assume we'd have more or less normalized contributions, and in case their sum gets too loud, #3601 kicks in to stay within the -0.9/0.9 limit. This would also allow us to selectively choose which participants should be limited, if any, with a room default that can be used in case no property is explicitly set for participants (again, which is how denoising works too).

Does this make sense? What do you think @m08pvv @whatisor ?

m08pvv · 2025-11-13T11:26:47Z

@lminiero in current master the resulting mix is a sum, so if we have 5 speakers (A, B, C, D, E) then:
recording is A+B+C+D+E
participant A receives B+C+D+E
...
participant E receives A+B+C+D

We assume that all tracks (A, B, C, D, E) do not exceed 100% volume (otherwise the audio is already broken).
The sum of 4 tracks will be in the range [0, 400%] of volume and the sum of 5 tracks in [0, 500%], so we need to reduce the resulting tracks by a factor of at least 4 and 5 respectively.
The main idea was that we can calculate one set of scale factors (for the whole mix, in this case A+B+C+D+E) and then apply it to each resulting track (A+B+C+D, ..., B+C+D+E), so that the AudioBridge does not introduce a x4-x5 gain when all participants are speaking.

Normalizing every track individually would be N times more CPU-intensive, and the volume of individual speakers is usually adjusted via volume_gain (at least I thought so).

So, if calculating N sets of factors and applying them to each track individually is OK (it shouldn't take much CPU, but it is a hot path in the AudioBridge), then I can do that. The sum of tracks should be normalized anyway, though: otherwise, even with 5 participants that are each in the range [0, 50%], the resulting tracks would be in [0, 200%] and [0, 250%] for participants and recording respectively.
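
To make the arithmetic above concrete, here is a small sketch of an N-1 mix for a single sample position. The function names are hypothetical and this is not the AudioBridge code; it only assumes int16 input samples summed into a 32-bit accumulator, as described in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum one sample from each of n participants into a 32-bit accumulator
 * (this is what the recording gets) */
static int32_t mix_all(const int16_t *samples, size_t n) {
	assert(samples != NULL);
	int32_t sum = 0;
	for(size_t i = 0; i < n; i++)
		sum += samples[i];
	return sum;
}

/* N-1 mix: what participant `self` receives (everyone but themselves) */
static int32_t mix_for(const int16_t *samples, size_t n, size_t self) {
	assert(self < n);
	return mix_all(samples, n) - samples[self];
}

/* Magnitude of a mix sample as a percentage of int16 full scale (32767) */
static int percent_of_full_scale(int32_t mix) {
	int64_t mag = mix < 0 ? -(int64_t)mix : (int64_t)mix;
	return (int)((mag * 100) / INT16_MAX);
}
```

With five participants all at full scale (32767), `mix_all` gives 163835 (500%) and `mix_for` gives 131068 (400%), both far outside what an int16 sample can hold.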

lminiero · 2025-11-13T11:38:17Z

Yeah I get all that, but my point is that if we have 5 participants whose sum gets to 500%, we already have that other PR that brings the mix down to 90% (instead of cutting to 100% as we did before), since we're using 32-bit samples for mixing so we're not actually losing anything. That should already be equivalent to a gain reduction in each sample. I'm personally more interested in limiting individual participants that may be too loud, as what's most boresome in my experience is not the actual sum of things, but the relative dominance of some speakers over others due to their potentially different individual gains.

lminiero · 2025-11-13T11:42:14Z

> Normalizing every track individually would be N times more CPU-intensive, and the volume of individual speakers is usually adjusted via volume_gain (at least I thought so).

It's true that we have the volume gain for that, but that's manual. I was thinking of something automated, which is what I assumed a limiter could bring to the table. That would pretty much be the same as audio compression (which is what I assume any limiter implements, more or less).

m08pvv · 2025-11-13T11:45:04Z

In Fix overflow mixing #3601 the sum is not scaled to 0.9, it's just clamped at that level, so if we have e.g. a sine wave, we will get almost a square wave (meander), which I guess is not what we want. So it also needs to be scaled somehow, before or after the sum.

lminiero · 2025-11-13T11:47:32Z

> even with 5 participants that are each in the range [0, 50%], the resulting tracks would be in [0, 200%] and [0, 250%] for participants and recording respectively.

I think this is where we may have a source of confusion. It's true that when you mix many contributions you can exceed 100%, but that's exactly why we use an int32_t for the mixing buffer: it allows a sum even if it goes beyond the limits, without losing anything. When we then snap those samples back to int16_t, the more basic code in that other PR brings it back to max 90% no matter what it was, which is more or less equivalent to a gain reduction. I guess a better way to do that would be to figure out by how much the loudest sample exceeds the limit, and apply the same division to all samples in those 20ms, but these are short enough snippets that I don't think it would matter much.

Anyway, I'll need to find some simple ways to test both patches, possibly using some automated participants in the AudioBridge that provide some talking. Once I do that I'll have a clearer picture of the whole thing.

lminiero · 2025-11-13T11:49:51Z

> In Fix overflow mixing #3601 the sum is not scaled to 0.9, it's just clamped at that level, so if we have e.g. a sine wave, we will get almost a square wave (meander), which I guess is not what we want. So it also needs to be scaled somehow, before or after the sum.

Yep, makes sense, which is what I mentioned in a subsequent comment too. I'll need to make some tests.

m08pvv · 2025-11-13T11:58:15Z

Just for visual representation of clipping:

If we just limit min and max values without actual scaling, we get clipping artifacts.
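
The difference between clamping and scaling can also be shown numerically. The sketch below is not code from either PR; the function names are hypothetical and a single constant gain stands in for whatever the limiter would compute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hard clip: saturate a 32-bit mix sample to the int16 range */
static int16_t clamp_sample(int32_t v) {
	if(v > INT16_MAX)
		return INT16_MAX;
	if(v < INT16_MIN)
		return INT16_MIN;
	return (int16_t)v;
}

/* Clamp a frame; returns how many samples were flattened at the rails */
static size_t clamp_frame(const int32_t *mix, int16_t *out, size_t n) {
	assert(mix != NULL && out != NULL);
	size_t clipped = 0;
	for(size_t i = 0; i < n; i++) {
		out[i] = clamp_sample(mix[i]);
		if(out[i] != mix[i])
			clipped++;
	}
	return clipped;
}

/* Scale a frame by one constant gain first, then clamp what is left */
static size_t scale_frame(const int32_t *mix, float gain, int16_t *out, size_t n) {
	assert(mix != NULL && out != NULL);
	size_t clipped = 0;
	for(size_t i = 0; i < n; i++) {
		int32_t scaled = (int32_t)((float)mix[i] * gain);
		out[i] = clamp_sample(scaled);
		if(out[i] != scaled)
			clipped++;
	}
	return clipped;
}
```

For the half-cycle `{0, 20000, 40000, 60000, 40000, 20000, 0}`, clamping flattens the three middle samples to 32767 (the flat top that makes a sine look square), while a gain of 0.5 keeps the shape as `{0, 10000, 20000, 30000, 20000, 10000, 0}`.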

m08pvv · 2025-11-14T07:38:36Z

@whatisor what rnnoise version do you use? I built from f6662ad41f5bf7bf244967a04e95334c81e5af4c with 55+MB rnnoise_data.tar.gz and I have zero clipping even using default demo's 16kHz audiobridge room.

lminiero · 2025-11-14T10:17:57Z

I still need to prepare a test with a few automated AudioBridge participants for testing purposes, but I was thinking that maybe we could achieve the same kind of result in a "simpler" way, and without the additional dependency:

- we mix as we do now (sum to an int32 array)
- once we have a sum of all participants, find the "loudest" sample, and see what negative gain needs to be applied to stay within the -0.9/0.9 range (if any: we may be in the range already)
- apply that gain to the mix and save it to a different array, that we'll use for all passive participants (non-talkers, recording, forwarders)
- for participants that are talking, as we do now, remove their contribution and do the same thing (find the loudest sample, find the gain, apply the gain to the mix for that participant)
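
A minimal sketch of the steps above, under stated assumptions: this is not the code from the follow-up PR, all function names are hypothetical, `limit` stands for the 0.9 ceiling expressed in int16 units (about 29490), and the caller provides a scratch buffer of one frame.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Gain (<= 1.0) that brings the loudest sample of the frame down to `limit` */
static float frame_gain(const int32_t *mix, size_t n, int32_t limit) {
	assert(mix != NULL && limit > 0 && limit <= INT16_MAX);
	int64_t peak = 0;
	for(size_t i = 0; i < n; i++) {
		int64_t mag = mix[i] < 0 ? -(int64_t)mix[i] : (int64_t)mix[i];
		if(mag > peak)
			peak = mag;
	}
	return peak > limit ? (float)limit / (float)peak : 1.0f;
}

/* Apply one gain to the whole frame; the clamp only guards against rounding */
static void apply_gain(const int32_t *mix, float gain, int32_t limit, int16_t *out, size_t n) {
	for(size_t i = 0; i < n; i++) {
		float v = (float)mix[i] * gain;
		if(v > (float)limit)
			v = (float)limit;
		else if(v < -(float)limit)
			v = -(float)limit;
		out[i] = (int16_t)v;
	}
}

/* Passive listeners, recording, forwarders: normalize the full mix */
static void normalize_full_mix(const int32_t *fullmix, int32_t limit, int16_t *out, size_t n) {
	apply_gain(fullmix, frame_gain(fullmix, n, limit), limit, out, n);
}

/* Talkers: remove their own contribution first, then normalize what is left */
static void normalize_for_talker(const int32_t *fullmix, const int16_t *own, int32_t limit,
		int32_t *scratch, int16_t *out, size_t n) {
	assert(own != NULL && scratch != NULL);
	for(size_t i = 0; i < n; i++)
		scratch[i] = fullmix[i] - own[i];
	apply_gain(scratch, frame_gain(scratch, n, limit), limit, out, n);
}
```

Because the gain is computed separately for each N-1 mix, a talker whose own removal already brings the mix under the limit gets it unscaled, at the cost of one extra peak search per talker.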

This way we don't need to worry about how many people are talking, or what's silence and what isn't, since we always get a mix that is loud enough for everyone involved, whether they're active or not. Most importantly, we don't needlessly bring the whole mix down any time more than one person talks: if I understood your patch correctly, it reduces the volume of active participants depending on how many there are, so with 4 people you decrease each of their volumes to ~25%, which will get the mix for non-active participants to ~100%, but active speakers will get a much quieter mix (75%, since their contribution will not be a part of that). Besides, silence detection as it's done is tricky, as simple background noise will not be detected as silence but will be low enough not to contribute anything meaningful, and will still bring the whole volume down (people talking will suddenly be quieter if someone's unmuted but not really saying anything).

Anyway, this was just me thinking of a possible alternative solution to a problem that does exist, but that's not to say that it will be THE solution. I'll need to set up a testbed to run some tests (with different fake people speaking, as me alone wouldn't cut it), and once I have it I can prepare a tentative patch that implements my idea and do some comparisons.

lminiero · 2025-11-14T12:16:53Z

@m08pvv @whatisor I prepared a first version of my proposed idea in the PR above. I did a few tests using a bunch of fake participants sending pre-recorded talking audio (plus my ugly and noisy voice) and it seems to be doing something: if I look at the recording, it seems to be doing its job, but of course I may be wrong. Considering both of you have made some tests with multiple participants, could you let me know if this more or less does what you need it to? Thanks!

m08pvv · 2025-11-19T09:02:01Z

This implementation (adapted from WebRTC) is an AGC (Automatic Gain Control): it uses piecewise linear interpolation with pre-calculated gain curves, providing precise control over different signal levels, and it includes attack/decay filtering that prevents abrupt gain changes, avoiding audible artifacts. Simple peak scaling can introduce pumping or breathing effects.
This limiter calculates per-sample scaling factors for smooth transitions, which is especially important during attack phases.
Yes, it uses sophisticated envelope detection with lookahead to anticipate signal peaks, but that allows for more graceful gain reduction compared to reactive peak detection. It also divides audio into sub-frames for more granular control, enabling better transient preservation and reduced distortion.
The gain curves are specifically designed to minimize harmonic distortion while maintaining signal integrity, based on extensive psychoacoustic research (done by WebRTC team).

And yes, we already run this code in production, and it sounds great in calls with 15 people.
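
To illustrate what "per-sample scaling factors for smooth transitions" means, here is a deliberately simplified sketch of linear per-sample gain interpolation. It is neither the AGC2 algorithm nor the code in this PR (there is no envelope detection, lookahead, or gain curve here), and `ramp_gain` and `apply_factors` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill `factors` with a linear ramp that starts just after prev_gain and
 * reaches target_gain exactly on the last sample of the (sub-)frame */
static void ramp_gain(float prev_gain, float target_gain, float *factors, size_t n) {
	assert(factors != NULL && n > 0);
	float step = (target_gain - prev_gain) / (float)n;
	for(size_t i = 0; i < n; i++)
		factors[i] = prev_gain + step * (float)(i + 1);
}

/* Apply the per-sample factors to a mix and saturate to int16 */
static void apply_factors(const int32_t *mix, const float *factors, int16_t *out, size_t n) {
	assert(mix != NULL && factors != NULL && out != NULL);
	for(size_t i = 0; i < n; i++) {
		float v = (float)mix[i] * factors[i];
		if(v > 32767.0f)
			v = 32767.0f;
		else if(v < -32768.0f)
			v = -32768.0f;
		out[i] = (int16_t)v;
	}
}
```

Going from a gain of 1.0 to 0.5 over four samples yields the factors 0.875, 0.75, 0.625, 0.5, whereas a single per-frame gain would jump from 1.0 to 0.5 at the frame boundary.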

whatisor · 2025-11-19T13:46:26Z

I am using https://github.com/xiph/rnnoise/releases/tag/v0.2

m08pvv · 2025-11-24T12:30:22Z

@whatisor, I tried 0.2 (with the ~21 MB model) and it sounds terrible. Try the latest master of rnnoise (it might fade in the beginnings of phrases, but at least it doesn't produce noise).

m08pvv · 2025-12-17T06:37:07Z

@lminiero, did you have a chance to look at the code?

lminiero · 2025-12-17T11:28:02Z

Not yet, sorry... if not in the next few days, I'll definitely make this a priority as soon as I get back to the office after the holidays.

m08pvv · 2026-01-20T06:18:36Z

@lminiero, any chance you could take a look at this PR?

lminiero · 2026-01-20T11:41:21Z

Ouch, I haven't checked it yet, apologies! I'm currently busy updating the MoQ library (a new version of the draft is out), I'll 100% have a deep dive right after FOSDEM next week. If I don't, feel free to shout at me 😅

lminiero

Apologies for the delay, post-FOSDEM flu knocked me out for a few days. I've done a first quick review of the code, but there's a few things I still don't understand. More details inline.

lminiero · 2026-02-06T10:43:21Z

src/plugins/audiobridge-deps/limiter/limiter.c

@@ -0,0 +1,855 @@
+/* The code in this file contains code adapted from WebRTC project. */


It may be helpful to link to exactly the source file you modified. If it's not a single file because it's spread in multiple ones, then maybe the folder, so that people (me included) have a reference.

I won't review the content of this file as it's ~1000 lines and I honestly don't know what it does.

Added link to the original repo and directory with AGC2 sources (it's distributed across several files)

lminiero · 2026-02-06T10:44:54Z

src/plugins/audiobridge-deps/limiter/limiter.h

@@ -0,0 +1,99 @@
+/* The code in this file contains code adapted from WebRTC project. */


lminiero · 2026-02-06T10:45:19Z

src/plugins/janus_audiobridge.c

 #ifdef HAVE_RNNOISE
 #include <rnnoise.h>
 #endif
+/* We ship our own version of  */


Of what? 😁

It was a tough day, I somehow missed it.
Fixed.

lminiero · 2026-02-06T11:42:30Z

src/plugins/janus_audiobridge.c

+			}
+		}
+		/* If we use limiter we should initialize it if we have more than 2 tracks mixed or we have more than 1 track mixed and recording */
+		if(audiobridge->use_limiter && (mix_count > 2 || (audiobridge->recording && mix_count > 1)))


Why more than two? Don't you risk clipping when there's just two very noisy speakers as well? I understand that the N-1 algorithm will get both speakers a "clean" mix (just the other one), but passive speakers will still get a sum of both.

Not sure why the recording makes any difference here. Also, why are RTP forwarders not considered, since just as recordings they are a mix of everyone else? Actually, with groups support, RTP forwarders can get even more complicated than that, since they can be a sub-mix of only selected participants.

Simplified condition, now it's just audiobridge->use_limiter && mix_count > 1

Regarding forwarders - I didn't work with them and I need a deeper dive into the code, will come back later.

Simplified the initialization logic a bit more and also replaced buffer with fullmix_buffer for forwarders

@lminiero could you please take a fresh look at whether I passed fullmix_buffer to the RTP forwarders correctly?

lminiero · 2026-02-06T11:43:51Z

src/plugins/janus_audiobridge.c

+				scale_buffer(buffer, samples, per_sample_scaling_factors, fullmix_buffer);
+			} else {
+				/* Otherwise just clamp values that are outside int16 boundaries */
+				clamp_buffer(buffer, samples, fullmix_buffer);


Why is clamping done only when there's silent participants? Isn't clamping what we always do now?

That's the initialization of fullmix_buffer, which is later used for all silent participants and contains a mix of all tracks.

m08pvv · 2026-03-03T12:29:05Z

Is there anything else I can do to help to complete this PR?

lminiero · 2026-03-04T09:07:15Z

Is there anything else I can do to help to complete this PR?

No need at the moment, thanks! The ball is in my court, so it's on me to find time to review this. I'll probably only be able to do that towards the end of the month, as we have the IETF meeting next week.

Apologies if this is taking a while, but this is a big and heavy change on the AudioBridge fundamentals. Considering we use the AB heavily in production ourselves, I can't risk merging something I'm not very familiar with, or I'm not on board 100%. I'll need time to properly and thoroughly review everything, as there were things I had doubts about.

val.v.petrov added 30 commits October 21, 2025 14:53

Add an audio limiter based on AGC2 from webrtc

324b618

fix clamping

f42cf45

include math.h

82b0cd3

added missing commas -_-'

f750159

Merge remote-tracking branch 'origin/master' into ab_limiter

e9bcc0a

fixing typos

7d76913

remove unused variable

5af1ec8

Added WebRTC license text (not sure if the place is correct)

aa0f276

Move LICENSE.webrtc next to audiobridge code

11c45d9

Adjust copyright comment for the code adapted from WebRTC project

1a0d93e

Merge remote-tracking branch 'origin/master' into ab_limiter

901a2a8

AST-35514

1ab2b5b

extract initialization into a separate function (force inline)

21603ca

extract some parts into a separate files, vectorize (AVX2, SSE4.2)

e75e9b0

fix compilation

2247042

Added info log about used version of limiter (AVX2/SSE4.2 or scalar)

5bef9c0

Adjust cpuid checks

7006830

Update limiter.h and limiter.c

f151694

use scale_buffer and clamp_buffer

bf81fe2

fix declarations

f531e0e

fix declarations

8f9dc93

Cleanup

c91af23

cleanup

89306ce

use int

e5e6f2c

add some hints for compiler

b621fc4

fixing compilation

2cc2e4c

trying to avoid a false UB warning in some builds

6278ec3

Declare envelope fixed size

f225554

Fixed sizes where possible

04a2013

avoid subtraction to not confuse gcc

9ebeac2

fix indentation

07442c9

lminiero added a commit that referenced this pull request Nov 14, 2025

Smoothen AudioBridge mix instead of clamping/truncating (see #3593 and #3601)

ab93ea0

lminiero mentioned this pull request Nov 14, 2025

Smoothen AudioBridge mix instead of clamping/truncating (see #3593 and #3601) #3610

Open

spscream pushed a commit to spscream/janus-gateway that referenced this pull request Nov 19, 2025

Smoothen AudioBridge mix instead of clamping/truncating (see meetecho#3593 and meetecho#3601)

ba04656

lminiero reviewed Feb 6, 2026

val.v.petrov added 5 commits February 11, 2026 10:17

Added links to repository and directory with C++ sources that were used to make audio limiter

bb8f45b

Merge remote-tracking branch 'origin/master' into ab_limiter

9883f20

Fix condition for limiter initialization

68c0a57

Simplified condition

569a388

Simplify initialization logic and use fullmix_buffer where needed

0e4f4a0
