Description
I've been using ffmpeg-normalize (EBU R128 method) to normalize the audio of gameplay recordings. Typically the recordings have a peak and LUFS significantly lower than the target volume, and I use ffmpeg-normalize to boost the volume. Sometimes there's silence in the audio, like when the game is loading or paused.
When there are at least 2-3 seconds of silence at the beginning of the audio track, the result I get with ffmpeg-normalize has a lower-than-expected volume right after the silence, and then the volume gradually climbs toward the expected volume over a period of time.
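To check how much leading silence a given recording has, something like the sketch below may help. It assumes ffmpeg's silencedetect filter with a -50 dB noise floor and a 2-second minimum duration; both thresholds are guesses on my part, not values used by ffmpeg-normalize. The helper names are hypothetical.

```python
import re
import subprocess


def leading_silence_end(silencedetect_log, tolerance=0.05):
    """Return the end time (seconds) of a silence that starts at ~0, else None.

    `silencedetect_log` is the stderr text ffmpeg prints when the
    silencedetect filter is active.
    """
    starts = [float(x) for x in re.findall(r"silence_start:\s*(-?[\d.]+)", silencedetect_log)]
    ends = [float(x) for x in re.findall(r"silence_end:\s*(-?[\d.]+)", silencedetect_log)]
    # Only report silence that begins at (or very near) the start of the track.
    if starts and ends and starts[0] <= tolerance:
        return ends[0]
    return None


def detect_leading_silence(path, noise_db=-50, min_seconds=2.0):
    """Run ffmpeg with silencedetect and parse its log (thresholds are assumptions)."""
    cmd = [
        "ffmpeg", "-hide_banner", "-nostats", "-i", path,
        "-af", f"silencedetect=noise={noise_db}dB:d={min_seconds}",
        "-f", "null", "-",
    ]
    proc = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.PIPE, text=True)
    return leading_silence_end(proc.stderr)
```

Calling detect_leading_silence("original.aac") would return the time at which the initial silence ends, or None if the track does not start with silence.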
Here's an example. Waveform of original recording:
Zooming in on the original recording, to confirm that the volume is reasonably steady:
Normalization result, using ffmpeg-normalize.exe original.aac -nt ebu -t -14 -c:a aac -o normalized.aac (it takes roughly 90 seconds to climb to the volume I'd expect from normalization):
If I trim most of the silence off the start and then normalize, the volume seems to be fine throughout the track. I used ffmpeg -ss 11 -i original.aac -copyts trim_11.aac to trim, and ffmpeg-normalize.exe trim_11.aac -nt ebu -t -14 -c:a aac -o trim_11_normalized.aac to normalize:
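For anyone who wants to reproduce the trim-then-normalize workaround, here is a minimal Python sketch that builds the same two commands. The flags are exactly the ones quoted above; the function names are hypothetical, and the script assumes ffmpeg and ffmpeg-normalize are on PATH.

```python
import subprocess


def trim_cmd(src, dst, start_seconds):
    """Build the ffmpeg command that drops the first `start_seconds` of audio."""
    return ["ffmpeg", "-ss", str(start_seconds), "-i", src, "-copyts", dst]


def normalize_cmd(src, dst, target_lufs=-14):
    """Build the ffmpeg-normalize command (EBU R128, AAC output)."""
    return [
        "ffmpeg-normalize", src,
        "-nt", "ebu",
        "-t", str(target_lufs),
        "-c:a", "aac",
        "-o", dst,
    ]


def trim_then_normalize(src, trimmed, normalized, start_seconds):
    """Run both steps in order, stopping if either command fails."""
    for cmd in (trim_cmd(src, trimmed, start_seconds), normalize_cmd(trimmed, normalized)):
        subprocess.run(cmd, check=True)
```

With the files from this report, the call would be trim_then_normalize("original.aac", "trim_11.aac", "trim_11_normalized.aac", 11).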
Environment: Windows 10, Python 3.8, ffmpeg 4.3.2. I'm happy to provide audio uploads, stats, more details/examples, etc., but I thought I'd check first: is this expected behavior, or is there a tuning parameter I'm missing that would help?