If you would like to use the audio augmentations, please install AugLy using the following command:
pip install augly[av]
This ensures that not only the base dependencies, but also the heavier dependencies required for audio & video processing, are installed.
If you have CUDA, you can also install AugLy with built-in GPU-accelerated audio augmentation support.
Check if your system has CUDA by running:
nvcc --version
Based on the compiler version, you can manually install cupy:
pip install cupy-cuda<version>
Note: Remove any periods from the version number.
AugLy will now automatically detect if cupy is available and use it for
GPU-accelerated audio augmentation.
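As a rough illustration of the steps above, the sketch below derives the cupy package name from the text printed by `nvcc --version` and checks whether a module can be imported. The helper names `cupy_package_for` and `is_importable` are hypothetical and are not part of AugLy, and the import check only approximates whatever detection AugLy performs internally. Wheel naming has also changed across CuPy releases (newer ones use names such as `cupy-cuda12x`), so confirm the exact package name in CuPy's installation guide before installing.

```python
import importlib.util
import re
from typing import Optional


def cupy_package_for(nvcc_output: str) -> Optional[str]:
    """Build a cupy package name from `nvcc --version` output (hypothetical helper)."""
    # nvcc prints a line such as "Cuda compilation tools, release 11.2, V11.2.152".
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    major, minor = match.groups()
    # Join major and minor without the period, as the note above describes.
    return f"cupy-cuda{major}{minor}"


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


sample_output = "Cuda compilation tools, release 11.2, V11.2.152"
print(cupy_package_for(sample_output))  # cupy-cuda112
print("cupy importable:", is_importable("cupy"))
```

If `is_importable("cupy")` reports `False` after installation, the wheel most likely does not match the CUDA version on the machine.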
Try running some AugLy audio augmentations in Colab! For a full list of available augmentations, see here.
Our audio augmentations use librosa, torchaudio, and NumPy as their backend. All functions accept an audio path or an audio array to be augmented as input and return the augmented audio array. If an output path is specified, the audio will also be saved to a file.
You can call the functional augmentations like so:
import augly.audio as audaugs
audio_path = "your_audio_path.flac"
output_path = "your_output_path.flac"
# Augmentation functions can accept audio paths as input and
# always return the resulting augmented audio array & sample rate
aug_audio, sample_rate = audaugs.change_volume(audio_path, volume_db=10.0)
# Augmentation functions can also accept np.ndarray as input
# (but then you have to provide the sample rate, too, since it
# won't be inferred when loading the audio file)
aug_audio, sample_rate = audaugs.low_pass_filter(
    aug_audio,
    sample_rate=sample_rate,
    cutoff_hz=500,
)
# If an output path is specified, the audio will also be saved to a file
aug_audio, sample_rate = audaugs.normalize(
    aug_audio,
    sample_rate=sample_rate,
    output_path=output_path,
)
You can also call any augmentation as a Transform class, including composing them together and applying them with a given probability:
import librosa

TRANSFORMS = audaugs.Compose([
    audaugs.Clip(duration_factor=0.25),
    audaugs.ChangeVolume(volume_db=10.0, p=0.5),
    audaugs.OneOf(
        [audaugs.Speed(factor=3.0), audaugs.TimeStretch(rate=3.0)]
    ),
])
# librosa.load returns both the audio array and its sample rate
audio_array, sample_rate = librosa.load("your_audio_path.flac", sr=None, mono=False)
# aug_audio is a NumPy array with your augmentations applied!
aug_audio, sample_rate = TRANSFORMS(audio_array, sample_rate=sample_rate)
If you have cloned augly (see here), you can run our audio unit tests by running the following:
python -m unittest discover -s augly/tests/audio_tests/ -p "*"
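To make the ideas behind the Transform class example more concrete (composition, the probability `p`, and picking one of several transforms), here is a minimal pure-Python sketch. It is not AugLy's implementation: the classes `WithProbability`, `OneOfSketch`, and `ComposeSketch` are hypothetical, they operate on plain lists of samples rather than NumPy arrays, and `OneOfSketch` picks uniformly, which may differ from how AugLy weights its choices.

```python
import random
from typing import Callable, List, Optional, Sequence

Samples = List[float]
Transform = Callable[[Samples], Samples]


class WithProbability:
    """Apply `fn` with probability `p`; otherwise return the input unchanged."""

    def __init__(self, fn: Transform, p: float = 1.0,
                 rng: Optional[random.Random] = None) -> None:
        self.fn = fn
        self.p = p
        self.rng = rng or random.Random()

    def __call__(self, samples: Samples) -> Samples:
        # random() lies in [0, 1), so p=1.0 always applies and p=0.0 never does.
        if self.rng.random() < self.p:
            return self.fn(samples)
        return samples


class OneOfSketch:
    """Apply exactly one transform, chosen uniformly at random (an assumption)."""

    def __init__(self, transforms: Sequence[Transform],
                 rng: Optional[random.Random] = None) -> None:
        self.transforms = list(transforms)
        self.rng = rng or random.Random()

    def __call__(self, samples: Samples) -> Samples:
        return self.rng.choice(self.transforms)(samples)


class ComposeSketch:
    """Apply each transform in order, feeding each output into the next."""

    def __init__(self, transforms: Sequence[Transform]) -> None:
        self.transforms = list(transforms)

    def __call__(self, samples: Samples) -> Samples:
        for transform in self.transforms:
            samples = transform(samples)
        return samples


def double_gain(samples: Samples) -> Samples:
    return [2.0 * s for s in samples]


def reverse(samples: Samples) -> Samples:
    return samples[::-1]


pipeline = ComposeSketch([
    WithProbability(double_gain, p=0.5),
    OneOfSketch([reverse, double_gain]),
])
print(pipeline([0.1, 0.2, 0.3]))
```

Because both the probability gate and the choice are random, repeated calls on the same input can return different results. Passing a seeded `random.Random` as `rng` makes runs reproducible.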