Librosa-powered utilities for consistent audio feature extraction, batch processing, and quick visualization. Designed to standardize parameters across datasets while keeping the workflow Pythonic and notebook-friendly.
- Standardized sample rate, frame, and hop settings for reproducible experiments
- Single-file and batch pipelines with optional preemphasis, bandpass filtering, and onset trimming
- Rich feature coverage (spectral, chroma, MFCC, RMS, ZCR, tonnetz, polynomial features)
- Pandas-friendly outputs for downstream ML tasks
- Lightweight matplotlib hooks to inspect extracted features
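
To make the first point concrete, the sketch below shows how a fixed sample rate, frame length, and hop size determine the frame count and time resolution of every extracted feature matrix. It assumes that `hop_ratio` is the frame length divided by the hop length (so `frame_length=1024` with `hop_ratio=4` gives a 256-sample hop) and that frames are centered, as in librosa's defaults. Both are assumptions rather than confirmed behavior of this package, and `frame_grid` is a hypothetical helper.

```python
def frame_grid(n_samples, sr=22050, frame_length=1024, hop_ratio=4):
    """Summarize the frame grid implied by a set of analysis parameters."""
    # Assumption: the hop length is derived as frame_length // hop_ratio.
    hop_length = frame_length // hop_ratio
    # With centered (padded) framing, the frame count is 1 + n_samples // hop_length.
    n_frames = 1 + n_samples // hop_length
    return {
        "hop_length": hop_length,
        "n_frames": n_frames,
        "frame_ms": 1000.0 * frame_length / sr,  # duration of one analysis frame
        "hop_ms": 1000.0 * hop_length / sr,      # time step between frames
    }


grid = frame_grid(n_samples=22050)  # one second of audio at 22050 Hz
print(grid)
```

Keeping these three values fixed across a dataset means every file yields features on the same time grid, which is what makes results comparable between runs.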

Project layout:
- `afe/AudioFeatureExtractor.py`: Core wrappers around `librosa` feature extractors
- `afe/BatchExtractor.py`: Batch pipeline with metadata indexing and merge/flatten helpers
- `afe/FeatureVisualizer.py`: Minimal plots for spectrograms, Mel spectrograms, and chromagrams
- `afe/__init__.py`: Module exports
- `example.wav`: Sample audio for quick smoke tests
- `README.md`, `LICENSE`: Documentation and licensing

- Create and activate a virtual environment (recommended):
python -m venv .venv
# Windows activation; on macOS/Linux use: source .venv/bin/activate
.\.venv\Scripts\activate
- Install runtime dependencies:
pip install numpy pandas librosa matplotlib
from afe import AudioFeatureExtractor
afe = AudioFeatureExtractor(sr=22050, frame_length=1024, hop_ratio=4)
audio = afe.get_audio("example.wav")
mfcc = afe.extract_mfcc(audio, n_mfcc=12)
rms = afe.extract_rms(audio)
stft = afe.extract_stft(audio)

`BatchExtractor` expects an index with a `file_name` column whose entries are paths relative to `audio_folder`.
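
One way to build the list of names for such an index is to scan the audio folder. The helper below is a minimal sketch using only the standard library; `list_audio_files` and its `extensions` parameter are hypothetical names and not part of this package. It scans only the top level of the folder, since support for nested paths in `BatchExtractor` is not confirmed.

```python
from pathlib import Path


def list_audio_files(audio_folder, extensions=(".wav",)):
    """Return sorted file names in audio_folder that have an audio extension."""
    root = Path(audio_folder)
    names = [
        p.name  # the name alone is already relative to audio_folder
        for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    ]
    return sorted(names)
```

The returned list can be passed straight into the index, for example `pd.DataFrame({"file_name": list_audio_files(".")})`.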
import pandas as pd
from afe import BatchExtractor
index = pd.DataFrame({"file_name": ["example.wav"]})
batch = BatchExtractor(
audio_folder=".",
audio_index=index,
preemphasis=True,
pre_coef=0.97,
bp_filter=False,
)
batch.batch_extract_and_merge(
extraction_methods=["melspec", "mfcc", "zcr"],
results_folder="feature_extraction/",
)

flat = batch.merge_and_flatten_features(
extraction_methods=["mfcc", "zcr"],
results_folder="feature_extraction/",
label=False,
)
print(flat.head())

from afe import FeatureVisualizer
viz = FeatureVisualizer(feature_folder="feature_extraction/")
viz.plot_melspec("example")      # expects example_melspec_features.csv
viz.plot_chromagram("example")   # expects example_cstft_features.csv
viz.plot_spectrogram("example")  # expects example_stft_features.csv

| Abbrev | Method |
|---|---|
| stft | AudioFeatureExtractor.extract_stft |
| cqt | AudioFeatureExtractor.extract_cqt |
| cstft | AudioFeatureExtractor.extract_chroma_stft |
| ccqt | AudioFeatureExtractor.extract_chroma_cqt |
| ccens | AudioFeatureExtractor.extract_chroma_cens |
| melspec | AudioFeatureExtractor.extract_melspectrogram |
| mfcc | AudioFeatureExtractor.extract_mfcc |
| rms | AudioFeatureExtractor.extract_rms |
| zcr | AudioFeatureExtractor.extract_zero_crossing_rate |
| centroid | AudioFeatureExtractor.extract_spectral_centroid |
| bandwidth | AudioFeatureExtractor.extract_spectral_bandwidth |
| contrast | AudioFeatureExtractor.extract_spectral_contrast |
| flatness | AudioFeatureExtractor.extract_spectral_flatness |
| rolloff | AudioFeatureExtractor.extract_spectral_rolloff |
| tonnetz | AudioFeatureExtractor.extract_tonnetz |
| poly | AudioFeatureExtractor.extract_poly_features |
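
The table above can be expressed as a lookup that validates abbreviations before any audio is processed. This is a sketch of one possible dispatch scheme, not the package's actual implementation: `FEATURE_METHODS`, `resolve_methods`, and `feature_csv_name` are hypothetical names, and the file-name pattern is inferred from the visualizer comments (`example_melspec_features.csv`).

```python
# Abbreviation -> AudioFeatureExtractor method name, copied from the table above.
FEATURE_METHODS = {
    "stft": "extract_stft",
    "cqt": "extract_cqt",
    "cstft": "extract_chroma_stft",
    "ccqt": "extract_chroma_cqt",
    "ccens": "extract_chroma_cens",
    "melspec": "extract_melspectrogram",
    "mfcc": "extract_mfcc",
    "rms": "extract_rms",
    "zcr": "extract_zero_crossing_rate",
    "centroid": "extract_spectral_centroid",
    "bandwidth": "extract_spectral_bandwidth",
    "contrast": "extract_spectral_contrast",
    "flatness": "extract_spectral_flatness",
    "rolloff": "extract_spectral_rolloff",
    "tonnetz": "extract_tonnetz",
    "poly": "extract_poly_features",
}


def resolve_methods(extractor, abbreviations):
    """Map each abbreviation to the bound method on extractor, failing early on typos."""
    unknown = [a for a in abbreviations if a not in FEATURE_METHODS]
    if unknown:
        raise ValueError(f"Unknown feature abbreviations: {unknown}")
    return {a: getattr(extractor, FEATURE_METHODS[a]) for a in abbreviations}


def feature_csv_name(stem, abbreviation):
    """Build the CSV file name that the visualizer comments suggest for a feature."""
    return f"{stem}_{abbreviation}_features.csv"
```

Checking the whole list up front means a misspelled entry in `extraction_methods` fails immediately instead of partway through a long batch run.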

Optional preprocessing steps:
- `preemphasis`: First-order differencing to emphasize higher frequencies
- `bp_filter`: Hard bandpass on the STFT using `fmin`/`fmax`
- `trim`: Onset-based trimming of the leading silence

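
The first two steps can be sketched with NumPy alone. This is an illustration of the underlying operations, not the package's code: `preemphasize` and `bandpass_mask` are hypothetical names, the first sample is passed through unchanged (librosa's own `preemphasis` initializes its filter state differently), and the spectrum is assumed to have shape `(1 + n_fft // 2, n_frames)`. Onset-based trimming is left out because it depends on an onset detector.

```python
import numpy as np


def preemphasize(y, coef=0.97):
    """First-order pre-emphasis: out[n] = y[n] - coef * y[n - 1]."""
    y = np.asarray(y, dtype=float)
    if y.size == 0:
        return y
    out = np.empty_like(y)
    out[0] = y[0]                    # no previous sample, so keep it as-is
    out[1:] = y[1:] - coef * y[:-1]  # attenuates slowly varying (low-frequency) content
    return out


def bandpass_mask(spectrum, sr, n_fft, fmin, fmax):
    """Zero every STFT bin whose center frequency lies outside [fmin, fmax]."""
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)  # bin center frequencies in Hz
    keep = (freqs >= fmin) & (freqs <= fmax)
    return spectrum * keep[:, np.newaxis]       # broadcast the mask over frames
```

A hard mask like this is simple and predictable, at the cost of sharp band edges; a tapered mask would be a gentler alternative.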

Planned additions:
- Tempo-related feature extractors and deltas
- Feature inversion utilities (MFCC/Mel back to audio)
- Additional visualization presets and notebooks
See LICENSE for details. This project is released under a permissive license to encourage reuse and extension.
Maintained by Apurba Kumar Show, IIT Kharagpur.