A Swift package for detecting the tempo (BPM) of audio files using multi-band spectral flux analysis, FFT-based autocorrelation, and harmonic template matching. Built on the Accelerate framework with AVFoundation for audio decoding.
Supports any audio format readable by Core Audio and provides both a high-level async API with progress reporting and a low-level streaming API for real-time ingestion.
```swift
import SPFKTempo

let bpm = try await BpmAnalysis(url: audioFileURL).process()
print("Detected: \(bpm)") // "Detected: Bpm(120)"
```

Files shorter than 7.5 seconds (half the default `minimumDuration` of 15) are automatically looped in-memory to provide enough material for stable detection:
```swift
// Explicit minimum duration — loops the file if shorter than half this value
let bpm = try await BpmAnalysis(url: shortFileURL, options: .init(minimumDuration: 20)).process()

// Disable looping
let bpm = try await BpmAnalysis(url: audioFileURL, options: .init(minimumDuration: nil)).process()
```

By default, processing stops as soon as 3 consistent periodic estimates agree (±1 BPM).
Override `matchesRequired` or `tolerance` for stricter or looser consensus:
```swift
let options = BpmAnalysisOptions(
    matchesRequired: 5,      // require more agreement before stopping
    tolerance: 2.0,          // ±2 BPM counts as a match
    detection: .init(
        quality: .accurate,  // 75% window overlap
        bpmRange: 60 ... 200 // constrain search range
    )
)
let bpm = try await BpmAnalysis(url: audioFileURL, options: options).process()
```
```swift
// Process the entire file without early exit
let options = BpmAnalysisOptions(matchesRequired: nil)
let bpm = try await BpmAnalysis(url: audioFileURL, options: options).process()
```

```swift
let analysis = try BpmAnalysis(
    url: audioFileURL,
    options: .init(detection: .init(quality: .fast))
) { event in
    print("Progress: \(event.progress)")
}

let task = Task {
    try await analysis.process()
}

// Cancel after timeout
try await Task.sleep(for: .seconds(5))
task.cancel()
```

`BpmDetection` accepts raw samples directly for real-time or custom decoding pipelines:
```swift
let detector = BpmDetection(sampleRate: 48000, options: .init(quality: .balanced))

// Feed mono audio chunks as they arrive
for chunk in audioChunks {
    detector.process(chunk)
}

let bpm = detector.estimateTempo()        // top result
let candidates = detector.tempoCandidates // ranked alternatives

// Reuse for another signal
detector.reset()
```

| Quality | Overlap | Relative Speed | Best For |
|---|---|---|---|
| `.fast` | None (1x) | ~4x faster | Strong rhythmic material, long files |
| `.balanced` | 50% (2x) | ~2x faster | General-purpose default |
| `.accurate` | 75% (4x) | Baseline | Complex material, short files |
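The overlap levels translate directly into hop size: each step up in quality doubles the number of analysis frames per block. A minimal sketch of that relationship, using an illustrative `Quality` enum and block size rather than the package's actual internals:

```swift
// Illustrative only — the package derives hop sizes internally.
enum Quality { case fast, balanced, accurate }

func hopSize(blockSize: Int, quality: Quality) -> Int {
    switch quality {
    case .fast:     return blockSize     // no overlap: 1x frames
    case .balanced: return blockSize / 2 // 50% overlap: 2x frames
    case .accurate: return blockSize / 4 // 75% overlap: 4x frames
    }
}

print(hopSize(blockSize: 2048, quality: .balanced)) // 1024
```

Smaller hops mean more frames to analyze per second of audio, which is why `.accurate` runs at roughly 4x the cost of `.fast`.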
`BpmDetectionOptions` controls the algorithm behavior:

| Parameter | Default | Description |
|---|---|---|
| `quality` | `.balanced` | Window overlap level (see table above) |
| `bpmRange` | `40...300` | Valid tempo range — candidates outside are discarded |
| `beatsPerBar` | `4` | Beats per bar for comb-filter spacing (3 for waltz, etc.) |
| `perceptualWeightingAmount` | `0.0` | Mid-tempo bias strength (0.0 = neutral, 1.0 = full bias toward ~130 BPM) |
| `confidenceLevel` | `.moderate` | How aggressively to reject non-rhythmic audio (`.disabled`, `.low`, `.moderate`, `.high`) |
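The perceptual weighting can be pictured as a bell curve over tempo, blended in by `perceptualWeightingAmount`. A minimal sketch, assuming a Gaussian bias centered near 130 BPM (the exact curve the package uses is not specified here, and `perceptualWeight` is a hypothetical helper, not part of the API):

```swift
import Foundation

// Hypothetical weighting curve: 1.0 at the center tempo, falling off
// with distance. `amount` interpolates between no bias (0.0) and the
// full curve (1.0).
func perceptualWeight(bpm: Double, amount: Double,
                      center: Double = 130, width: Double = 60) -> Double {
    let gaussian = exp(-pow(bpm - center, 2) / (2 * width * width))
    return (1 - amount) + amount * gaussian
}

print(perceptualWeight(bpm: 130, amount: 1.0)) // 1.0 — center tempo keeps full weight
print(perceptualWeight(bpm: 130, amount: 0.0)) // 1.0 — bias disabled
print(perceptualWeight(bpm: 260, amount: 1.0)) // well below 1 — double-time candidate penalized
```

A bias like this helps break ties between a tempo and its half/double-time aliases, at the cost of occasionally mislabeling genuinely fast or slow material.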
`BpmAnalysisOptions` wraps file-level parameters and a nested `detection` property:

| Parameter | Default | Description |
|---|---|---|
| `bufferDuration` | `1.0` | Duration in seconds of each analysis chunk |
| `minimumDuration` | `15` | Files shorter than half this are looped in-memory; `nil` to disable |
| `matchesRequired` | `3` | Number of consistent periodic estimates for early exit; `nil` processes entire file |
| `tolerance` | `1` | BPM tolerance for consensus matching (±1 BPM accommodates lag quantization jitter) |
| `detection` | `.init()` | Nested `BpmDetectionOptions` (see table above) |
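Putting the two tables together, a configuration for 3/4-time material might look like the following. Parameter names follow the tables above, but the exact initializer signatures are an assumption:

```swift
import SPFKTempo

// Hypothetical configuration for waltz-time material; the nested
// detection options carry the bar-level comb spacing.
let options = BpmAnalysisOptions(
    matchesRequired: 3,
    tolerance: 1,
    detection: .init(
        quality: .balanced,
        bpmRange: 60 ... 180,
        beatsPerBar: 3 // comb-filter spacing for 3/4 time
    )
)
let bpm = try await BpmAnalysis(url: waltzFileURL, options: options).process()
```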
```
BpmAnalysis (actor)
|-- Wraps AVAudioFile + BpmDetection for file-level processing
|-- AudioFileScanner feeds audio buffers in chunks
|-- Periodic estimation with CountableResult consensus voting
|-- Early cancellation when matchesRequired is satisfied
|
v
BpmDetection (class)
|-- Streaming sample ingestion via process()
|-- Overlapping analysis frames (blockSize / quality)
|
|-- FourierFilterbank (x3: low, mid, high bands)
|   |-- Pre-computed windowed DFT basis (sin/cos tables)
|   |-- vDSP matrix-vector multiply for per-band magnitude spectra
|
|-- Spectral Flux
|   |-- Log-compressed positive flux per band
|   |-- RMS energy envelope as broadband signal
|   |-- Moving-average normalization to reduce loudness bias
|
|-- AutocorrelationFFT
|   |-- FFT -> |X|^2 -> IFFT autocorrelation
|   |-- Unity-normalized per-band, weighted sum across bands
|
|-- ACFCombFilter
|   |-- Comb filtering at beat/bar multiples
|   |-- Harmonic template scoring with octave-error penalties
|   |-- Optional perceptual weighting toward mid-tempo
|   |-- Parabolic interpolation for sub-lag peak refinement
|
v
Bpm (value type, typically 40-300 range)
```
- **Input** — Audio samples arrive via `process()` (streaming) or `estimateTempoOfSamples()` (batch)
- **Band decomposition** — Three `FourierFilterbank` instances extract magnitude spectra for low (0-550 Hz), mid (550-4000 Hz), and high (4-16 kHz) bands
- **Onset detection** — Positive spectral flux (log-compressed, half-wave rectified) per band, plus a broadband RMS envelope, normalized by a moving average window to reduce loudness bias
- **Periodicity analysis** — FFT-based autocorrelation, unity-normalized per band, then weighted-summed (low 1.0, mid 0.8, high 0.5, RMS 0.1)
- **Candidate scoring** — ACF comb filter at beat/bar multiples, blended with harmonic template matching (rewards correct harmonics, penalizes octave errors)
- **Peak refinement** — Parabolic interpolation for sub-lag precision, then conversion to BPM
- **Consensus** (`BpmAnalysis` only) — `CountableResult` collects periodic estimates and triggers early exit when enough agree
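The core of the pipeline — an onset envelope autocorrelated to find the beat period — can be illustrated with a toy example. This sketch is single-band and O(n²), not the package's multi-band, FFT-based implementation: it builds an impulse train at 120 BPM and recovers the tempo from the strongest autocorrelation lag inside the 40-300 BPM search range.

```swift
import Foundation

// Toy onset envelope: impulses every 500 samples at a 1 kHz envelope rate,
// i.e. a 120 BPM click track.
let envelopeRate = 1000.0
let beatPeriod = Int(envelopeRate * 60.0 / 120.0) // 500 samples
var envelope = [Double](repeating: 0, count: 8 * beatPeriod)
for i in stride(from: 0, to: envelope.count, by: beatPeriod) {
    envelope[i] = 1.0 // onset "spike"
}

// Direct autocorrelation over lags corresponding to 40-300 BPM
let minLag = Int(envelopeRate * 60.0 / 300.0) // 200
let maxLag = Int(envelopeRate * 60.0 / 40.0)  // 1500
var bestLag = minLag
var bestScore = -Double.infinity
for lag in minLag ... maxLag {
    var score = 0.0
    for i in 0 ..< (envelope.count - lag) {
        score += envelope[i] * envelope[i + lag]
    }
    if score > bestScore {
        bestScore = score
        bestLag = lag
    }
}

let bpm = 60.0 * envelopeRate / Double(bestLag)
print(bpm) // 120.0
```

The real pipeline improves on each step: spectral flux replaces the hand-built impulses, the FFT reduces autocorrelation to O(n log n), and comb filtering plus harmonic templates disambiguate the half/double-time lags that a raw peak pick would confuse.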
Any audio format readable by Core Audio's `AVAudioFile`, including WAV, AIF, FLAC, M4A, MP4, MP3, AAC, CAF, and OGG.
| Package | Purpose |
|---|---|
| spfk-audio-base | `Bpm`, `AudioFileScanner`, `CountableResult`, `URLProgressEvent` |
| spfk-testing | Test audio resources (test target only) |
- Platforms: macOS 13+, iOS 16+
- Swift: 6.2+
Spongefork (SPFK) is the home of Ryan Francesconi's personal software projects. His first application, Spongefork, dedicated to creative sound manipulation, was released in 1999 for macOS 8. From 2016 to 2025 he was the lead macOS developer at Audio Design Desk.