In-depth technical documentation for audio engineers and developers.
The application uses Tone.js (a Web Audio API wrapper) to implement a professional audio processing chain. The key innovation is using Tone.Offline for faster-than-real-time rendering, allowing users to download processed files without waiting for playback duration.
Input File (WAV/MP3)
↓
Decode to AudioBuffer
↓
Tone.Offline Rendering Context
↓
High-Pass Filter (30Hz)
↓
Multiband Compressor (Dynamic Rumble Control)
↓
Limiter (-0.5dB)
↓
Rendered AudioBuffer
↓
Post-Process Normalization (-1.0dB)
↓
Convert to WAV Blob
↓
Download / Playback
Purpose: Remove imperceptible sub-bass frequencies that waste headroom and cause muddiness.
Implementation:
const highPass = new Tone.Filter({
  type: 'highpass',
  frequency: 30, // Hz
  rolloff: -24, // dB/octave
});
Parameters:
- Frequency: 30Hz (below what most playback systems reproduce, and close to the ~20Hz lower limit of human hearing)
- Rolloff: -24dB/octave (aggressive rolloff for clean removal)
- Q Factor: Default (Tone.js uses Q = 1, which is not a strict Butterworth alignment)
Effect: Attenuates content below 30Hz at 24dB per octave, preventing sub-bass from taking up valuable headroom in the final mix.
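As a rough illustration of what a -24dB/octave slope means in practice, the sketch below estimates attenuation below the cutoff from the asymptotic slope alone. The helper name `approxHighPassAttenuationDb` is hypothetical, and the approximation ignores how the filter behaves close to the cutoff, so the figures are ballpark values rather than the measured response of Tone.Filter.

```javascript
// Asymptotic estimate: each octave below the cutoff adds `rolloffDbPerOctave` dB.
// Hypothetical helper for illustration only; it is not part of Tone.js.
function approxHighPassAttenuationDb(freqHz, cutoffHz = 30, rolloffDbPerOctave = -24) {
  if (freqHz <= 0) throw new RangeError('freqHz must be positive');
  if (freqHz >= cutoffHz) return 0; // passband is treated as unaffected
  const octavesBelowCutoff = Math.log2(cutoffHz / freqHz);
  return rolloffDbPerOctave * octavesBelowCutoff;
}

console.log(approxHighPassAttenuationDb(15));  // one octave below 30Hz
console.log(approxHighPassAttenuationDb(7.5)); // two octaves below 30Hz
console.log(approxHighPassAttenuationDb(100)); // above the cutoff
```

Under this approximation, rumble at 15Hz is reduced by roughly 24dB and content at 7.5Hz by roughly 48dB, while anything above 30Hz is left alone.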
Purpose: Apply heavy compression only to low frequencies while leaving vocal frequencies transparent.
Implementation:
const multiband = new Tone.MultibandCompressor({
  low: {
    threshold: -24, // dB
    ratio: 4, // 4:1
    attack: 0.03, // seconds
    release: 0.25, // seconds
  },
  mid: {
    threshold: -6,
    ratio: 1.5,
    attack: 0.02,
    release: 0.15,
  },
  high: {
    threshold: -6,
    ratio: 1.5,
    attack: 0.02,
    release: 0.15,
  },
  lowFrequency: 60, // Hz
  highFrequency: 10000, // Hz
});
Band Split Points:
- Low Band: DC to 60Hz (heavy compression)
- Mid Band: 60Hz to 10kHz (light compression for presence)
- High Band: 10kHz to Nyquist (light compression for air)
Low Band Parameters (Critical for rumble control):
- Threshold: -24dB (catches even quiet rumble)
- Ratio: 4:1 (significant reduction)
- Attack: 30ms (fast enough to catch transients)
- Release: 250ms (smooth release to avoid pumping)
Why This Works: This simulates a "dynamic EQ" by compressing only the problematic frequency range when it gets too loud, rather than applying a static EQ cut that would affect all content equally.
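To make the threshold and ratio settings concrete, here is a minimal sketch of the steady-state gain computation for an idealized hard-knee compressor. It is an assumption-laden simplification: the helper `compressedLevelDb` is hypothetical, and it ignores the knee, attack, and release behaviour of the real compressor, so it only shows which bands react to a given input level.

```javascript
// Steady-state output level of an idealized hard-knee downward compressor.
// Hypothetical helper; the real compressor also has a knee, attack and release.
function compressedLevelDb(inputDb, thresholdDb, ratio) {
  if (inputDb <= thresholdDb) return inputDb; // below threshold: unchanged
  return thresholdDb + (inputDb - thresholdDb) / ratio;
}

// Threshold/ratio pairs taken from the low and mid band settings above.
const bands = {
  low: { threshold: -24, ratio: 4 },
  mid: { threshold: -6, ratio: 1.5 },
};

const inputDb = -12; // same level fed to both bands
for (const [name, { threshold, ratio }] of Object.entries(bands)) {
  const outputDb = compressedLevelDb(inputDb, threshold, ratio);
  console.log(`${name}: ${inputDb} dB in, ${outputDb} dB out, ${inputDb - outputDb} dB reduction`);
}
```

With a -12dB signal, the low band is pulled down to -21dB (9dB of gain reduction), while the mid band sits below its -6dB threshold and passes through untouched. That asymmetry is what keeps the vocal range transparent.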
Purpose: Catch peaks and prevent clipping after processing.
Implementation:
const limiter = new Tone.Limiter(-0.5); // dB
Parameters:
- Threshold: -0.5dB FS (0.5dB of headroom)
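- Look-Ahead: Not configurable (Tone.Limiter is built on a compressor with fast attack and release, not a dedicated look-ahead limiter)
- Attack/Release: Fixed internally (not exposed as options)
Effect: Near-transparent limiting that holds peaks close to -0.5dB. Because the limiter is compressor-based, very fast transients can still overshoot slightly; the normalization step that follows sets the final peak level.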
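Whether rendered audio actually stays under the ceiling can be checked by measuring its sample peak. The sketch below is one way to do that; `peakDbfs` is a hypothetical helper, and the arrays stand in for what `getChannelData()` would return from a real AudioBuffer.

```javascript
// Sample peak of planar audio data, in dB relative to full scale (1.0).
// Hypothetical helper; pass one Float32Array per channel.
function peakDbfs(channels) {
  let peak = 0;
  for (const samples of channels) {
    for (let i = 0; i < samples.length; i++) {
      const magnitude = Math.abs(samples[i]);
      if (magnitude > peak) peak = magnitude;
    }
  }
  return peak > 0 ? 20 * Math.log10(peak) : -Infinity; // silence has no finite peak
}

// Stand-in data; with a real AudioBuffer use buffer.getChannelData(ch).
const left = Float32Array.from([0.1, -0.5, 0.25]);
const right = Float32Array.from([0.05, 0.3, -0.2]);

const peak = peakDbfs([left, right]);
console.log(`${peak.toFixed(2)} dBFS`);
console.log(peak <= -0.5 ? 'under the -0.5 dB ceiling' : 'over the -0.5 dB ceiling');
```

Note that this measures sample peaks only. Inter-sample (true) peaks can be higher and need oversampling to detect.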
Purpose: Auto-leveling to ensure consistent loudness across different source files.
Implementation:
const normalizeAudioBuffer = (buffer) => {
  const numberOfChannels = buffer.numberOfChannels;
  const length = buffer.length;
  let maxPeak = 0;
  // Find absolute highest peak across all channels
  for (let channel = 0; channel < numberOfChannels; channel++) {
    const channelData = buffer.getChannelData(channel);
    for (let i = 0; i < length; i++) {
      const absSample = Math.abs(channelData[i]);
      if (absSample > maxPeak) {
        maxPeak = absSample;
      }
    }
  }
  // Calculate multiplier to reach -1.0 dB
  const targetAmplitude = 0.89; // ≈ -1.0 dB FS (20 * log10(0.89) ≈ -1.01)
  // Guard against silent input to avoid dividing by zero
  const multiplier = maxPeak > 0 ? targetAmplitude / maxPeak : 1;
  // Apply multiplier in place
  for (let channel = 0; channel < numberOfChannels; channel++) {
    const channelData = buffer.getChannelData(channel);
    for (let i = 0; i < length; i++) {
      channelData[i] *= multiplier;
    }
  }
  return buffer;
};
Target Level: -1.0 dB FS (0.89 linear amplitude)
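Why -1.0 dB?
- Safe headroom for codec encoding (MP3/AAC)
- Leaves margin for platforms such as YouTube/Instagram that re-encode uploads (peak normalization does not by itself meet a loudness target, which is measured in LUFS)
- Reduces the risk of inter-sample peaks clipping after lossy encoding
- A common peak ceiling in mastering practice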
Algorithm:
- Scan entire buffer to find highest peak
- Calculate scaling factor: target / currentPeak
- Multiply every sample by this factor
- Result: Peak of the loudest sample = -1.0 dB
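The relationship between the dB target and the linear multiplier can be worked through numerically. The sketch below uses two hypothetical helpers, `dbToLinear` and `linearToDb`, and an assumed source peak of 0.5 to show where a constant like 0.89 comes from and how much gain the normalization step would apply.

```javascript
// Convert between decibels (relative to full scale) and linear amplitude.
// Hypothetical helpers for illustration.
const dbToLinear = (db) => Math.pow(10, db / 20);
const linearToDb = (amplitude) => 20 * Math.log10(amplitude);

const target = dbToLinear(-1.0);          // linear amplitude for -1.0 dB FS
const currentPeak = 0.5;                  // assumed peak of the source (about -6 dB FS)
const multiplier = target / currentPeak;  // scaling factor from the algorithm above

console.log(target.toFixed(3));                 // 0.891
console.log(multiplier.toFixed(3));             // 1.783
console.log(linearToDb(multiplier).toFixed(2)); // gain applied, in dB
```

The same conversion gives the other target amplitudes used for tuning: for example, `dbToLinear(-3)` is about 0.708.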
Key Feature: Renders audio faster than real-time.
Implementation:
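const renderedBuffer = await Tone.Offline(() => {
  // player, highPass, multiband and limiter must be created inside this
  // callback so that they belong to the offline context.
  // Set up processing chain
  player.chain(highPass, multiband, limiter, Tone.getDestination());
  player.start(0);
}, duration);
Performance: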
- Real-time: 3-minute song = 3 minutes to process
- Tone.Offline: 3-minute song = ~60-90 seconds to process
- Speedup: ~2-3x faster than real-time on average hardware
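Why It's Fast:
- Rendering is not tied to the audio hardware clock, so the offline context processes as fast as the CPU allows
- No visualization rendering
- No UI updates during processing
- Optimized buffer operations
- Single-threaded but efficient Web Audio processing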
File Size Limits:
- Practical limit: ~100MB audio files
- Theoretical limit: Browser memory (typically 2-4GB)
Memory Usage:
- Input file: Loaded once into ArrayBuffer
- Decoded audio: Float32Array per channel
- Processing: Minimal additional memory (in-place where possible)
- Output: New AudioBuffer (same size as input)
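A back-of-the-envelope estimate helps relate these points to real file sizes. The sketch below assumes decoded audio is held as 32-bit floats (4 bytes per sample) and the output is a 16-bit WAV with a 44-byte header; `estimateMemoryBytes` is a hypothetical helper, and actual browser overhead will be higher.

```javascript
// Rough memory estimate for decoded audio and the 16-bit WAV output.
// Hypothetical helper; ignores browser and Tone.js overhead.
function estimateMemoryBytes(durationSeconds, sampleRate, numberOfChannels) {
  const frames = Math.round(durationSeconds * sampleRate);
  const decodedBytes = frames * numberOfChannels * 4; // Float32 = 4 bytes per sample
  const wavBytes = 44 + frames * numberOfChannels * 2; // header + 16-bit PCM
  return { frames, decodedBytes, wavBytes };
}

// A 3-minute stereo file at 44.1kHz.
const estimate = estimateMemoryBytes(180, 44100, 2);
console.log(`${(estimate.decodedBytes / 1e6).toFixed(1)} MB decoded`);
console.log(`${(estimate.wavBytes / 1e6).toFixed(1)} MB as 16-bit WAV`);
```

Because the rendered output is a second AudioBuffer of the same size, peak usage during processing is roughly twice the decoded figure plus the original compressed file.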
Cleanup:
audioContext.close(); // Free Web Audio resources
URL.revokeObjectURL(processedAudioUrl); // Free blob URL
Formats:
- WAV (PCM, 16/24/32-bit)
- MP3 (CBR/VBR, any bitrate)
- Other formats supported by browser (FLAC, OGG, etc.)
Sample Rates:
- Common: 44.1kHz, 48kHz
- High-res: 88.2kHz, 96kHz, 192kHz
- Automatically preserved (no resampling)
Format: WAV (PCM)
- Bit Depth: 16-bit signed integer
- Sample Rate: Same as input (no resampling)
- Channels: Same as input (mono/stereo)
- Byte Order: Little-endian (standard WAV)
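The output format described above can be sketched as a small, self-contained encoder. This is an illustrative version under stated assumptions, not necessarily the application's own code: the function name `encodeWav16`, the clamping of out-of-range samples, and the rounding of float samples to integers are all choices made for this example.

```javascript
// Encode planar Float32 channel data as a 16-bit PCM WAV file (little-endian).
// Hypothetical helper; pass one Float32Array per channel, all the same length.
function encodeWav16(channels, sampleRate) {
  const numberOfChannels = channels.length;
  const frames = channels[0].length;
  const bytesPerSample = 2;
  const blockAlign = numberOfChannels * bytesPerSample; // bytes per frame
  const dataLength = frames * blockAlign;               // PCM payload size
  const buffer = new ArrayBuffer(44 + dataLength);
  const view = new DataView(buffer);
  const writeString = (offset, text) => {
    for (let i = 0; i < text.length; i++) view.setUint8(offset + i, text.charCodeAt(i));
  };

  writeString(0, 'RIFF');
  view.setUint32(4, 36 + dataLength, true);          // RIFF chunk size (file size - 8)
  writeString(8, 'WAVE');
  writeString(12, 'fmt ');
  view.setUint32(16, 16, true);                      // format chunk size
  view.setUint16(20, 1, true);                       // 1 = PCM
  view.setUint16(22, numberOfChannels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * blockAlign, true); // byte rate
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, 16, true);                      // bits per sample
  writeString(36, 'data');
  view.setUint32(40, dataLength, true);

  // Interleave channels and convert [-1, 1] floats to signed 16-bit integers.
  let offset = 44;
  for (let i = 0; i < frames; i++) {
    for (let ch = 0; ch < numberOfChannels; ch++) {
      const sample = Math.max(-1, Math.min(1, channels[ch][i])); // clamp
      const scaled = sample < 0 ? sample * 0x8000 : sample * 0x7fff;
      view.setInt16(offset, Math.round(scaled), true);
      offset += bytesPerSample;
    }
  }
  return buffer;
}

// Three stereo frames: 44-byte header + 3 frames * 2 channels * 2 bytes.
const wav = encodeWav16(
  [Float32Array.from([0, 0.5, -1]), Float32Array.from([0, -0.5, 1])],
  44100
);
console.log(wav.byteLength); // 56
```

In the browser, the returned ArrayBuffer can be wrapped as `new Blob([wav], { type: 'audio/wav' })` for download or playback.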
WAV Header Implementation:
// RIFF Header
writeString(0, 'RIFF');
view.setUint32(4, 36 + length, true); // RIFF chunk size (file size minus 8 bytes)
writeString(8, 'WAVE');
// Format Chunk
writeString(12, 'fmt ');
view.setUint32(16, 16, true); // Format chunk size
view.setUint16(20, 1, true); // PCM format
view.setUint16(22, numberOfChannels, true);
view.setUint32(24, sampleRate, true);
view.setUint32(28, sampleRate * numberOfChannels * 2, true); // Byte rate
view.setUint16(32, numberOfChannels * 2, true); // Block align (bytes per frame)
view.setUint16(34, 16, true); // 16-bit
// Data Chunk
writeString(36, 'data');
view.setUint32(40, length, true); // Data chunk size in bytes
States:
'idle' // No file loaded or ready to process
'processing' // Audio is being processed
'ready' // Processed audio ready for playback/download
'playing' // Audio is currently playing
State Flow:
idle → [Upload File] → idle
idle → [Process] → processing → ready
ready → [Play] → playing → [End/Pause] → ready
ready → [Upload New] → idle
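The state flow above maps naturally onto a transition table. The sketch below is one possible encoding; the event names (`upload`, `process`, `done`, `play`, `end`, `pause`) and the `nextState` helper are hypothetical and only mirror the arrows listed above.

```javascript
// Transition table mirroring the state flow. Event names are hypothetical.
const TRANSITIONS = {
  idle: { upload: 'idle', process: 'processing' },
  processing: { done: 'ready' },
  ready: { play: 'playing', upload: 'idle' },
  playing: { end: 'ready', pause: 'ready' },
};

// Return the next state, or throw if the event is not allowed in this state.
function nextState(state, event) {
  const next = TRANSITIONS[state]?.[event];
  if (!next) throw new Error(`Invalid transition: ${state} + ${event}`);
  return next;
}

let state = 'idle';
for (const event of ['upload', 'process', 'done', 'play', 'pause']) {
  state = nextState(state, event);
}
console.log(state); // ready
```

Rejecting unknown transitions (for example, `play` while `processing`) keeps the UI from starting playback before a rendered buffer exists.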
Milestones:
- 10%: File loaded
- 30%: Audio decoded
- 40%: Offline rendering started
- 70%: Offline rendering complete
- 90%: Normalization complete
- 100%: WAV conversion complete
Note: Tone.Offline doesn't provide granular progress, so we use estimated milestones.
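One simple way to drive a progress bar from these estimated milestones is a lookup keyed by pipeline stage. In the sketch below, the stage names and the `createProgressReporter` helper are hypothetical; only the percentages come from the list above.

```javascript
// Percentages from the milestone list; stage names are hypothetical.
const MILESTONES = [
  ['fileLoaded', 10],
  ['audioDecoded', 30],
  ['renderStarted', 40],
  ['renderComplete', 70],
  ['normalized', 90],
  ['wavConverted', 100],
];

// Returns a function that reports the percentage for a stage via `onProgress`.
function createProgressReporter(onProgress) {
  const percentByStage = new Map(MILESTONES);
  let last = 0;
  return (stage) => {
    const percent = percentByStage.get(stage);
    if (percent === undefined) throw new Error(`Unknown stage: ${stage}`);
    last = Math.max(last, percent); // never move the bar backwards
    onProgress(last);
    return last;
  };
}

const report = createProgressReporter((percent) => console.log(`${percent}%`));
report('fileLoaded');
report('audioDecoded');
report('renderStarted');
```

Since the offline render accounts for the jump from 40% to 70% with no updates in between, a UI may want an indeterminate animation during that stage.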
More Aggressive Rumble Control:
low: {
  threshold: -30, // Lower threshold
  ratio: 6, // Higher ratio
  attack: 0.02, // Faster attack
}
Gentler Processing:
low: {
  threshold: -18, // Higher threshold
  ratio: 2, // Lower ratio
  release: 0.5, // Slower release
}
// Current: -1.0 dB (0.89)
const targetAmplitude = 0.89;
// Alternatives (use one in place of the line above):
// -0.5 dB (Louder, for streaming)
// const targetAmplitude = 0.944;
// -2.0 dB (Quieter, more conservative)
// const targetAmplitude = 0.794;
// -3.0 dB (Extra headroom, sometimes requested for broadcast delivery)
// const targetAmplitude = 0.708;
For true loudness metering (LUFS), you would need to:
- Implement BS.1770 K-weighting filter
- Calculate integrated loudness over entire file
- Normalize to target LUFS (e.g., -14 LUFS for streaming)
Note: This is complex and beyond the scope of this implementation, but possible with Web Audio API.
- Quiet vocal: Test normalization (should boost significantly)
- Loud vocal: Test limiter (should catch peaks)
- Bassy vocal: Test multiband compression (should tame lows)
- Clean vocal: Test transparency (should sound natural)
- No clipping: Check waveform peaks in DAW (should be at -1.0 dB)
- No pumping: Listen for artifacts in sustained notes
- Low-end control: Compare low frequencies before/after
- Transparency: A/B test against unprocessed (vocal should sound natural)
- Small file (30 seconds): Should process in ~10-15 seconds
- Medium file (3 minutes): Should process in ~60-90 seconds
- Large file (10 minutes): Should process in ~3-5 minutes
- High-Pass Filtering: Remove subsonic frequencies
- Dynamic Range Compression: Reduce level differences
- Multiband Processing: Frequency-selective processing
- Limiting: Prevent clipping with transparent gain reduction
- Normalization: Standardize peak levels
- AudioContext: Main audio processing context
- AudioBuffer: Container for decoded audio
- AudioNode: Building block of processing chain
- OfflineAudioContext: Non-real-time rendering
- "Mixing Secrets" by Mike Senior: Comprehensive mixing guide
- "Mastering Audio" by Bob Katz: Industry standard reference
- Web Audio API Spec: W3C documentation
- Web Audio API: Native browser audio processing
- Tone.js: High-level Web Audio wrapper
- AudioWorklet: Custom DSP in Web Audio (advanced)
- WebAssembly: High-performance audio processing
- Real-time preview: Process and preview while uploading
- Batch processing: Process multiple files at once
- Custom presets: Save/load processing settings
- A/B comparison: Toggle between original and processed
- Spectrum analyzer: Visual feedback during processing
- LUFS metering: True loudness normalization
- Export formats: MP3, OGG, FLAC export options
- Cloud processing: Offload to server for faster processing
- AudioWorklet: Move processing to separate thread
- Web Workers: Parallelize normalization/WAV conversion
- Progressive loading: Stream large files instead of loading entirely
- Waveform visualization: Canvas-based waveform display
- Undo/Redo: Processing history and rollback
✅ Good for:
- Podcast vocals
- YouTube voiceovers
- Social media content
- Quick vocal cleanup
- Batch processing similar content
❌ Not ideal for:
- Music mixing (too aggressive)
- Mastering (needs more control)
- Broadcast (requires LUFS compliance)
- Live performance (needs real-time processing)
Similar to:
- iZotope RX's Voice De-noise
- Adobe Podcast Enhance
- Descript Studio Sound
- Auphonic (but simpler)
Advantages:
- Free and open-source
- Runs entirely in browser
- No server round trip (files are processed locally, with no upload)
- Customizable code
Limitations:
- Less sophisticated algorithms
- No AI/ML enhancement
- Limited format support
- No spectral editing
License: MIT
Dependencies:
- React: MIT
- Tone.js: MIT
- Lucide React: ISC
- Tailwind CSS: MIT
Credits:
- Tone.js by Yotam Mann
- Web Audio API by W3C
- Audio engineering principles: Industry standard practices
Built with ❤️ for audio engineers and content creators