🔬 Technical Documentation - Vocal Mixer Pro

In-depth technical documentation for audio engineers and developers.

🎚️ Audio Processing Architecture

Overview

The application uses Tone.js (a Web Audio API wrapper) to implement a professional audio processing chain. The key innovation is using Tone.Offline for faster-than-real-time rendering, allowing users to download processed files without waiting for playback duration.

Processing Flow

Input File (WAV/MP3)
    ↓
Decode to AudioBuffer
    ↓
Tone.Offline Rendering Context
    ↓
High-Pass Filter (30Hz)
    ↓
Multiband Compressor (Dynamic Rumble Control)
    ↓
Limiter (-0.5dB)
    ↓
Rendered AudioBuffer
    ↓
Post-Process Normalization (-1.0dB)
    ↓
Convert to WAV Blob
    ↓
Download / Playback

🎛️ Processing Chain Details

1. High-Pass Filter

Purpose: Remove sub-bass rumble that is largely inaudible yet wastes headroom and can muddy the low end.

Implementation:

const highPass = new Tone.Filter({
  type: 'highpass',
  frequency: 30,      // Hz
  rolloff: -24,       // dB/octave
});

Parameters:

Frequency: 30Hz (near the lower limit of human hearing, and below what most playback systems reproduce)
Rolloff: -24dB/octave (aggressive rolloff for clean removal)
Q Factor: Tone.js default (Q = 1)

Effect: Removes frequencies below 30Hz with steep slope, preventing sub-bass from taking up valuable headroom in the final mix.
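
To get a feel for what a -24dB/octave slope removes, the sketch below estimates attenuation below the cutoff from the slope alone. `estimateHighPassAttenuation` is a hypothetical helper, not part of Tone.js, and it ignores the filter's real shape near the cutoff, so treat the numbers as rough.

```javascript
// Rough estimate only: assumes the filter follows its asymptotic slope
// everywhere below cutoff, which is not accurate close to the cutoff itself.
const estimateHighPassAttenuation = (frequency, cutoff = 30, rolloff = -24) => {
  if (frequency >= cutoff) return 0; // passband: treated as unattenuated
  const octavesBelow = Math.log2(cutoff / frequency);
  return rolloff * octavesBelow; // negative value in dB
};

console.log(estimateHighPassAttenuation(15));  // one octave below cutoff: -24
console.log(estimateHighPassAttenuation(7.5)); // two octaves below cutoff: -48
console.log(estimateHighPassAttenuation(100)); // above cutoff: 0
```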

2. Multiband Compressor (Dynamic Rumble Control)

Purpose: Apply heavy compression only to low frequencies while leaving vocal frequencies transparent.

Implementation:

const multiband = new Tone.MultibandCompressor({
  low: {
    threshold: -24,    // dB
    ratio: 4,          // 4:1
    attack: 0.03,      // seconds
    release: 0.25,     // seconds
  },
  mid: {
    threshold: -6,
    ratio: 1.5,
    attack: 0.02,
    release: 0.15,
  },
  high: {
    threshold: -6,
    ratio: 1.5,
    attack: 0.02,
    release: 0.15,
  },
  lowFrequency: 60,       // Hz
  highFrequency: 10000,   // Hz
});

Band Split Points:

Low Band: DC to 60Hz (heavy compression)
Mid Band: 60Hz to 10kHz (light compression for presence)
High Band: 10kHz to Nyquist (light compression for air)

Low Band Parameters (Critical for rumble control):

Threshold: -24dB (catches even quiet rumble)
Ratio: 4:1 (significant reduction)
Attack: 30ms (roughly two cycles at 60Hz, so the compressor reacts to sustained rumble rather than to individual waveform cycles)
Release: 250ms (smooth release to avoid pumping)

Why This Works: This simulates a "dynamic EQ" by compressing only the problematic frequency range when it gets too loud, rather than applying a static EQ cut that would affect all content equally.
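
The per-band behaviour can be illustrated with a static compressor curve. This is a simplified sketch, not how Tone.js computes gain internally: it assumes a hard knee and ignores attack, release, knee, and any makeup gain. `compressedLevel` is a hypothetical helper name.

```javascript
// Static (steady-state) hard-knee compressor curve.
// Assumption: attack, release, knee, and makeup gain are all ignored.
const compressedLevel = (inputDb, threshold, ratio) => {
  if (inputDb <= threshold) return inputDb; // below threshold: unchanged
  return threshold + (inputDb - threshold) / ratio;
};

// Thresholds and ratios taken from the multiband settings above.
const bands = {
  low: { threshold: -24, ratio: 4 },
  mid: { threshold: -6, ratio: 1.5 },
};

// The same -12 dB signal level arriving in each band:
const lowOut = compressedLevel(-12, bands.low.threshold, bands.low.ratio);
const midOut = compressedLevel(-12, bands.mid.threshold, bands.mid.ratio);

console.log(lowOut); // -21 (9 dB of gain reduction)
console.log(midOut); // -12 (below threshold, untouched)
```

The same level is reduced by 9 dB in the low band and passes unchanged through the mid band, which is the frequency-selective behaviour described above.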

3. Safety Limiter

Purpose: Catch peaks and prevent clipping after processing.

Implementation:

const limiter = new Tone.Limiter(-0.5); // dB

Parameters:

Threshold: -0.5dB FS (0.5dB of headroom)
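Look-Ahead: Not configurable (handled internally by the underlying compressor)
Attack/Release: Fixed internally by Tone.Limiter (fast attack and release)

Effect: Holds peaks close to -0.5dB. Tone.Limiter is built on a compressor with a very high ratio rather than a true brickwall limiter, so brief overshoots are possible; the normalization step that follows sets the final peak level.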

4. Post-Process Normalization

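Purpose: Scale each file so its highest peak lands at the same level, giving consistent peak levels across different source files. This is peak normalization, not loudness normalization.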

Implementation:

const normalizeAudioBuffer = (buffer) => {
  const numberOfChannels = buffer.numberOfChannels;
  const length = buffer.length;
  let maxPeak = 0;

  // Find absolute highest peak
  for (let channel = 0; channel < numberOfChannels; channel++) {
    const channelData = buffer.getChannelData(channel);
    for (let i = 0; i < length; i++) {
      const absSample = Math.abs(channelData[i]);
      if (absSample > maxPeak) {
        maxPeak = absSample;
      }
    }
  }

  // Calculate multiplier to reach -1.0 dB
  const targetAmplitude = 0.89; // ≈ -1.0 dB (20*log10(0.89) ≈ -1.01)
  const multiplier = maxPeak > 0 ? targetAmplitude / maxPeak : 1;

  // Apply multiplier
  for (let channel = 0; channel < numberOfChannels; channel++) {
    const channelData = buffer.getChannelData(channel);
    for (let i = 0; i < length; i++) {
      channelData[i] *= multiplier;
    }
  }

  return buffer;
};

Target Level: -1.0 dB FS (0.89 linear amplitude)

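Why -1.0 dB?:

Leaves headroom for lossy codec encoding (MP3/AAC)
Matches the -1 dB peak ceiling commonly recommended for streaming and social platforms (peak level only; loudness targets are measured in LUFS)
Reduces the risk of inter-sample peaks clipping after lossy encoding
Common peak ceiling in mastering practice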

Algorithm:

Scan entire buffer to find highest peak
Calculate scaling factor: target / currentPeak
Multiply every sample by this factor
Result: Peak of the loudest sample = -1.0 dB

🚀 Performance Optimization

Tone.Offline Rendering

Key Feature: Renders audio faster than real-time.

Implementation:

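const renderedBuffer = await Tone.Offline(() => {
  // Nodes must be created in here to join the offline context. decodedBuffer and
  // the *Options objects are placeholders for the input and the settings above.
  const player = new Tone.Player(decodedBuffer);
  const highPass = new Tone.Filter(highPassOptions);
  const multiband = new Tone.MultibandCompressor(multibandOptions);
  const limiter = new Tone.Limiter(-0.5).toDestination();
  player.chain(highPass, multiband, limiter);
  player.start(0);
}, duration);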

Performance:

Real-time: 3-minute song = 3 minutes to process
Tone.Offline: 3-minute song = ~60-90 seconds to process
Speedup: ~2-3x faster than real-time on average hardware

Why It's Fast:

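Rendering is not tied to the audio hardware clock, so samples are computed as fast as the CPU allows
No visualization rendering
No UI updates during processing
Optimized buffer operations
Single-threaded but efficient Web Audio processing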

Memory Management

File Size Limits:

Practical limit: ~100MB audio files
Theoretical limit: Browser memory (typically 2-4GB)

Memory Usage:

Input file: Loaded once into ArrayBuffer
Decoded audio: Float32Array per channel
Processing: Minimal additional memory (in-place where possible)
Output: New AudioBuffer (same size as input)
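
As a rough guide to why large files strain memory, decoded size can be estimated from duration, sample rate, and channel count. The helper below is a hypothetical sketch; it counts only the raw Float32 sample data and ignores browser overhead.

```javascript
// Decoded audio is stored as 32-bit floats: 4 bytes per sample per channel.
const BYTES_PER_FLOAT32 = 4;

const estimateDecodedBytes = (durationSeconds, sampleRate, numberOfChannels) =>
  Math.ceil(durationSeconds * sampleRate) * numberOfChannels * BYTES_PER_FLOAT32;

// 3-minute stereo file at 44.1 kHz
const inputBytes = estimateDecodedBytes(180, 44100, 2);
console.log(inputBytes);                    // 63504000
console.log((inputBytes / 1e6).toFixed(1)); // 63.5 (MB, decoded input alone)
```

Since the rendered output is a second buffer of the same size, peak usage is at least double this figure, plus the original file bytes and the encoded WAV.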

Cleanup:

audioContext.close();  // Free Web Audio resources
URL.revokeObjectURL(processedAudioUrl);  // Free blob URL

📊 Audio Quality Specifications

Input Support

Formats:

WAV (PCM, 16/24/32-bit)
MP3 (CBR/VBR, any bitrate)
Other formats supported by browser (FLAC, OGG, etc.)

Sample Rates:

Common: 44.1kHz, 48kHz
High-res: 88.2kHz, 96kHz, 192kHz
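Preserved only if the decoding AudioContext and the Tone.Offline call use the source sample rate; by default, decodeAudioData resamples to the context's rate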

Output Specifications

Format: WAV (PCM)

Bit Depth: 16-bit signed integer
Sample Rate: Same as the rendered buffer (matches the input unless decoding resampled it)
Channels: Same as input (mono/stereo)
Byte Order: Little-endian (standard WAV)

WAV Header Implementation:

// RIFF Header
writeString(0, 'RIFF');
view.setUint32(4, 36 + length, true);  // RIFF chunk size: file size minus 8 bytes (length = data bytes)
writeString(8, 'WAVE');

// Format Chunk
writeString(12, 'fmt ');
view.setUint32(16, 16, true);          // Format chunk size
view.setUint16(20, 1, true);           // Audio format: 1 = PCM
view.setUint16(22, numberOfChannels, true);                  // Channel count
view.setUint32(24, sampleRate, true);                        // Sample rate (Hz)
view.setUint32(28, sampleRate * numberOfChannels * 2, true); // Byte rate
view.setUint16(32, numberOfChannels * 2, true);              // Block align (bytes per frame)
view.setUint16(34, 16, true);          // Bits per sample

// Data Chunk
writeString(36, 'data');
view.setUint32(40, length, true);      // Data size in bytes
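
For context, here is one way the header could fit into a complete encoder. This is a sketch under assumptions, not the project's actual code: `encodeWav16` is a hypothetical name, and it takes plain Float32Array channels (such as those returned by `getChannelData`) rather than an AudioBuffer so it can run outside the browser.

```javascript
// Sketch of a 16-bit PCM WAV encoder. `channels` is an array of
// Float32Array, one per channel, all of equal length.
const encodeWav16 = (channels, sampleRate) => {
  const numberOfChannels = channels.length;
  const frames = channels[0].length;
  const bytesPerSample = 2;
  const blockAlign = numberOfChannels * bytesPerSample;
  const dataLength = frames * blockAlign;
  const buffer = new ArrayBuffer(44 + dataLength);
  const view = new DataView(buffer);

  const writeString = (offset, text) => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  // 44-byte header, same layout as the snippet above.
  writeString(0, 'RIFF');
  view.setUint32(4, 36 + dataLength, true);
  writeString(8, 'WAVE');
  writeString(12, 'fmt ');
  view.setUint32(16, 16, true);
  view.setUint16(20, 1, true);
  view.setUint16(22, numberOfChannels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * blockAlign, true);
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, 16, true);
  writeString(36, 'data');
  view.setUint32(40, dataLength, true);

  // Interleave channels and convert float [-1, 1] to signed 16-bit.
  let offset = 44;
  for (let i = 0; i < frames; i++) {
    for (let ch = 0; ch < numberOfChannels; ch++) {
      const sample = Math.max(-1, Math.min(1, channels[ch][i]));
      const int16 = Math.round(sample < 0 ? sample * 0x8000 : sample * 0x7fff);
      view.setInt16(offset, int16, true);
      offset += bytesPerSample;
    }
  }
  return buffer;
};

const wav = encodeWav16([new Float32Array([0, 1, -1])], 44100);
console.log(wav.byteLength); // 50 (44-byte header + 3 mono samples)
```

In the browser, the returned ArrayBuffer can be wrapped with `new Blob([wav], { type: 'audio/wav' })` for download or playback.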

🎨 UI/UX Architecture

State Management

States:

'idle'       // No file loaded, or file loaded and waiting to be processed
'processing' // Audio is being processed
'ready'      // Processed audio ready for playback/download
'playing'    // Audio is currently playing

State Flow:

idle → [Upload File] → idle
idle → [Process] → processing → ready
ready → [Play] → playing → [End/Pause] → ready
ready → [Upload New] → idle
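
The state flow above can be expressed as a transition table. This is an illustrative sketch; the event names (`upload`, `process`, `done`, `play`, `end`, `pause`) are assumptions and may not match the identifiers used in the app.

```javascript
// Transition table mirroring the state flow above.
// Event names are illustrative assumptions.
const TRANSITIONS = {
  idle: { upload: 'idle', process: 'processing' },
  processing: { done: 'ready' },
  ready: { play: 'playing', upload: 'idle' },
  playing: { end: 'ready', pause: 'ready' },
};

const nextState = (state, event) => {
  const target = TRANSITIONS[state] && TRANSITIONS[state][event];
  if (!target) {
    throw new Error(`Invalid transition: ${state} + ${event}`);
  }
  return target;
};

console.log(nextState('idle', 'process'));    // processing
console.log(nextState('processing', 'done')); // ready
console.log(nextState('ready', 'play'));      // playing
```

Rejecting unknown transitions makes mistakes such as starting playback while processing fail loudly instead of leaving the UI in an inconsistent state.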

Progress Tracking

Milestones:

10%: File loaded
30%: Audio decoded
40%: Offline rendering started
70%: Offline rendering complete
90%: Normalization complete
100%: WAV conversion complete

Note: Tone.Offline doesn't provide granular progress, so we use estimated milestones.
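
One possible way to avoid a progress bar that sits at 40% for the whole render is to interpolate between the 40% and 70% milestones from elapsed time. The sketch below is an assumption-heavy estimate, not the app's implementation: `estimateRenderProgress` is a hypothetical helper, and the default speedup of 2.5 is simply the middle of the 2-3x range quoted under Performance.

```javascript
// Milestones from the list above.
const RENDER_START = 40;
const RENDER_END = 70;

// Assumption: rendering runs roughly `speedup` times faster than real time.
const estimateRenderProgress = (elapsedSeconds, audioSeconds, speedup = 2.5) => {
  if (!(audioSeconds > 0)) return RENDER_START;
  const expectedSeconds = audioSeconds / speedup;
  const fraction = Math.min(1, Math.max(0, elapsedSeconds / expectedSeconds));
  const estimate = Math.round(RENDER_START + fraction * (RENDER_END - RENDER_START));
  // Stay one point short of RENDER_END until rendering has actually finished.
  return Math.min(RENDER_END - 1, estimate);
};

console.log(estimateRenderProgress(0, 180));   // 40
console.log(estimateRenderProgress(36, 180));  // 55
console.log(estimateRenderProgress(500, 180)); // 69
```

The caller would poll this on a timer while the render promise is pending, then jump to 70% when it resolves.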

🔧 Customization Guide

Adjusting Compression Characteristics

More Aggressive Rumble Control:

low: {
  threshold: -30,  // Lower threshold
  ratio: 6,        // Higher ratio
  attack: 0.02,    // Faster attack
}

Gentler Processing:

low: {
  threshold: -18,  // Higher threshold
  ratio: 2,        // Lower ratio
  release: 0.5,    // Slower release
}
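
Because both variations list only the fields they change, they can be merged over the baseline low-band settings. The sketch below shows one way to do that; the preset names and the `buildLowBand` helper are hypothetical.

```javascript
// Baseline low-band settings from the processing chain.
const DEFAULT_LOW_BAND = { threshold: -24, ratio: 4, attack: 0.03, release: 0.25 };

// Hypothetical preset names; only the overridden fields are listed.
const LOW_BAND_PRESETS = {
  aggressive: { threshold: -30, ratio: 6, attack: 0.02 },
  gentle: { threshold: -18, ratio: 2, release: 0.5 },
};

// Unknown preset names fall back to the baseline.
const buildLowBand = (presetName) => ({
  ...DEFAULT_LOW_BAND,
  ...(LOW_BAND_PRESETS[presetName] || {}),
});

console.log(buildLowBand('gentle'));
// { threshold: -18, ratio: 2, attack: 0.03, release: 0.5 }
```

The result can be passed as the `low` option when constructing the multiband compressor.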

Changing Normalization Target

// Choose one target (linear amplitude = 10^(dB / 20)):
const targetAmplitude = 0.89;     // -1.0 dB (current)
// const targetAmplitude = 0.944; // -0.5 dB (louder, less headroom)
// const targetAmplitude = 0.794; // -2.0 dB (quieter, more conservative)
// const targetAmplitude = 0.708; // -3.0 dB (most headroom)
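
Rather than hard-coding amplitudes, the values can be derived from a dB target. The helper names below are hypothetical; the formula is the standard amplitude conversion.

```javascript
// Convert between decibels (relative to full scale) and linear amplitude.
const dbToAmplitude = (db) => Math.pow(10, db / 20);
const amplitudeToDb = (amplitude) => 20 * Math.log10(amplitude);

for (const db of [-0.5, -1, -2, -3]) {
  console.log(db, dbToAmplitude(db).toFixed(3));
}
// -0.5 0.944
// -1 0.891
// -2 0.794
// -3 0.708
```

Note that an exact -1.0 dB corresponds to about 0.891, so the 0.89 constant sits a hair below the nominal target.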

Adding LUFS Metering (Advanced)

For true loudness metering (LUFS), you would need to:

Implement BS.1770 K-weighting filter
Calculate integrated loudness over entire file
Normalize to target LUFS (e.g., -14 LUFS for streaming)

Note: This is complex and beyond the scope of this implementation, but possible with Web Audio API.
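
The last step (normalizing to a target) is simple once a loudness measurement exists. The sketch below covers only that step and caps the gain so the peak stays under a ceiling. It assumes `measuredLufs` and `peakDb` come from a separate BS.1770-style meter, which is not implemented here; `loudnessGainDb` is a hypothetical helper.

```javascript
// Gain (in dB) needed to reach a loudness target without pushing the
// peak above a ceiling. Inputs are assumed to come from an external meter.
const loudnessGainDb = (measuredLufs, peakDb, targetLufs = -14, ceilingDb = -1) => {
  const wantedGain = targetLufs - measuredLufs; // dB needed to hit the target
  const maxGain = ceilingDb - peakDb;           // dB available before the ceiling
  return Math.min(wantedGain, maxGain);
};

const gainDb = loudnessGainDb(-20, -3);
console.log(gainDb); // 2 (the target wants +6 dB, but the peak allows only +2 dB)

// Linear multiplier to apply to every sample.
const loudnessMultiplier = Math.pow(10, gainDb / 20);
```

When the peak ceiling limits the gain, as in this example, the file ends up quieter than the target unless additional limiting is applied first.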

🧪 Testing Recommendations

Test Files

Quiet vocal: Test normalization (should boost significantly)
Loud vocal: Test limiter (should catch peaks)
Bassy vocal: Test multiband compression (should tame lows)
Clean vocal: Test transparency (should sound natural)

Quality Checks

No clipping: Check waveform peaks in DAW (should be at -1.0 dB)
No pumping: Listen for artifacts in sustained notes
Low-end control: Compare low frequencies before/after
Transparency: A/B test against unprocessed (vocal should sound natural)
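
The clipping check can also be automated instead of inspected in a DAW. The sketch below measures sample peak and counts full-scale samples; `analyzePeaks` is a hypothetical helper, and it reports sample peaks only, not inter-sample (true) peaks.

```javascript
// Assumption: `channels` is an array of Float32Array, one per channel,
// e.g. collected via getChannelData in the browser.
const analyzePeaks = (channels, clipThreshold = 1.0) => {
  let peak = 0;
  let clippedSamples = 0;
  for (const data of channels) {
    for (let i = 0; i < data.length; i++) {
      const abs = Math.abs(data[i]);
      if (abs > peak) peak = abs;
      if (abs >= clipThreshold) clippedSamples++;
    }
  }
  const peakDb = peak > 0 ? 20 * Math.log10(peak) : -Infinity;
  return { peakDb, clippedSamples };
};

const report = analyzePeaks([new Float32Array([0.1, -0.89, 0.5])]);
console.log(report.peakDb.toFixed(2)); // -1.01
console.log(report.clippedSamples);    // 0
```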

Performance Testing

Small file (30 seconds): Should process in ~10-15 seconds
Medium file (3 minutes): Should process in ~60-90 seconds
Large file (10 minutes): Should process in ~3-5 minutes
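
To compare measured timings against these expectations, processing time can be converted into a real-time factor. This is a minimal sketch; `timeRender` and its `render` argument are placeholders for whatever function wraps Tone.Offline in the app.

```javascript
// Seconds of audio rendered per second of wall-clock time.
const realTimeFactor = (audioSeconds, processingSeconds) =>
  audioSeconds / processingSeconds;

// Times any async render function and reports its real-time factor.
const timeRender = async (render, audioSeconds) => {
  const start = performance.now();
  const result = await render();
  const processingSeconds = (performance.now() - start) / 1000;
  return {
    result,
    processingSeconds,
    factor: realTimeFactor(audioSeconds, processingSeconds),
  };
};

// A 3-minute file processed in 75 seconds:
console.log(realTimeFactor(180, 75)); // 2.4
```

A factor between 2 and 3 matches the ranges listed above; values near or below 1 suggest the device is struggling.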

📚 References

Audio Engineering Concepts

High-Pass Filtering: Remove subsonic frequencies
Dynamic Range Compression: Reduce level differences
Multiband Processing: Frequency-selective processing
Limiting: Prevent clipping with transparent gain reduction
Normalization: Standardize peak levels

Web Audio API

AudioContext: Main audio processing context
AudioBuffer: Container for decoded audio
AudioNode: Building block of processing chain
OfflineAudioContext: Non-real-time rendering

Tone.js Documentation

🎓 Learning Resources

Related Technologies

Web Audio API: Native browser audio processing
Tone.js: High-level Web Audio wrapper
AudioWorklet: Custom DSP in Web Audio (advanced)
WebAssembly: High-performance audio processing

🚧 Future Enhancements

Potential Features

Real-time preview: Process and preview while uploading
Batch processing: Process multiple files at once
Custom presets: Save/load processing settings
A/B comparison: Toggle between original and processed
Spectrum analyzer: Visual feedback during processing
LUFS metering: True loudness normalization
Export formats: MP3, OGG, FLAC export options
Cloud processing: Offload to server for faster processing

Technical Improvements

AudioWorklet: Move processing to separate thread
Web Workers: Parallelize normalization/WAV conversion
Progressive loading: Stream large files instead of loading entirely
Waveform visualization: Canvas-based waveform display
Undo/Redo: Processing history and rollback

💡 Tips for Audio Engineers

When to Use This Tool

✅ Good for:

Podcast vocals
YouTube voiceovers
Social media content
Quick vocal cleanup
Repeated processing of similar content with the same settings

❌ Not ideal for:

Music mixing (too aggressive)
Mastering (needs more control)
Broadcast (requires LUFS compliance)
Live performance (needs real-time processing)

Comparison to Professional Tools

Similar to:

iZotope RX's Voice De-noise
Adobe Podcast Enhance
Descript Studio Sound
Auphonic (but simpler)

Advantages:

Free and open-source
Runs entirely in browser
No server round trip (audio never leaves the browser)
Customizable code

Limitations:

Less sophisticated algorithms
No AI/ML enhancement
Limited format support
No spectral editing

📄 License & Attribution

License: MIT

Dependencies:

React: MIT
Tone.js: MIT
Lucide React: ISC
Tailwind CSS: MIT

Credits:

Tone.js by Yotam Mann
Web Audio API by W3C
Audio engineering principles: Industry standard practices

Built with ❤️ for audio engineers and content creators
