A proof-of-concept library to enable browser playback of multi-audio track MP4 files via the Web Audio API.
The library uses:
- RxJS for timing and event management
- MP4Box.js for MP4 file parsing
- Web Audio API for the audio output interface
The initial idea for this project came from watching streamers constantly struggle to balance their audio in a way that would satisfy all their viewers.
The obvious solution that came to mind was to stream multitrack media and allow for client-side audio mixing, especially since popular streaming protocols today (namely HLS and MPEG-DASH) support delivering media segments in (fragmented) MP4, and MP4 files can, and often do, hold multiple audio tracks. However, despite multitrack media being listed in WHATWG's media element specification, built-in browser support for it is still limited, with the Chromium team having indefinitely postponed implementing multitrack playback due to high cost and low priority (see 1 and 2).
I thus found myself interested in exploring other ways in which multi-audio track files could be demuxed, mixed, and played on the client side, which led me to build this library using the Web Audio API.
See examples here.
```js
import { ScheduledBuffersPlayer } from 'multitrack-player';

// Create player from local MP4 file
// (e.g. a File obtained from an <input type="file"> element)
const file = ...
const player = await ScheduledBuffersPlayer.fromBlob(file);

// Connect track outputs to desired destination node
// (Track outputs are initially not connected to anything)
for (const track of Object.values(player.tracks)) {
  track.output.connect(player.context.destination);
}
// Link player methods to your controls
myPlayButton.onclick = () => player.play();

myTrackOffsetSlider.onchange = () => {
  player.tracks[targetTrackId].setTrackState({
    // Range inputs report their value as a string, so convert it to a number
    offset: Number(myTrackOffsetSlider.value),
  });
};
```

View source code for exhaustive, up-to-date definitions.
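For example, to enable per-track volume mixing (the motivating use case), each track's output can be routed through its own gain node before reaching the destination. This is a minimal sketch using only the standard Web Audio GainNode together with the player.tracks and track.output properties shown above; trackGains, myVolumeSlider, and targetTrackId are illustrative names, not part of the library.

```js
// Route every track through its own gain node so volumes can be mixed per track
const trackGains = {};
for (const [trackId, track] of Object.entries(player.tracks)) {
  const gain = player.context.createGain();
  gain.gain.value = 1; // unity gain by default
  track.output.connect(gain).connect(player.context.destination);
  trackGains[trackId] = gain;
}

// Wire a volume slider (range 0 to 1) to one track's gain
myVolumeSlider.oninput = () => {
  trackGains[targetTrackId].gain.value = Number(myVolumeSlider.value);
};
```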
The static fromBlob method accepts several configuration options when creating the player.
All properties described below are optional.
```js
// Custom AudioContext (optional)
const audioContext = new AudioContext();

const player = await ScheduledBuffersPlayer.fromBlob(
  file,
  {
    // View PlayerConfig definition for more details
    config: {
      // Limit each track's offset to a maximum of 1 second relative to the
      // player's playback position
      maxOffset: 1,
      // Maintain a 5-second buffer ahead of the leadingmost potential
      // playback position,
      // i.e. read the file up to {playback position + max offset + 5} seconds
      readahead: 5,
      // Read 5 seconds' worth of data per file read operation
      secondsPerRead: 5,
      // Wait readahead / bufferCheckFrequency seconds (here, 1 second) before
      // checking again whether the next file chunk must be read
      bufferCheckFrequency: 5,
      // Let the parser parse 10 samples at a time
      samplesPerExtract: 10,
    },
    // The AudioContext used by the player and its AudioNodes
    context: audioContext,
    // Whether the player should log less important messages to the console
    verbose: false,
  },
);
```
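Note that browsers typically create an AudioContext in the suspended state until a user gesture occurs. This applies whether you pass in your own context or rely on the one the player uses by default; the sketch below (reusing myPlayButton from the earlier example) simply resumes the context from a click handler before starting playback.

```js
// Standard Web Audio API pattern (not specific to this library): resume a
// suspended AudioContext from a user-initiated event before playing
myPlayButton.onclick = async () => {
  if (player.context.state === 'suspended') {
    await player.context.resume();
  }
  player.play();
};
```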
```js
// Play
player.play();
// Pause
player.pause();
// Use setPlaybackState to set playback position, playback rate, or
// multiple playback parameters simultaneously
player.setPlaybackState({
  paused: false,
  playbackPosition: 3.14159,
  playbackRate: 1.23,
});
// Inspect playback state
const { paused, playbackPosition, playbackRate } = await player.getPlaybackState();
// Get audio tracks, mapped by track ID
// { [trackId: number]: Track }
const tracks = player.tracks;
```
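Since getPlaybackState is asynchronous, one simple way to keep UI elements such as a seek bar in sync with the player is to poll it periodically. This is only a sketch; mySeekBar and the 250 ms interval are assumptions, not part of the library.

```js
// Periodically reflect the player's playback position in a hypothetical seek bar
setInterval(async () => {
  const { paused, playbackPosition } = await player.getPlaybackState();
  if (!paused) {
    mySeekBar.value = playbackPosition;
  }
}, 250);
```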
```js
// Get track whose ID is 2 in the source MP4 file
const track = player.tracks[2];
// Get the output AudioNode for this track
const outputNode = track.output;
// Example of passing track output to a custom processing node before
// sending the signal to the AudioContext output
const myProcessingNode = ...
outputNode
  .connect(myProcessingNode)
  .connect(player.context.destination);
// Set the track's offset relative to the player's internal playback position
// (offset magnitude may not exceed config.maxOffset)
track.setTrackState({ offset: 4 }); // 4 second delay
// Inspect track state
const { offset } = await track.getTrackState();
```
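As a concrete stand-in for the myProcessingNode placeholder above, the following sketch pans a single track (say, a commentary track) to one side of the stereo field using a standard StereoPannerNode; the choice of node is purely illustrative.

```js
// Pan this track's output fully to the left before it reaches the destination
const panner = player.context.createStereoPanner();
panner.pan.value = -1;

track.output
  .connect(panner)
  .connect(player.context.destination);
```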