Skip to content

dotslashgabut/immersive-audio-player-lyric-video-maker

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

174 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Immersive Audio Player & Lyric Video Maker

Version 2.3.16

Immersive Audio Player & Lyric Video Maker is a powerful all-in-one web tool designed for music lovers and content creators. It combines a distraction-free audio player with a professional-grade timeline editor, allowing you to create stunning, synchronized lyric videos directly in your browser. Whether you want to enjoy your local music library with beautiful visuals or create viral content for TikTok, Instagram, and YouTube, this app delivers high-quality results without watermarks or server uploads.

Online version : via GitHub Page | Vercel | Netlify | Huggingface Space

Immersive Audio Player Main Screen

Immersive Audio Player Main Screen

Key Features

🎧 Advanced Audio Player & Playlist

  • Smart Playlist: Drag & drop multiple files. Automatically groups audio/video files with matching lyric files (.lrc, .srt, .vtt, .ttml, .xml) based on filename.
  • Broad Audio Format Support: Handles MP3, M4A/ALAC, FLAC, OGG Vorbis, OPUS, WAV, and video files. Automatically extracts embedded metadata (title, artist, cover art) and embedded lyrics from all supported formats.
  • Interactive Lyric Timeline: Each playlist item shows a mini-timeline of lyrics. Click any line to play that specific track starting from that time.
  • Sorting & Management: Sort by Filename, Artist, Title, Album (Ascending/Descending), or Shuffle.
  • Repeat Modes: Cycle between Off, Repeat One, Play All (Play playlist once), and Repeat All (Loop playlist).
  • Immersive Mode: UI controls automatically fade out when idle for a distraction-free experience.
  • Minimal Mode Interaction: Double-click or double-tap anywhere on the screen to quickly exit Minimal Mode.
  • Smart Play/Pause: Clicking the active track in the playlist toggles Play/Pause.
  • Unified Uploads: Audio files and lyrics uploaded from the main interface are instantly auto-synced to your active playlist queue.
  • Auto-Load: Automatically cues the first track for playback when adding new audio files.
  • Export Playlist: Export your current playlist to .m3u8 format.

Playlist Management

📝 Synchronized Lyrics

  • Multi-Format Support: Compatible with .lrc (Standard & Enhanced with newline support), .srt (Subtitle style), .vtt (WebVTT with Word-Level timestamps), and .ttml / .xml (Word-level precision).
  • Auto-Scroll & Centering: Lyrics scroll automatically and active lines are perfectly centered.
  • Lyric Offset Control: Fine-tune sync issues with +0.1s / -0.1s buttons directly in the player.
  • Click-to-Seek & Copy: Click any lyric line to instantly jump to that exact time. Click the active line to copy its text to the clipboard.
  • Adjustable Font Size: Use + / - hotkeys to adjust lyric size on the fly.
  • Visual Presets: Choose from a wide variety of display styles:
    • Standard: Default, Big Text, Big Text (UP), Big Center (UP).
    • Thematic: Metal, Kids, Sad, Romantic, Tech, Gothic, Classic Serif, Monospace.
    • Social:
      • One Line / One Line (UP): Minimalist single-line display (Modern TikTok/Reels style).
      • Slideshow: Minimalist line-by-line.
      • Subtitle: Cinematic bottom-centered text (No song metadata).
      • Just Video: Hides text content.
      • None: Hides everything (Lyrics, Titles, and Metadata).
    • Karaoke Styles: Neon, Glow, Bounce, Wave, and Color/Shape highlights (Pill, Box, Rounded) with customizable colors. Extended with 20+ new word-level effects: Scale, Underline, Outline, Shadow, Gradient, Glass, Neon Multi, Soft Glow, 3D, Emboss, Chrome, Gold, Fire, Frozen, Rainbow, Mirror, VHS, Retro, Cyberpunk, Hologram, Comic, Glitch Text, Pulse, Breathe, and Float.
    • Text Animations: Bounce, Pulse, Wave, Glitch, Shake, Typewriter, Heartbeat, Tada, Jello, and more.
    • Display Modes: Control lyric visibility context and Style Target (Apply styles to Active Line only or All Lines).
  • AI Transcription: Built-in integration with Google Gemini (supports Gemini 2.5 Flash and 3.0 Flash Preview) for automatic high-accuracy audio transcription.
    • Mixed-Language Support: Expertly handles code-switching (e.g., K-Pop, J-Pop with English) and preserves native scripts (No forced transliteration).
    • Line/Word: Use Line for standard formats, and Word for precise karaoke highlighting (Enhanced).
  • Export Lyrics: Save your transcribed or edited lyrics as .lrc (Standard or Enhanced/Karaoke), .srt, .vtt (Standard or Karaoke), .json, or .txt files.
  • Online Lyric Search: Built-in search engine to find and download synchronized lyrics from LRCLIB with a single click. Netease and Musixmatch (maybe not working).
  • Manual Management: Manually load external .lrc / .srt files or clear existing lyrics for any track.
  • Embedded Lyrics Extraction: Automatically reads lyrics embedded inside audio files on import. Supports:
    • MP3: ID3v2 USLT (Unsynchronized Lyrics) tag.
    • M4A / ALAC: MP4 ©lyr atom.
    • FLAC: Vorbis Comment LYRICS / UNSYNCEDLYRICS fields (via custom binary parser).
    • OGG Vorbis / OPUS: Vorbis Comment lyrics (via custom binary parser).
    • WAV: RIFF LIST INFO lyrics chunks.
    • Reload Button: Re-extract embedded lyrics at any time using the ↻ button in the playlist.

🎬 Professional Visual Timeline Editor

Create complex visual stories synchronized to your music.

  • Multi-Track Layering:
    • 2 Visual Tracks: Layer videos and images (Background + Overlay).
    • 2 Audio Tracks: Layer sound effects or secondary audio.
  • Full Editing Suite:
    • Undo/Redo (Ctrl+Z / Ctrl+Y): Never worry about mistakes.
    • Copy/Cut/Paste (Ctrl+C / Ctrl+X / Ctrl+V): Duplicate slides easily.
    • Snapping: Clips snap to each other, lyric timestamps, grid lines, and playhead for pixel-perfect timing.
    • Video Looping: Automatically loops video clips when extended beyond their original duration.
  • Advanced Selection:
    • Drag Selection: Click and drag on the timeline background to interpretively select multiple clips.
    • Shift + Click for range selection.
    • Ctrl + Click for multi-selection.
  • Precise Control:
    • Arrow Keys: Nudge selected clips by 0.1s.
    • Lyric Navigation: Click any lyric block on the timeline to jump the player to that exact time.
    • Volume Mixing: Independent mute/volume control for every video and audio clip on the timeline.
    • Playback Speed: Precise manual control for playback rate (0.25x - 2.0x) on individual video and audio clips.
  • Visual Effects:
    • Background Blur: Toggle between 'Sharp' and 'Blur' for background media.

Timeline Editor

🎥 Content Creation & Export

  • Flexible Render Settings:
    • Render Engine:
      • Browser Recorder: Fast, real-time capture.
      • WebCodecs (New): Hardware-accelerated rendering for maximum speed and performance.
      • FFmpeg WASM: Professional frame-by-frame rendering (No dropped frames).
    • Resolution: 720p / 1080p.
    • Frame Rate: Support for 24, 30, and 60 FPS.
    • Quality: Adjustable bitrate presets (Low, Med, High).
    • Visual Toggles: Quick toggle for Background Blur with Adjustable Blur Strength directly in the export panel.
    • Codecs: Full control over output format, supporting H.264 (MP4), VP9 (WebM), and AV1 (where supported).
  • Aspect Ratios:
    • 16:9 (Landscape - YouTube)
    • 9:16 (Vertical - TikTok/Reels)
    • 1:1 (Square - Instagram)
    • 4:5 (Portrait - Instagram Feed)
    • 3:4 (Portrait - Standard)
    • 4:3 (Landscape - Classic TV)
    • 2:3 (Portrait - Digital Photography)
    • 3:2 (Landscape - Classic Photo)
    • 2:1 (Landscape - Panoramic)
    • 20:9 (Landscape - Modern Mobile, Cinematic)
    • 21:9 (Landscape - Ultra Widescreen)
    • 1:2 (Vertical - Split Screen)
  • Rendering Scope: Batch export the entire playlist or just the current song.
  • Settings Management:
    • JSON Import/Export: Backup, share, or restore your exact render configurations (including custom fonts, colors, and layouts) with a single click.
    • Reset to Defaults: Improved reset functionality to verify and clear all custom font selections and return to factory defaults.
  • Audio Visualization:
    • Live & Export: Real-time frequency analysis that works in the player AND is baked into your exported videos (WebCodecs/FFmpeg).
    • Visualizer Types: Bars, Wave, Waveform (Filled), Spectrum, Spectrogram, Stereo Field, Circular, Pulse Ring, and Particles.
    • Customization: Full control over Position (Bottom, Center, Full), Colors (Gradient, Rainbow, Accent), Opacity, and Sensitivity.
  • Advanced Backgrounds:
    • Timeline Media: Use your custom images and videos from the timeline.
    • Custom Image: Direct support for single custom background images.
    • Smart Gradient: Generate beautiful gradients from a single color.
    • Custom Gradient: Manually define complex linear gradients.
    • Solid Color: Simple, clean backgrounds.
    • Gradient Overlay: Optional bottom-up fade for better text readability.
    • Real Color Media: Toggle to disable the default dark dimming overlay, allowing background visual colors to pop.
    • Three.js 3D Background (New): 15 fully animated, WebGL-powered 3D scenes — Stars, Cubes, Waves, Particles, Galaxy, DNA Helix, Aurora, Matrix, Nebula, Rings, Tunnel, Warp, CyberGrid, Vortex, and Crystals — with customizable color, speed, and cinematic auto-camera movement. Fully composited into video exports.
  • Lyric Display Control:
    • Modes: Show All, Previous/Next, Next Only, or Active Line Only highlighting.
    • Visibility: Toggle Lyrics, Title, Artist, Cover Art, and Intro overlays independently.
    • Intro Overlay:
      • Modes: Toggle 'Auto' (MetaData triggers on every song) or 'Manual' (Custom text triggers only on first playlist song).
  • Typography & Visual Effects:
    • Font Freedom: Huge library of built-in fonts (Sans, Serif, Display, Handwriting, etc.), Custom Font Upload, and Individual Google Font loading for separate elements.
    • Text Styling: Controls for alignment, vertical position (adjustable Top/Bottom margins), size, and color.
    • Text Transformations: Uppercase, Lowercase, Title Case, Sentence Case, and Invert Case overrides.
    • Effects Library:
      • Text Effects: Shadow, Glow, Neon, 3D Pop, Glitch, Retro, and more.
      • Animations: Pulse, Bounce, Wave, Shake, Typewriter, Heartbeat, Flash, Spin, and more.
      • Transitions: Smooth Fade, Slide, Zoom, Flip, Motion Blur, Spiral, Shatter, and None (Instant Cut).
  • Song & Channel Info Customization:
    • Positioning: 9-point grid automatic positioning (Top-Left to Bottom-Right).
    • Song Styles: Classic (Detailed), Modern + Cover, Circle Cover, Boxed Cover, Minimal (Text Only).
    • Channel Styles: Classic, Modern, Minimal, Logo Only (SVG Support), Boxed.
    • Typography Control: Custom Font Family, Weight (Bold), Style (Italic), and Custom Color for both Channel and Song Info.
    • Padding: Fine-tune margin from screen edges for both Song Info and Channel Info (Visible in Minimal Mode).
  • Smart Overlays: Automatically generates metadata overlays for "Now Playing" visuals.
  • Dynamic Blur: Apply real-time Gaussian-style blur to background images or videos for better lyric readability.
  • High-Fidelity Rendering: Native Canvas 2D rendering allows for pixel-perfect text at any frame rate.

Render Settings Panel

Render Settings Panel Output Settings

Export Interface

🛠 Technical Architecture

  • Dual Rendering Engine:
    • MediaRecorder (Real-time): Captures the canvas stream in real-time. Fast for simple exports.
    • WebCodecs API: Uses the modern VideoEncoder API to access hardware encoding capabilities directly from the browser, offering a middle-ground between speed and quality.
    • FFmpeg WASM (Offline/High-Quality): Client-side video encoding using FFmpeg compiled to WebAssembly. Supports frame-by-frame rendering for perfect synchronization and higher bitrates, regardless of computer speed.
    • Robust Path Discovery: Automatically probes multiple local paths (Absolute, Relative, BaseURL, Origin) to find core binaries in any hosting environment.
    • Binary Integrity: Uses binary signature verification (\0asm check) to prevent application-level redirects from being mistaken for valid assets.
    • Offline Persistence: Leverages PWA Service Workers with an expanded 50MB cache limit to ensure professional rendering is available anywhere.
    • Smart Fallback: Automatically switches between Multi-threaded (SharedArrayBuffer) and Single-threaded cores based on browser capabilities and server headers (COOP/COEP).
  • Typography Engine:
    • Google Fonts Offline: Integrated Workbox runtime caching to ensure all typography presets and custom Google Fonts are preserved for offline content creation.
  • Rendering Pipeline: Highly optimized Canvas 2D pipeline. Text rendering is handled natively by the browser, ensuring perfect clarity. Advanced effects (Fire, VHS, etc.) are simulated in Canvas to match CSS visuals during export.
  • OffscreenCanvas Ready: The application structure decouples rendering logic from the UI thread (utils/canvasRenderer.ts). This architecture supports OffscreenCanvas and Web Workers.
  • WebGPU Note: While not currently used, the modular design allows for a hybrid approach in the future (e.g., using WebGPU for background visualizers while keeping Canvas 2D for crisp text overlays).

Related Tools

Check out our other AI-powered tools for music and visuals:

Keyboard Shortcuts

Key Function
N Play Next Song
B Play Previous Song
Space / K Play / Pause
Arrow Left / Right Rewind / Forward 5s (Player) OR Nudge Slide (Editor)
Arrow Up / Down Navigate Playlist Tracks (Playlist) OR Scroll Lyrics (Player)
M Toggle Mute
R Toggle Repeat Mode (Off -> All -> One)
J Next Visual Preset
+ / - Increase / Decrease Lyric Font Size
S Stop & Reset
X Split Clip at Playhead (Timeline Editor)
H Toggle UI Auto-Hide Inhibit
G Cycle Lyric Display Mode
C Cycle Text Case
L Toggle Playlist View
T Open/Close Timeline Editor
D Toggle Render Settings Panel
I Toggle Header Info
O Toggle Minimal Mode
Y Toggle Shortcut Info Overlay
P Toggle Player Controls
X Toggle Highlight Effect On/Off
Z Cycle Next Highlight Effect
F Toggle Fullscreen
Delete Remove selected Playlist Item or Visual Slide
Ctrl + Z / Y Undo / Redo (Editor)
Ctrl + C / X / V Copy / Cut / Paste (Editor)
Escape Abort Video Rendering / Close Modal
Enter Confirm Dialog Action
Ctrl/Cmd + Shift + E Export / Render Video

Mouse & Touch Gestures

Action Function
Double Click / Tap Toggle Minimal Mode (O)
Click Lyric Line Seek to that timestamp
Click Active Lyric Copy lyric text to clipboard
Drag Timeline Select multiple clips (Editor)
Shift + Click Range Selection (Editor)
Ctrl + Click Multi-Selection (Editor)

Installation

This application is built with React + Vite and requires Node.js to run.

  1. Prerequisites: Install Node.js (LTS version recommended).
  2. Install Dependencies:
    npm install
  3. Start the Server:
    npm run dev
  4. Launch: Open the URL shown in your terminal (usually http://localhost:5173) in Chrome or Edge.

Configuration (Optional)

To use the AI Transcription feature, you need a Google Gemini API Key.

  1. Get a free API Key from Google AI Studio.
  2. Create a file named .env.local in the root directory.
  3. Add your API key:
    GEMINI_API_KEY=your_api_key_here
    Alternatively, you can enter the API Key directly in the application UI (Config Icon).
  4. Restart Server: If the server is running, restart it (Ctrl+C then npm run dev) to load the new environment variables.

How to Use

  1. Build a Playlist:
    • Click "Add Audio & Lyrics" in the Playlist panel.
    • Select multiple audio files (MP3, M4A, FLAC, OGG, OPUS, WAV) and lyric files at once. The app will pair them automatically.
    • Embedded lyrics are extracted automatically from all supported formats on import.
  2. Design Visuals:
    • Press T to open the Timeline.
    • Drag & Drop media or use "Import Media".
    • Use Ctrl+C / Ctrl+V to duplicate sequences.
  3. Export Video:
    • Choose your resolution and aspect ratio (e.g., 9:16 for TikTok).
    • Click the Video Camera icon.
    • Wait for the render to complete (Video plays in real-time).

Panduan Pengguna (Bahasa Indonesia)

Fitur Utama Baru

  • Tipografi Super Canggih: Load Google Fonts Individu secara terpisah untuk Lirik, Info Lagu, dan Nama Channel. Termasuk kontrol granular untuk gaya (Bold/Italic) dan warna kustom tiap elemen.
  • Rendering Kilat (WebCodecs): Engine baru memanfaatkan akselerasi hardware (GPU) untuk ekspor video hingga 5x lebih cepat. Tetap tersedia opsi FFmpeg untuk kualitas frame-by-frame.
  • Manajemen Pengaturan Lengkap: Ekspor/Impor JSON untuk menyimpan/membagikan konfigurasi render (termasuk font kustom). Tombol Reset yang disempurnakan.
  • Playlist Pintar Terpadu: File yang diupload dari menu utama kini otomatis langsung tersinkron dan masuk ke dalam Playlist. Masukkan banyak file sekaligus via Playlist untuk otomatis dipasangkan dengan lirik berdasarkan nama.
  • Ekstraksi Lirik Tersemat: Fitur canggih untuk membaca lirik yang tertanam langsung di dalam file audio (FLAC Vorbis Lyrics, MP4 Lyrics, ID3v2 USLT, dll) secara otomatis saat import.
  • Timeline Lirik Interaktif: Lihat cuplikan lirik di playlist. Klik baris mana saja untuk langsung memutar lagu itu.
  • Format Luas: Dukungan lirik .lrc, .srt, .vtt (termasuk Karaoke/Word-Level), dan .ttml / .xml.
  • Support Video: Load file video (MP4, WebM) sebagai track audio + background otomatis.
  • Karaoke Kustom: Pilihan efek highlight baru (Neon, Glow, Box, Pill) dengan Pewarnaan Kustom (Text & Background Color).
  • Visual Efek: Fitur Background Blur (dengan Slider Kekuatan Blur), Smart Gradient, dan Text Animation (Bounce, Pulse, Glitch, dll).
  • Kustomisasi Info Lagu: Atur posisi, gaya tampilan, jarak tepi (padding), serta Intro teks (Auto/Manual).
  • AI Transkripsi: Dukungan model Gemini 2.5 Flash dan 3.0 Flash Preview untuk transkripsi audio otomatis yang presisi.
  • Ekspor Lirik: Simpan hasil transkripsi dalam format .lrc, .srt, .vtt, .json, atau .txt.
  • Timeline Canggih: Dukungan Multi-Layer (2 Visual + 2 Audio), Undo/Redo, Cut/Copy/Paste, Drag Selection, dan Snapping otomatis.
  • Shortcut Baru: Tekan D (Render Settings), O (Minimal Mode), Y (Info Shortcut), N/B (Next/Prev), M (Mute), R (Repeat), L (Playlist), +/- (Font), dan Ctrl/Cmd+Shift+E (Export).

Cara Install & Jalankan

Aplikasi ini berbasis web modern (React + Vite).

  1. Persiapan: Pastikan PC kamu sudah terinstall Node.js (Versi LTS disarankan).
  2. Install Dependensi: Buka terminal/CMD di folder project, lalu ketik:
    npm install
  3. Jalankan Aplikasi:
    npm run dev
  4. Buka Browser: Akses alamat lokal yang muncul (biasanya http://localhost:5173) menggunakan Chrome atau Edge.

Alur Kerja (Workflow)

  1. Siapkan Playlist:
    • Klik "Add Audio & Lyrics" atau cukup Drag & Drop file audio (MP3, M4A, FLAC, OGG, OPUS, WAV) dan file lirik kamu.
    • File akan otomatis berpasangan jika namanya sama, dan lirik tersemat akan diekstrak secara otomatis.
  2. Desain Visual (Editor Timeline):
    • Tekan T untuk membuka Timeline Editor.
    • Drag gambar/video ke track visual.
    • Gunakan Ctrl+C / Ctrl+V untuk duplikasi slide, dan Drag ujung klip untuk memanjangkan durasi (Video otomatis looping!).
  3. Render Settings
    • Tekan D untuk membuka Render Settings.
    • Pilih Preset tampilan yang sesuai.
    • Atur Desain Visual (Background, Text, dan lainnya).
    • Atur Layout Logo Channel, Nama Channel, Lirik, dan Info Lagu.
  4. Ekspor Video:
    • Tekan Ctrl/Cmd + Shift + E atau klik ikon Video.
    • Pilih Render Engine:
      • WebCodecs: Paling ngebut (menggunakan GPU). Cocok untuk preview cepat & upload sosmed.
      • FFmpeg: Kualitas maksimal (Frame-by-Frame). Gunakan jika butuh presisi tinggi.
    • Tentukan Resolusi (720p/1080p) dan Format Rasio (misal 9:16 untuk TikTok).
    • Klik Start Render dan tunggu hingga selesai.

About

Immersive Audio Player & Lyric Video Maker | A modern, single-page web application that transforms your music listening into a visual experience. Support LRC, Enhanced LRC, TTML, VTT, SRT formats.

Topics

Resources

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages