A small project that splits a song into smaller pieces, word by word

justlx/songsplitter

Song Extractor

This directory contains a Python script (songSplitter.py) to automatically transcribe an audio file, split it into individual word segments, and generate a JSON mapping file. This is useful for creating datasets for projects that need audio corresponding to specific words.

It was used as part of an April Fools' Day event in which Twitch chat could play a song word by word by typing messages.

Requirements

  • Python 3.x
  • OpenAI Whisper: For audio transcription.
  • pydub: For audio manipulation (splitting).
  • ffmpeg: Required by pydub for handling various audio formats (like MP3). Ensure ffmpeg is installed and accessible in your system's PATH.

You can install the Python libraries using pip:

pip install -U openai-whisper pydub
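Because pydub relies on ffmpeg being discoverable on PATH, a quick check before running the script can save a confusing failure later. The snippet below is a minimal sketch and is not part of songSplitter.py; the helper name `find_ffmpeg` is hypothetical.

```python
import shutil
from typing import Optional


def find_ffmpeg(executable: str = "ffmpeg") -> Optional[str]:
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_ffmpeg()
    if path is None:
        print("ffmpeg was not found on PATH; install it before running the script.")
    else:
        print(f"ffmpeg found at {path}")
```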

Usage

The songSplitter.py script takes the path to an audio file as input and performs the transcription and splitting process.

python extractor/songSplitter.py <audio_file_path> [options]

Arguments

  • <audio_file_path>: (Required) Path to the input audio file (e.g., rickroll.mp3).
  • -o, --output_dir: Directory to save the segmented audio files (defaults to output).
  • -j, --json_path: Path to save the output JSON mapping file (defaults to splitsong.json).
  • -m, --model: Whisper model name to use for transcription (e.g., tiny, base, small, medium, large). Defaults to medium. Larger models are more accurate but require more resources (VRAM/RAM) and time.
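For reference, the arguments listed above could be declared with `argparse` roughly as follows. This is a sketch reconstructed from the option list, not the script's actual source; the function name `build_parser` and the help strings are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(
        description="Transcribe an audio file and split it into per-word segments."
    )
    # Required positional argument: the audio file to process.
    parser.add_argument("audio_file_path", help="Path to the input audio file")
    # Optional arguments with the defaults documented above.
    parser.add_argument("-o", "--output_dir", default="output",
                        help="Directory to save the segmented audio files")
    parser.add_argument("-j", "--json_path", default="splitsong.json",
                        help="Path to save the output JSON mapping file")
    parser.add_argument("-m", "--model", default="medium",
                        help="Whisper model name (e.g., tiny, base, small, medium, large)")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```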

Example

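Let's say you have the Rick Roll song saved as rickroll.mp3 in the parent directory of the extractor directory (adjust the relative path to match the directory you run the command from). To process it using the base model, save the segments in a directory named rickroll_words, and write the mapping to rickroll_map.json: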

python extractor/songSplitter.py ../rickroll.mp3 -m base -o rickroll_words -j rickroll_map.json

Output

The script will generate:

  1. Segmented Audio Files: Inside the specified output directory (output or --output_dir), you will find numerous small MP3 files (e.g., 000.mp3, 001.mp3, 002.mp3, ...), each corresponding to a word detected in the original audio.
  2. JSON Mapping File: A JSON file (splitsong.json or --json_path) containing a list of objects, where each object maps a detected word (lowercase) to its corresponding audio segment file path.
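One way to get from a transcription to these two outputs is sketched below. It assumes the word data has the shape Whisper produces when `transcribe` is called with `word_timestamps=True` (dicts with `word`, `start`, and `end` in seconds); whether songSplitter.py works exactly this way is an assumption. The function name `plan_word_cuts` and the punctuation handling are hypothetical.

```python
def plan_word_cuts(words, output_dir="output"):
    """Turn word timestamps into JSON mapping entries and millisecond cut ranges.

    `words` is an iterable of dicts with "word", "start", and "end" (seconds).
    Returns (entries, cuts): entries match the mapping file layout, and
    cuts[i] is the (start_ms, end_ms) range for entries[i].
    """
    entries, cuts = [], []
    for item in words:
        # Normalise the token: trim whitespace and edge punctuation, lowercase it.
        text = item["word"].strip().strip('.,!?;:"').lower()
        if not text:
            continue  # skip tokens that were only punctuation
        # Files are numbered in order of kept words: 000.mp3, 001.mp3, ...
        sound = f"{output_dir}/{len(entries):03d}.mp3"
        entries.append({"word": text, "sound": sound})
        cuts.append((round(item["start"] * 1000), round(item["end"] * 1000)))
    return entries, cuts


# With pydub, each range could then be exported along these lines:
#   audio = AudioSegment.from_file(audio_file_path)
#   audio[start_ms:end_ms].export(entry["sound"], format="mp3")
```

pydub slices audio by milliseconds, which is why the ranges are converted from seconds before cutting.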

Example splitsong.json structure:

[
  {
    "word": "we're",
    "sound": "output/000.mp3"
  },
  {
    "word": "no",
    "sound": "output/001.mp3"
  },
  {
    "word": "strangers",
    "sound": "output/002.mp3"
  },
  {
    "word": "to",
    "sound": "output/003.mp3"
  },
  // ... more words
]
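A consumer of this file, such as a chat bot that plays a clip for each typed word, could index the entries by word. The sketch below is illustrative only; `index_mapping`, `load_index`, and `sounds_for_message` are hypothetical names, and picking the first clip for a repeated word is an arbitrary choice.

```python
import json
from collections import defaultdict


def index_mapping(entries):
    """Group sound paths by word; a word that occurs several times keeps every clip."""
    index = defaultdict(list)
    for entry in entries:
        index[entry["word"]].append(entry["sound"])
    return dict(index)


def load_index(json_path="splitsong.json"):
    """Read the mapping file from disk and index it by word."""
    with open(json_path, encoding="utf-8") as handle:
        return index_mapping(json.load(handle))


def sounds_for_message(message, index):
    """Return one clip path per recognised word in the message, in order."""
    sounds = []
    for token in message.lower().split():
        token = token.strip('.,!?;:"')
        if token in index:
            sounds.append(index[token][0])  # first occurrence; unknown words are skipped
    return sounds
```

Note that `json.load` expects strict JSON, so it would reject the `// ... more words` placeholder shown in the example above if that line appeared in a real file.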
