A simple script to generate 2D lip-sync frame data from:
- OpenAI Whisper JSON output
- Audio files (.wav, .ogg) via Rhubarb Lip Sync
This script supports two input modes:
- Whisper JSON mode
- Audio mode (Rhubarb)
For Whisper JSON mode, you can generate input with a command like this:
$ whisper audio.flac --model large-v3 --language English --word_timestamps True --output_format json
The script should work on all systems.
- Whisper JSON mode requires eSpeak NG.
- Audio mode requires the Rhubarb Lip Sync executable in PATH (rhubarb command available).
Install eSpeak NG according to your operating system. For Linux, it's usually a single command; for Windows, please refer to the eSpeak NG documentation.
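Before running either mode, it can help to confirm that the external tools are reachable. The following is a minimal sketch, not part of the script itself: the helper name `missing_tools` is hypothetical, and it assumes the tools are exposed as the commands `espeak-ng` and `rhubarb`. Whether Whisper JSON mode needs the `espeak-ng` command itself or only the installed library is an assumption here.

```python
import shutil


def missing_tools(mode: str) -> list[str]:
    """Return the external commands that `mode` needs but PATH does not provide."""
    # Assumed command names: "espeak-ng" for eSpeak NG, "rhubarb" for Rhubarb Lip Sync.
    required = {"whisper": ["espeak-ng"], "audio": ["rhubarb"]}
    return [cmd for cmd in required[mode] if shutil.which(cmd) is None]


if __name__ == "__main__":
    for mode in ("whisper", "audio"):
        missing = missing_tools(mode)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{mode}: {status}")
```

An empty list means every command assumed for that mode was found on PATH.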
$ # If using uv
$ uv sync
$ # If using pip
$ pip install -r requirements.txt
Whisper JSON mode:
$ python main.py audio.json -l en
Audio mode (Rhubarb):
$ python main.py audio.wav
- --frame, -f: Target frame rate in Blender (default: 30)
- --min-gap-seconds, -g: Minimum interval between keyframes in seconds (default: 0.18)
- --silence-seconds, -s: Duration of silence keyframes in seconds (default: 0.22)
- --max-duration-seconds: Maximum duration of a non-silence keyframe in seconds (default: 0, disabled)
- --viseme_map, -m: Path to the viseme mapping file (default: viseme_map.json)
- --language, -l: Language code, zh for Chinese, en for English (default: en)
- --output, -o: Path to the output keyframe data file (default: output.txt)
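To illustrate how the frame rate and the minimum gap could interact, here is a rough sketch. It is one plausible reading of the options above, not the script's actual algorithm: the function `to_frames` and the `(time, viseme)` input shape are hypothetical, and the rule of dropping any event that falls too close to the last kept one is an assumption.

```python
def to_frames(events, frame_rate=30, min_gap_seconds=0.18):
    """Convert (time_seconds, viseme) pairs into (frame, viseme) keyframes.

    Events closer than min_gap_seconds to the last kept event are dropped.
    """
    keyframes = []
    last_time = None
    for time_s, viseme in sorted(events):
        if last_time is not None and time_s - last_time < min_gap_seconds:
            continue  # too close to the previous keyframe, skip it
        keyframes.append((round(time_s * frame_rate), viseme))
        last_time = time_s
    return keyframes


events = [(0.0, "A"), (0.1, "E"), (0.4, "O"), (0.8, "A")]
print(to_frames(events))  # [(0, 'A'), (12, 'O'), (24, 'A')]
```

With the defaults, the event at 0.1 s is dropped because it is only 0.1 s after the first one, and the remaining times map to frames 0, 12, and 24 at 30 fps.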
Rhubarb-specific behavior:
- If the input file extension is .wav or .ogg, the script runs Rhubarb mode automatically.
- If --viseme_map is left as default (viseme_map.json) and rhubarb_map.json exists, the script automatically uses rhubarb_map.json.
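The two rules above can be sketched as follows. This is an illustration only: the function names, the `base_dir` parameter, and the case-insensitive extension check are assumptions, not taken from main.py.

```python
from pathlib import Path

AUDIO_EXTENSIONS = {".wav", ".ogg"}
DEFAULT_MAP = "viseme_map.json"
RHUBARB_MAP = "rhubarb_map.json"


def select_mode(input_path: str) -> str:
    """Pick Rhubarb mode for audio input, Whisper JSON mode otherwise."""
    # Lowercasing the suffix (so .WAV also matches) is an assumption.
    suffix = Path(input_path).suffix.lower()
    return "rhubarb" if suffix in AUDIO_EXTENSIONS else "whisper"


def select_viseme_map(mode: str, viseme_map: str = DEFAULT_MAP, base_dir: str = ".") -> str:
    """Swap in rhubarb_map.json only when the user kept the default map."""
    rhubarb_map_exists = (Path(base_dir) / RHUBARB_MAP).exists()
    if mode == "rhubarb" and viseme_map == DEFAULT_MAP and rhubarb_map_exists:
        return RHUBARB_MAP
    return viseme_map


print(select_mode("audio.wav"))   # rhubarb
print(select_mode("audio.json"))  # whisper
```

An explicitly passed mapping file is always respected; the fallback only applies when the default is left untouched.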
You need to switch to the Scripting tab in Blender and run the script once to enable the side panel.
The panel has 3 parts:
- Right-click on the property you want to keyframe (e.g., mouth shape index), select Copy Full Data Path, and paste it here.
- Adjust the keyframe offset if needed.
- Specify the generated keyframe data file.
- Whisper JSON mode:
  - viseme_map.json is designed for poimiku mouth textures.
  - viseme_map2.json is designed for default Uma Musume mouth textures.
- Rhubarb mode:
  - rhubarb_map.json is designed for poimiku mouth textures.
  - rhubarb_map2.json is designed for default Uma Musume mouth textures.
- poimiku for providing mouth textures and support.
- Blender Lip Sync Addon by Charley 3D
- Improved viseme generation algorithm, added Pinyin-to-phoneme conversion for Chinese.
- Added --language parameter to specify language (zh for Chinese, en for English).
- Fixed issues where --output and input data were not working correctly.
- Improved documentation.
- Fixed missing silence keyframes.
- Handled segments where not all keyframes could be placed by implementing frame-skipping instead of truncation.
- Updated Python version in .python-version to 3.10.
- Updated 示例.blend.
- Initial release with basic functionality.