LatentSync 1.6

Many users have reported that LatentSync 1.5 generates blurry teeth and lips. To address this, we trained LatentSync 1.6 on 512 $\times$ 512 videos.

Notably, we did not make any changes to the model structure or training strategy; the only modification was upgrading the training dataset to 512 $\times$ 512 videos. Therefore, the current code is compatible with both LatentSync 1.5 and 1.6. To switch between versions, you only need to load the corresponding checkpoint and modify the resolution parameter in the U-Net config file.
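As a rough illustration of the config change described above, the sketch below overrides a resolution value in an already-loaded U-Net config. It is not taken from the LatentSync codebase: the helper `with_resolution`, the `data.resolution` key layout, the `num_frames` entry, and the starting value of 256 are all assumptions made for the example, so check the actual config file for the real key names.

```python
import copy


def with_resolution(unet_config: dict, resolution: int) -> dict:
    """Return a copy of the config with the resolution replaced.

    Hypothetical helper: the nested "data" -> "resolution" layout is an
    assumption about how the U-Net config is organized.
    """
    if resolution <= 0:
        raise ValueError("resolution must be a positive integer")
    updated = copy.deepcopy(unet_config)  # leave the caller's config untouched
    updated.setdefault("data", {})["resolution"] = resolution
    return updated


# Assumed stand-in for a config parsed from the U-Net config file.
base_config = {"data": {"resolution": 256, "num_frames": 16}}

# Config to pair with the LatentSync 1.6 checkpoint (trained on 512 x 512 videos).
v16_config = with_resolution(base_config, 512)
```

Returning a copy rather than editing in place keeps both variants available, which can help when comparing outputs from the two checkpoints side by side.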

LatentSync 1.6 Demo

Original video	Lip-synced video (v1.5)	Lip-synced video (v1.6)
demo1_input.mp4	demo1_output_v1.5.mp4	demo1_output_v1.6.mp4
demo2_input.mp4	demo2_output_v1.5.mp4	demo2_output_v1.6.mp4
demo3_input.mp4	demo3_output_v1.5.mp4	demo3_output_v1.6.mp4
demo4_input.mp4	demo4_output_v1.5.mp4	demo4_output_v1.6.mp4
