Many people have reported that the teeth and lips generated by LatentSync 1.5 are blurry. To address this issue, we trained LatentSync 1.6 on 512-resolution videos.
Notably, we did not make any changes to the model structure or training strategy; the only modification was upgrading the training dataset to 512 resolution, with the resolution parameter in the U-Net config file set to match.
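
Since the only change involves the resolution parameter, switching a config between versions amounts to editing one value. The snippet below is a minimal sketch of that edit, not the project's own tooling: the config excerpt, the `data`, `num_frames`, and `resolution` key names, and the `set_resolution` helper are all hypothetical and may not match the actual U-Net config file.

```python
import re

# Hypothetical config excerpt; the real U-Net config file may use different keys.
EXAMPLE_CONFIG = """\
data:
  num_frames: 16
  resolution: 256
"""


def set_resolution(config_text: str, resolution: int) -> str:
    """Return config_text with the first `resolution:` value replaced."""
    # Match an indented `resolution: <int>` entry at the start of a line.
    pattern = re.compile(r"^([ \t]*resolution:[ \t]*)\d+", flags=re.MULTILINE)
    updated, count = pattern.subn(
        lambda m: f"{m.group(1)}{resolution}", config_text, count=1
    )
    if count == 0:
        raise ValueError("no `resolution:` entry found in config text")
    return updated


if __name__ == "__main__":
    print(set_resolution(EXAMPLE_CONFIG, 512))
```

Editing the text in place, rather than parsing and re-serializing it, keeps comments and key order in the file untouched; a YAML library would be the more robust choice if the config is nested or the key appears more than once.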

| Original video | Lip-synced video (v1.5) | Lip-synced video (v1.6) |
| --- | --- | --- |
| demo1_input.mp4 | demo1_output_v1.5.mp4 | demo1_output_v1.6.mp4 |
| demo2_input.mp4 | demo2_output_v1.5.mp4 | demo2_output_v1.6.mp4 |
| demo3_input.mp4 | demo3_output_v1.5.mp4 | demo3_output_v1.6.mp4 |
| demo4_input.mp4 | demo4_output_v1.5.mp4 | demo4_output_v1.6.mp4 |