Description
We greatly appreciate the inspiring progress in bitwise autoregressive generation models, namely Infinity and InfinityStar.
During our investigation, we discovered a clear diversity degradation issue in bitwise AR modeling and identified two root causes: (1) the binary classification nature of bitwise modeling, and (2) the overconfident output distributions.
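To illustrate why the two root causes compound, here is a minimal sketch that is not taken from the DiverseAR paper or code. It assumes each bit is sampled independently from a Bernoulli distribution with probability `sigmoid(logit)`, and it computes the chance that two independent samples of the same bit disagree. The function names and the `temperature` parameter are hypothetical and serve only as a generic softening knob for the demonstration; they are not the proposed method.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def expected_bit_disagreement(logit: float, temperature: float = 1.0) -> float:
    """Probability that two independent samples of a single bit differ.

    Assumption (illustrative only): the bit is Bernoulli(p) with
    p = sigmoid(logit / temperature). Two independent draws differ
    with probability 2 * p * (1 - p), which peaks at 0.5 when p = 0.5
    and shrinks toward 0 as the prediction becomes more confident.
    """
    p = sigmoid(logit / temperature)
    return 2.0 * p * (1.0 - p)


if __name__ == "__main__":
    # Larger logit magnitude means a more confident binary prediction.
    for logit in (0.0, 2.0, 6.0):
        print(f"logit={logit:>4}: disagreement={expected_bit_disagreement(logit):.4f}")
```

Because each bit has only two outcomes, a confident prediction leaves almost no probability mass for the alternative, so repeated samples under this toy model would agree on nearly every bit.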
Building on these insights, we propose DiverseAR, a simple yet highly effective strategy that improves sampling diversity without sacrificing image or video quality. For Infinity-2B, our method raises LPIPS from 0.5555 to 0.6712 and the GenEval score from 0.72 to 0.76. For Infinity-8B, LPIPS improves from 0.3745 to 0.5510, while the GenEval score increases from 0.79 to 0.80.
We further validate DiverseAR on the bitwise AR video generation model InfinityStar, where it consistently enhances both motion diversity and content variation while preserving visual fidelity.
Video Demo:
video_demo.mp4
ArXiv: https://arxiv.org/abs/2512.02931
GitHub: https://github.com/xbyym/DiverseAR
Project Page: https://diverse-ar.github.io/