Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual information for complex reasoning, planning, and generation.
Updated Mar 9, 2026
[ICLR 2026]🌴 ARES is an open-source framework for adaptive multimodal reasoning, featuring a two-stage pipeline—Adaptive Cold-Start and Entropy-Shaped Policy Optimization—to balance reasoning depth and efficiency.