    Repositories list

    • EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing [ICLR 2026]
      Python
      412060 Updated Feb 6, 2026
    • Context-Forcing

      Public
      Consistent Autoregressive Video Generation with Long Context
      14210 Updated Feb 6, 2026
    • VisualWebInstruct

      Public
      The official repo for "VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search" [EMNLP25]
      Python
      13800 Updated Feb 1, 2026
    • ImagenWorld

      Public
      Stress-Testing Image Generation Models with Explainable Human Evaluation on Open-ended Real-World Tasks [ICLR 2026]
      Python
      02400 Updated Jan 29, 2026
    • VisCoder2

      Public
      The official code of "VisCoder2: Building Multi-Language Visualization Coding Agents" [ICLR26]
      Python
      0900 Updated Jan 28, 2026
    • VLM2Vec

      Public
      This repo contains the code for "VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks" [ICLR 2025]
      Python
      50565240 Updated Jan 27, 2026
    • BrowserAgent

      Public
      BrowserAgent: Building Web Agents with Human-Inspired Web Browsing Actions [TMLR2025]
      Python
      42900 Updated Jan 13, 2026
    • StructEval

      Public
      Evaluating LLMs' abilities to generate structural output [TMLR2025]
      Python
      52211 Updated Jan 13, 2026
    • verl-tool

      Public
      A version of verl to support diverse tool use
      Python
      7386251 Updated Jan 6, 2026
    • Mantis

      Public
      Official code for Paper "Mantis: Multi-Image Instruction Tuning" [TMLR 2024 Best Paper]
      Python
      2223941 Updated Jan 3, 2026
    • Automatic Metric for Evaluating Generated Videos
      Python
      13250 Updated Dec 8, 2025
    • official repo for "VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation" [EMNLP2024]
      Python
      511130 Updated Dec 4, 2025
    • ImagenHub

      Public
      A one-stop library to standardize the inference and evaluation of all the conditional image generation models. [ICLR 2024]
      Python
      1917820 Updated Dec 2, 2025
    • General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25]
      Python
      1221720 Updated Nov 27, 2025
    • MMLU-Pro

      Public
      The code and data for "MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark" [NeurIPS 2024]
      Python
      5433580 Updated Nov 22, 2025
    • Pixel-Reasoner

      Public
      Pixel-Level Reasoning Model trained with RL [NeurIPS25]
      Python
      1227640 Updated Nov 6, 2025
    • QuickCodec

      Public
      A More Efficient Video Codec
      Cython
      0800 Updated Nov 2, 2025
    • Quick Long Video Understanding [TMLR2025]
      Python
      67520 Updated Oct 27, 2025
    • Hierarchical-Reasoner

      Public
      Emergent Hierarchical Reasoning in LLMs/VLMs through Reinforcement Learning
      Python
      36100 Updated Oct 24, 2025
    • Training Coder Models with Critique Reinforcement Learning
      Python
      01300 Updated Oct 1, 2025
    • VideoEval-Pro

      Public
      More reliable Video Understanding Evaluation
      Python
      01400 Updated Sep 23, 2025
    • VisCoder

      Public
      The official code of "VisCoder: Fine-Tuning LLMs for Executable Python Visualization Code Generation" [EMNLP25]
      Python
      21700 Updated Sep 21, 2025
    • The official code of "PixelWorld: Towards Perceiving Everything as Pixels" [TMLR25]
      Python
      11610 Updated Sep 12, 2025
    • The official repo for “Unleashing the Reasoning Potential of Pre-trained LLMs by Critique Fine-Tuning on One Problem” [EMNLP25]
      Python
      53400 Updated Sep 1, 2025
    • ABC

      Public
      ABC: Achieving Better Control of Multimodal Embeddings using VLMs [TMLR2025]
      Python
      21900 Updated Aug 21, 2025
    • Vamba

      Public
      Code for the paper "Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers" [ICCV 2025]
      Python
      1110020 Updated Jul 28, 2025
    • Official Repo for "TheoremExplainAgent: Towards Video-based Multimodal Explanations for LLM Theorem Understanding" [ACL 2025 oral]
      Python
      1931.5k91 Updated Jul 27, 2025
    • Code for "Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate" [COLM 2025]
      Python
      818030 Updated Jul 8, 2025
    • ScholarCopilot

      Public
      ScholarCopilot: Training Large Language Models for Academic Writing with Accurate Citations [COLM 2025]
      Python
      3024900 Updated Jul 8, 2025
    • This repo contains the code for "MEGA-Bench Scaling Multimodal Evaluation to over 500 Real-World Tasks" [ICLR 2025]
      Python
      77850 Updated Jul 1, 2025