World-Models-Autonomous-Driving-Latest-Survey

A curated list of world models for autonomous driving. Kept up to date.

Announcement

Besides the wonderful papers listed below, we are happy to announce that our group, the NYU Learning Systems Laboratory, recently released a preprint titled AD-L-JEPA: Self-Supervised Spatial World Models with Joint Embedding Predictive Architecture for Autonomous Driving with LiDAR Data — the first joint-embedding predictive architecture (JEPA) based spatial world model for self-supervised representation learning in autonomous driving. Source code is available at AD-L-JEPA-Release. If this paper inspires you, please consider citing it via:

@article{zhu2025ad,
  title={AD-L-JEPA: Self-Supervised Spatial World Models with Joint Embedding Predictive Architecture for Autonomous Driving with LiDAR Data},
  author={Zhu, Haoran and Dong, Zhenyuan and Topollai, Kristi and Choromanska, Anna},
  journal={arXiv preprint arXiv:2501.04969},
  year={2025}
}
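For readers new to the JEPA idea behind AD-L-JEPA, the core training step can be sketched in a few lines: a context encoder and a predictor are trained to regress the output of a slowly moving (EMA-updated) target encoder in embedding space, with no pixel- or point-level reconstruction. The toy NumPy sketch below uses plain linear encoders; all names, shapes, and hyperparameters are illustrative assumptions and are not taken from the AD-L-JEPA code.

```python
# Toy sketch of one joint-embedding predictive (JEPA-style) update.
# Linear "encoders" stand in for real networks; shapes are arbitrary.
import numpy as np

rng = np.random.default_rng(0)

D_IN, D_EMB = 32, 16                              # input / embedding dims
W_ctx = rng.normal(size=(D_IN, D_EMB)) * 0.1      # context encoder (trainable)
W_tgt = W_ctx.copy()                              # target encoder (EMA copy)
W_pred = rng.normal(size=(D_EMB, D_EMB)) * 0.1    # predictor (trainable)

def jepa_step(x_context, x_target, lr=0.05, ema=0.99):
    """Predict the target view's embedding from the context view's
    embedding and regress in latent space (no reconstruction)."""
    global W_ctx, W_pred
    z_ctx = x_context @ W_ctx        # embed the visible (context) view
    z_tgt = x_target @ W_tgt         # embed the target view; stop-gradient
    z_hat = z_ctx @ W_pred           # predict the target embedding
    err = z_hat - z_tgt
    loss = float(np.mean(err ** 2))

    # Manual MSE gradients; the target encoder gets no gradient and is
    # instead updated by an exponential moving average of the context encoder.
    scale = 2.0 / err.size
    g_pred = z_ctx.T @ (scale * err)
    g_ctx = x_context.T @ ((scale * err) @ W_pred.T)
    W_pred = W_pred - lr * g_pred
    W_ctx = W_ctx - lr * g_ctx
    W_tgt[:] = ema * W_tgt + (1.0 - ema) * W_ctx
    return loss

x = rng.normal(size=(8, D_IN))
losses = [jepa_step(x, x) for _ in range(300)]
# the latent-space prediction error shrinks as the predictor adapts
```

In a full system the context/target views would be different masked regions of the same scene (e.g. BEV LiDAR embeddings), and additional machinery prevents representation collapse; this sketch only illustrates the predict-in-embedding-space objective.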

Leading Researchers

Yann LeCun, Danijar Hafner, Chuang Gan, Yilun Du, Nicklas Hansen

Papers

2025

NeurIPS 2025

  • DINO-Foresight: Looking into the Future with DINO NeurIPS 2025; VFM; Paper, Code
  • FutureSightDrive: Thinking Visually with Spatio-Temporal CoT for Autonomous Driving NeurIPS 2025; VLM; Paper, Code
  • Raw2Drive: Reinforcement Learning with Aligned World Models for End-to-End Autonomous Driving (in CARLA v2) NeurIPS 2025; End-to-End AD; RL; Paper
  • Towards foundational LiDAR world models with efficient latent flow matching NeurIPS 2025; Generative AI; Transfer Learning; Paper, Website
  • Orbis: Overcoming Challenges of Long-Horizon Prediction in Driving World Models NeurIPS 2025; Generative AI; Paper, Website
  • Genesis: Multimodal Driving Scene Generation with Spatio-Temporal and Cross-Modal Consistency NeurIPS 2025; Generative AI; Multi-Modal; Paper, Website, Code to be released

ICCV 2025

  • World4Drive: End-to-End Autonomous Driving via Intention-aware Physical Latent World Model ICCV 2025; Paper, Code

ICML 2025

  • DriveGPT: Scaling Autoregressive Behavior Models for Driving ICML 2025; Paper, Demo

CVPR 2025

  • GEM: A Generalizable Ego-Vision Multimodal World Model for Fine-Grained Ego-Motion, Object Dynamics, and Scene Composition Control CVPR 2025; Generative AI; Paper, Code to be released
  • FUTURIST: Advancing Semantic Future Prediction through Multimodal Visual Sequence Transformers CVPR 2025; Paper, Code
  • DIO: Decomposable Implicit 4D Occupancy-Flow World Model CVPR 2025 Paper

ICLR 2025

  • LAW: Enhancing End-to-End Autonomous Driving with Latent World Model ICLR 2025; End-to-End AD; Paper, Code
  • PreWorld: Semi-Supervised Vision-Centric 3D Occupancy World Model for Autonomous Driving ICLR 2025; Occupancy Forecasting; Motion Planning; Paper, Code
  • AdaWM: Adaptive World Model based Planning for Autonomous Driving ICLR 2025; RL; Planning; Paper
  • SSR: Navigation-Guided Sparse Scene Representation for End-to-End Autonomous Driving ICLR 2025; End-to-End AD; Paper, Code
  • OccProphet: Pushing Efficiency Frontier of Camera-Only 4D Occupancy Forecasting with Observer-Forecaster-Refiner Framework ICLR 2025; Occupancy Forecasting; Paper, Code to be released

AAAI 2025

  • DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation AAAI 2025; Generative AI; LLM; Paper, Website, Code
  • Drive-OccWorld: Driving in the Occupancy World: Vision-Centric 4D Occupancy Forecasting and Planning via World Models for Autonomous Driving AAAI 2025; Occupancy Forecasting; Planning; Paper, Website, Code

RSS 2025

  • LOPR: Self-supervised Multi-future Occupancy Forecasting for Autonomous Driving RSS 2025; Paper

Others

  • Back to the Features: DINO as a Foundation for Video World Models Paper
  • IntPhys 2: Benchmarking Intuitive Physics Understanding In Complex Synthetic Environments Paper, Code
  • Genie Envisioner: A Unified World Foundation Platform for Robotic Manipulation Paper, Website
  • Genie 3: A new frontier for world models Website
  • DriVerse: Navigation World Model for Driving Simulation via Multimodal Trajectory Prompting and Motion Alignment arxiv April; Generative AI; Paper, Code
  • Learning to Drive from a World Model arxiv April; Paper
  • WoTE: End-to-End Driving with Online Trajectory Evaluation via BEV World Model arxiv April; Paper, Code
  • AETHER: Geometric-Aware Unified World Modeling arxiv March; Paper, Website
  • GAIA-2: A Controllable Multi-View Generative World Model for Autonomous Driving Generative AI; Paper
  • Other Vehicle Trajectories Are Also Needed: A Driving World Model Unifies Ego-Other Vehicle Trajectories in Video Latent Space arxiv March; Generative AI; Paper
  • $T^3$Former: Temporal Triplane Transformers as Occupancy World Models arxiv March; Occupancy Forecasting; Paper
  • InDRiVE: Intrinsic Disagreement-based Reinforcement for Vehicle Exploration through Curiosity-Driven Generalized World Model arxiv March; RL; Paper
  • PIWM: Dream to Drive with Predictive Individual World Model TIV 2025; RL; Paper, Code
  • MaskGWM: A Generalizable Driving World Model with Video Mask Reconstruction arxiv; Generative AI; Paper, Code
  • Dream to Drive: Model-Based Vehicle Control Using Analytic World Models arxiv; Planning; Paper
  • HERMES: A Unified Self-Driving World Model for Simultaneous 3D Scene Understanding and Generation arxiv; Generative AI; LLM; Paper, Code to be released
  • AD-L-JEPA: Self-Supervised Spatial World Models with Joint Embedding Predictive Architecture for Autonomous Driving with LiDAR Data. arxiv; Pre-training; Self-supervised representation learning; Paper, Code
  • Cosmos World Foundation Model Platform for Physical AI arxiv; Foundation Model; Paper, Code

2024

NeurIPS 2024

  • DrivingDojo Dataset: Advancing Interactive and Knowledge-Enriched Driving World Model NeurIPS 2024; Dataset; Paper, Website, Code
  • Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability NeurIPS 2024; from Shanghai AI Lab; Generative AI; Paper, Website, Code

ECCV 2024

  • DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving ECCV 2024; Generative AI; Paper, Website, Code
  • Modelling Competitive Behaviors in Autonomous Driving Under Generative World Model ECCV 2024; RL; Trajectories Simulation; Paper, Code to be released
  • NeMo: Neural Volumetric World Models for Autonomous Driving ECCV 2024; End-to-End AD; Motion Planning; Paper
  • OccWorld: Learning a 3D Occupancy World Model for Autonomous Driving ECCV 2024; Occupancy Forecasting; Motion Planning; Paper, Code
  • Think2Drive: Efficient Reinforcement Learning by Thinking with Latent World Model for Autonomous Driving (in CARLA-v2) ECCV 2024; RL; Paper, Website
  • FipTR: A Simple yet Effective Transformer Framework for Future Instance Prediction in Autonomous Driving ECCV 2024; Future Instance Prediction; Paper, Code
  • DrivingDiffusion: Layout-Guided multi-view driving scene video generation with latent diffusion model ECCV 2024; Generative AI; Paper, Code

CVPR 2024

  • Drive-WM: Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving CVPR 2024; Generative AI; Planning; Paper, Website, Code
  • DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving CVPR 2024; Pre-training; Paper
  • Cam4DOcc: Benchmark for Camera-Only 4D Occupancy Forecasting in Autonomous Driving Applications CVPR 2024; Occupancy Forecasting; Paper, Code
  • GenAD: Generalized Predictive Model for Autonomous Driving CVPR 2024; from Shanghai AI Lab; Generative AI; Paper, Code
  • ViDAR: Visual Point Cloud Forecasting enables Scalable Autonomous Driving CVPR 2024; Pre-training; from Shanghai AI Lab; NuScenes dataset; Paper, Code
  • UnO: Unsupervised Occupancy Fields for Perception and Forecasting CVPR 2024; Occupancy Forecasting; Pre-training; Paper

ICLR 2024

  • Copilot4D: Learning Unsupervised World Models for Autonomous Driving via Discrete Diffusion ICLR 2024; Future Point Cloud Prediction; from Waabi; Paper

ICRA 2024

  • Mitigating Covariate Shift in Imitation Learning for Autonomous Vehicles Using Latent Space Generative World Models ICRA 2024; Planning; Paper

Others

  • InfinityDrive: Breaking Time Limits in Driving World Models arxiv 2024; Generative AI; Paper, Website
  • DriveDreamer4D: World Models Are Effective Data Machines for 4D Driving Scene Representation arxiv 2024; Generative AI; 4D Simulation; Paper, Website, Code
  • ReconDreamer: Crafting World Models for Driving Scene Reconstruction via Online Restoration arxiv 2024; Generative AI; 4D Simulation; Paper, Website, Code
  • DrivingWorld: Constructing World Model for Autonomous Driving via Video GPT arxiv 2024; Paper, Project Page, Code
  • DOME: Taming Diffusion Model into High-Fidelity Controllable Occupancy World Model arxiv 2024; Paper, Project Page
  • OccLLaMA: An Occupancy-Language-Action Generative World Model for Autonomous Driving arxiv 2024; Paper
  • BEVWorld: A Multimodal World Model for Autonomous Driving via Unified BEV Latent Space arxiv 2024; Paper
  • Planning with Adaptive World Models for Autonomous Driving arxiv 2024; Planning; Paper
  • OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving arxiv 2024; Paper, Code

Before 2023

  • 2023-ADriver-I: A General World Model for Autonomous Driving arxiv; Generative AI; NuScenes & one private dataset; Paper
  • 2023-GAIA-1: A Generative World Model for Autonomous Driving arxiv; Generative AI; Wayve's private data; Paper
  • 2023-Neural World Models for Computer Vision PhD Thesis; from Wayve; Paper
  • 2022-Separating the World and Ego Models for Self-Driving ICLR 2022 Workshop on Generalizable Policy Learning in the Physical World; from Yann LeCun's Group; Paper, Code
  • 2022-SEM2: Enhance Sample Efficiency and Robustness of End-to-end Urban Autonomous Driving via Semantic Masked World Model NeurIPS 2022 Deep Reinforcement Learning Workshop; RL; CARLA dataset; Paper
  • 2022-MILE: Model-Based Imitation Learning for Urban Driving NeurIPS 2022; RL; from Wayve; Paper, Code
  • 2022-Iso-Dream: Isolating and Leveraging Noncontrollable Visual Dynamics in World Models NeurIPS 2022; Paper, Code
  • 2021-FIERY: Future Instance Prediction in Bird's-Eye View from Surround Monocular Cameras ICCV 2021; Future Prediction; from Wayve; NuScenes, Lyft datasets; Paper, Code
  • 2021-Learning to drive from a world on rails CVPR 2021 Oral; RL; Paper, Project Page, Code
  • 2019-Model-Predictive Policy Learning with Uncertainty Regularization for Driving in Dense Traffic ICLR 2019; Future Prediction; from Yann LeCun's Group; Paper, Code

Workshops/Challenges

  • 2024-1X World Model Challenge Challenges Link
  • 2024-CVPR Workshop, Foundation Models for Autonomous Systems, Challenges, Track 4: Predictive World Model Challenges Link

Tutorials/Talks

  • 2023 from Wayve; Video
  • 2022-Neural World Models for Autonomous Driving Video

Surveys that Contain World Models for AD

  • 2025-A Survey of World Models for Autonomous Driving arxiv Paper
  • 2024-World Models for Autonomous Driving: An Initial Survey arxiv Paper
  • 2024-Data-Centric Evolution in Autonomous Driving: A Comprehensive Survey of Big Data System, Data Mining, and Closed-Loop Technologies arxiv Paper
  • 2024-Forging Vision Foundation Models for Autonomous Driving: Challenges, Methodologies, and Opportunities arxiv Paper

Other General World Model Papers

  • 2025-Dreamer 4: Training Agents Inside of Scalable World Models arxiv Paper
  • 2025-TAWM: Time-Aware World Model for Adaptive Prediction and Control ICML 2025 Paper, Code
  • 2025-What Has a Foundation Model Found? Using Inductive Bias to Probe for World Models ICML 2025 Paper
  • 2025-Critiques of World Models Paper
  • 2025-DREAMGEN: Unlocking Generalization in Robot Learning through Video World Models from Nvidia Paper, Code
  • 2025-V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning from Meta Paper, Code
  • 2025-UniVLA: Learning to Act Anywhere with Task-centric Latent Actions arxiv 2025 Paper, Code
  • 2025-Learning 3D Persistent Embodied World Models arxiv 2025 Paper
  • 2025-AdaWorld: Learning Adaptable World Models with Latent Actions ICML 2025 Paper
  • 2025-DreamerV3: Mastering diverse control tasks through world models Nature Paper, Code
  • 2025-PhysTwin: Physics-Informed Reconstruction and Simulation of Deformable Objects from Videos Paper, Code
  • 2025-Intuitive physics understanding emerges from self-supervised pretraining on natural videos Paper, Code
  • 2025-Do generative video models learn physical principles from watching videos? Paper, Code, Website
  • 2024-PreLAR: World Model Pre-training with Learnable Action Representation ECCV 2024; Pretraining; RL; Paper, Code
  • 2024-Understanding Physical Dynamics with Counterfactual World Modeling ECCV 2024; Paper, Website, Code
  • 2024-Genie2: Website
  • 2024-WHALE: Towards Generalizable and Scalable World Models for Embodied Decision-making Paper
  • 2024-How Far is Video Generation from World Model: A Physical Law Perspective Paper
  • 2024-PIVOT-R: Primitive-Driven Waypoint-Aware World Model for Robotic Manipulation NeurIPS 2024 Paper
  • 2024-RoboDreamer: Learning Compositional World Models for Robot Imagination Paper
  • 2024-TD-MPC2: Scalable, Robust World Models for Continuous Control ICLR 2024 Paper
  • 2024-Hierarchical World Models as Visual Whole-Body Humanoid Controllers Paper
  • 2024-Efficient World Models with Time-Aware and Context-Augmented Tokenization ICML 2024
  • 2024-3D-VLA: A 3D Vision-Language-Action Generative World Model ICML 2024 Paper
  • 2024-Newton from Archetype AI website Link
  • 2024-MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators arxiv Paper, Code
  • 2024-IWM: Learning and Leveraging World Models in Visual Representation Learning arxiv; from Yann LeCun's Group; Paper
  • 2024-Video as the New Language for Real-World Decision Making arxiv; from DeepMind; Paper
  • 2024-Genie: Generative Interactive Environments from DeepMind; Paper, Website
  • 2024-Sora OpenAI, Generative AI Link, Technical Report
  • 2024-LWM: World Model on Million-Length Video And Language With RingAttention arxiv; Generative AI Paper, Code
  • 2024-WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens arxiv; Generative AI Paper
  • 2024-Video prediction models as rewards for reinforcement learning NeurIPS 2024 Paper, Code
  • 2024-V-JEPA: Revisiting Feature Prediction for Learning Visual Representations from Video from Yann LeCun's Group; Paper, Code
  • 2023-STORM: Efficient Stochastic Transformer based World Models for Reinforcement Learning NeurIPS 2023 Paper, Code
  • 2023-Facing Off World Model Backbones: RNNs, Transformers, and S4 NeurIPS 2023 Paper
  • 2023-I-JEPA: Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture CVPR 2023; from Yann LeCun's Group; Paper, Code
  • 2023-Temporally Consistent Transformers for Video Generation ICML 2023 Paper, Code
  • 2023-Learning to Model the World with Language arxiv Paper, Code
  • 2023-Transformers are sample-efficient world models ICLR 2023; RL; Paper, Code
  • 2023-Gradient-based Planning with World Models arxiv; from Yann LeCun's Group; Planning; Paper
  • 2023-World Models via Policy-Guided Trajectory Diffusion arxiv; RL; Paper
  • 2023-DreamerV3: Mastering diverse domains through world models arxiv; RL; Paper, Code
  • 2022-Daydreamer: World models for physical robot learning CoRL 2022; Robotics Paper, Code
  • 2022-Masked World Models for Visual Control CoRL 2022; Robotics Paper, Code
  • 2022-A Path Towards Autonomous Machine Intelligence openreview; from Yann LeCun's Group; General Roadmap for World Models; Paper; Slides1, Slides2, Slides3; Videos
  • 2021-LEXA: Discovering and Achieving Goals via World Models NeurIPS 2021; Paper, Website & Code
  • 2021-DreamerV2: Mastering Atari with Discrete World Models ICLR 2021; RL; from Google & DeepMind; Paper, Code
  • 2020-Dreamer: Dream to Control: Learning Behaviors by Latent Imagination ICLR 2020 Paper, Code
  • 2019-Learning Latent Dynamics for Planning from Pixels ICML 2019 Paper, Code
  • 2018-Model-Based Planning with Discrete and Continuous Actions arxiv; RL, Planning; from Yann LeCun's Group; Paper
  • 2018-Recurrent world models facilitate policy evolution NeurIPS 2018; Paper, Code

Other Related Papers

  • 2023-Occupancy Prediction-Guided Neural Planner for Autonomous Driving ITSC 2023; Planning, Neural Predicted-Guided Planning; Waymo Open Motion dataset Paper

Other Related Repos

Awesome-World-Model, Awesome-World-Models-for-AD, World models paper list from Shanghai AI Lab, Awesome-Papers-World-Models-Autonomous-Driving.
