A curated list of world models for autonomous driving. Kept updated.
Besides the wonderful papers listed below, we are very happy to announce that our group, the NYU Learning Systems Laboratory, recently released a preprint titled AD-L-JEPA: Self-Supervised Spatial World Models with Joint Embedding Predictive Architecture for Autonomous Driving with LiDAR Data, the first joint-embedding predictive architecture (JEPA) based spatial world model for self-supervised representation learning in autonomous driving. Source code is available at AD-L-JEPA-Release. If this paper inspires you, you may consider citing it via:
@article{zhu2025ad,
title={AD-L-JEPA: Self-Supervised Spatial World Models with Joint Embedding Predictive Architecture for Autonomous Driving with LiDAR Data},
author={Zhu, Haoran and Dong, Zhenyuan and Topollai, Kristi and Choromanska, Anna},
journal={arXiv preprint arXiv:2501.04969},
year={2025}
}

Researchers whose work is worth following in this area include Yann LeCun, Danijar Hafner, Chuang Gan, Yilun Du, and Nicklas Hansen.
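The JEPA idea behind AD-L-JEPA can be summarized as predicting the embedding of a masked or future part of the scene from the embedding of the visible context, rather than reconstructing raw inputs. Below is a minimal, illustrative sketch of one such training step in plain NumPy; the linear encoders, shapes, and the least-squares "optimizer step" are toy stand-ins, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "BEV feature" split into a visible context and a masked target region.
context = rng.normal(size=(4, 16))   # 4 visible patches, 16-dim features
target = rng.normal(size=(4, 16))    # 4 masked patches to predict in latent space

# Linear stand-ins for the context encoder, target encoder, and predictor.
W_ctx = rng.normal(scale=0.1, size=(16, 8))
W_tgt = W_ctx.copy()                 # target encoder starts as a copy (EMA teacher)
W_pred = rng.normal(scale=0.1, size=(8, 8))

def jepa_loss(W_ctx, W_pred, W_tgt, context, target):
    z_ctx = context @ W_ctx          # embed the visible context
    z_pred = z_ctx @ W_pred          # predict embeddings of the masked region
    z_tgt = target @ W_tgt           # teacher embeds the masked region (no grad in practice)
    return float(np.mean((z_pred - z_tgt) ** 2))

# Gradient-free illustration: loss before vs. after moving the predictor
# toward the least-squares solution (a stand-in for an optimizer step).
loss_before = jepa_loss(W_ctx, W_pred, W_tgt, context, target)
z_ctx = context @ W_ctx
W_pred_new, *_ = np.linalg.lstsq(z_ctx, target @ W_tgt, rcond=None)
loss_after = jepa_loss(W_ctx, W_pred_new, W_tgt, context, target)
assert loss_after < loss_before      # predicting in latent space got easier

# EMA update of the target encoder, as in BYOL/JEPA-style training.
tau = 0.99
W_tgt = tau * W_tgt + (1 - tau) * W_ctx
```

The key design choice this illustrates: the loss lives entirely in embedding space, so the model never has to render raw LiDAR or pixels, which is what makes JEPA-style pre-training cheap relative to generative world models.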
- DINO-Foresight: Looking into the Future with DINO
NeurIPS 2025;VFM; Paper, Code
- FutureSightDrive: Thinking Visually with Spatio-Temporal CoT for Autonomous Driving
NeurIPS 2025;VLM; Paper, Code
- Raw2Drive: Reinforcement Learning with Aligned World Models for End-to-End Autonomous Driving (in CARLA v2)
NeurIPS 2025;End-to-End AD;RL; Paper
- Towards Foundational LiDAR World Models with Efficient Latent Flow Matching
NeurIPS 2025;Generative AI;Transfer Learning; Paper, Website
- Orbis: Overcoming Challenges of Long-Horizon Prediction in Driving World Models
NeurIPS 2025;Generative AI; Paper, Website
- Genesis: Multimodal Driving Scene Generation with Spatio-Temporal and Cross-Modal Consistency
NeurIPS 2025;Generative AI;Multi-Modal; Paper, Website, Code to be released
- World4Drive: End-to-End Autonomous Driving via Intention-aware Physical Latent World Model
ICCV 2025; Paper, Code
- GEM: A Generalizable Ego-Vision Multimodal World Model for Fine-Grained Ego-Motion, Object Dynamics, and Scene Composition Control
CVPR 2025;Generative AI; Paper, Code to be released
- FUTURIST: Advancing Semantic Future Prediction through Multimodal Visual Sequence Transformers
CVPR 2025; Paper, Code
- DIO: Decomposable Implicit 4D Occupancy-Flow World Model
CVPR 2025; Paper
- LAW: Enhancing End-to-End Autonomous Driving with Latent World Model
ICLR 2025;End-to-End AD; Paper, Code
- PreWorld: Semi-Supervised Vision-Centric 3D Occupancy World Model for Autonomous Driving
ICLR 2025;Occupancy Forecasting;Motion Planning; Paper, Code
- AdaWM: Adaptive World Model based Planning for Autonomous Driving
ICLR 2025;RL;Planning; Paper
- SSR: Navigation-Guided Sparse Scene Representation for End-to-End Autonomous Driving
ICLR 2025;End-to-End AD; Paper, Code
- OccProphet: Pushing Efficiency Frontier of Camera-Only 4D Occupancy Forecasting with Observer-Forecaster-Refiner Framework
ICLR 2025;Occupancy Forecasting; Paper, Code to be released
- DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation
AAAI 2025;Generative AI;LLM; Paper, Website, Code
- Drive-OccWorld: Driving in the Occupancy World: Vision-Centric 4D Occupancy Forecasting and Planning via World Models for Autonomous Driving
AAAI 2025;Occupancy Forecasting;Planning; Paper, Website, Code
- LOPR: Self-supervised Multi-future Occupancy Forecasting for Autonomous Driving
RSS 2025; Paper
- Back to the Features: DINO as a Foundation for Video World Models Paper
- IntPhys 2: Benchmarking Intuitive Physics Understanding In Complex Synthetic Environments Paper, Code
- Genie Envisioner: A Unified World Foundation Platform for Robotic Manipulation Paper, Website
- Genie 3: A new frontier for world models Website
- DriVerse: Navigation World Model for Driving Simulation via Multimodal Trajectory Prompting and Motion Alignment
arxiv April;Generative AI; Paper, Code
- Learning to Drive from a World Model
arxiv April; Paper
- WoTE: End-to-End Driving with Online Trajectory Evaluation via BEV World Model
arxiv April; Paper, Code
- AETHER: Geometric-Aware Unified World Modeling
arxiv March; Paper, Website
- GAIA-2: A Controllable Multi-View Generative World Model for Autonomous Driving
Generative AI; Paper
- Other Vehicle Trajectories Are Also Needed: A Driving World Model Unifies Ego-Other Vehicle Trajectories in Video Latent Space
arxiv March;Generative AI; Paper
- $T^3$Former: Temporal Triplane Transformers as Occupancy World Models
arxiv March;Occupancy Forecasting; Paper
- InDRiVE: Intrinsic Disagreement-based Reinforcement for Vehicle Exploration through Curiosity-Driven Generalized World Model
arxiv March;RL; Paper
- PIWM: Dream to Drive with Predictive Individual World Model
TIV 2025;RL; Paper, Code
- MaskGWM: A Generalizable Driving World Model with Video Mask Reconstruction
arxiv;Generative AI; Paper, Code
- Dream to Drive: Model-Based Vehicle Control Using Analytic World Models
arxiv;Planning; Paper
- HERMES: A Unified Self-Driving World Model for Simultaneous 3D Scene Understanding and Generation
arxiv;Generative AI;LLM; Paper, Code to be released
- AD-L-JEPA: Self-Supervised Spatial World Models with Joint Embedding Predictive Architecture for Autonomous Driving with LiDAR Data
arxiv;Pre-training;Self-supervised representation learning; Paper, Code
- Cosmos World Foundation Model Platform for Physical AI
arxiv;Foundation Model; Paper, Code
- DrivingDojo Dataset: Advancing Interactive and Knowledge-Enriched Driving World Model
NeurIPS 2024;Dataset; Paper, Website, Code
- Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability
NeurIPS 2024;from Shanghai AI Lab;Generative AI; Paper, Website, Code
- DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving
ECCV 2024;Generative AI; Paper, Website, Code
- Modelling Competitive Behaviors in Autonomous Driving Under Generative World Model
ECCV 2024;RL;Trajectories Simulation; Paper, Code to be released
- NeMo: Neural Volumetric World Models for Autonomous Driving
ECCV 2024;End-to-End AD;Motion Planning; Paper
- OccWorld: Learning a 3D Occupancy World Model for Autonomous Driving
ECCV 2024;Occupancy Forecasting;Motion Planning; Paper, Code
- Think2Drive: Efficient Reinforcement Learning by Thinking with Latent World Model for Autonomous Driving (in CARLA-v2)
ECCV 2024;RL; Paper, Website
- FipTR: A Simple yet Effective Transformer Framework for Future Instance Prediction in Autonomous Driving
ECCV 2024;Future Instance Prediction; Paper, Code
- DrivingDiffusion: Layout-Guided Multi-View Driving Scene Video Generation with Latent Diffusion Model
ECCV 2024;Generative AI; Paper, Code
- Drive-WM: Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving
CVPR 2024;Generative AI;Planning; Paper, Website, Code
- DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving
CVPR 2024;Pre-training; Paper
- Cam4DOcc: Benchmark for Camera-Only 4D Occupancy Forecasting in Autonomous Driving Applications
CVPR 2024;Occupancy Forecasting; Paper, Code
- GenAD: Generalized Predictive Model for Autonomous Driving
CVPR 2024;from Shanghai AI Lab;Generative AI; Paper, Code
- ViDAR: Visual Point Cloud Forecasting enables Scalable Autonomous Driving
CVPR 2024;Pre-training;from Shanghai AI Lab;NuScenes dataset; Paper, Code
- UnO: Unsupervised Occupancy Fields for Perception and Forecasting
CVPR 2024;Occupancy Forecasting;Pre-training; Paper
- Copilot4D: Learning Unsupervised World Models for Autonomous Driving via Discrete Diffusion
ICLR 2024;Future Point Cloud Prediction;from Waabi; Paper
- Mitigating Covariate Shift in Imitation Learning for Autonomous Vehicles Using Latent Space Generative World Models
ICRA 2024;Planning; Paper
- InfinityDrive: Breaking Time Limits in Driving World Models
arxiv 2024;Generative AI; Paper, Website
- DriveDreamer4D: World Models Are Effective Data Machines for 4D Driving Scene Representation
arxiv 2024;Generative AI;4D Simulation; Paper, Website, Code
- ReconDreamer: Crafting World Models for Driving Scene Reconstruction via Online Restoration
arxiv 2024;Generative AI;4D Simulation; Paper, Website, Code
- 2024-DrivingWorld: Constructing World Model for Autonomous Driving via Video GPT. Paper, Project Page, Code
- 2024-DOME: Taming Diffusion Model into High-Fidelity Controllable Occupancy World Model. Paper, Project Page
- 2024-OccLLaMA: An Occupancy-Language-Action Generative World Model for Autonomous Driving Paper
- 2024-BEVWorld: A Multimodal World Model for Autonomous Driving via Unified BEV Latent Space
arxiv; Paper
- 2024-Planning with Adaptive World Models for Autonomous Driving
arxiv;Planning; Paper
- 2024-OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving Paper, Code
- 2023-ADriver-I: A General World Model for Autonomous Driving
arxiv;Generative AI;NuScenes & one private dataset; Paper
- 2023-GAIA-1: A Generative World Model for Autonomous Driving
arxiv;Generative AI;Wayve's private data; Paper
- 2023-Neural World Models for Computer Vision (PhD Thesis)
from Wayve; Paper
- 2022-Separating the World and Ego Models for Self-Driving
ICLR 2022 workshop on Generalizable Policy Learning in the Physical World;from Yann LeCun's Group; Paper, Code
- 2022-SEM2: Enhance Sample Efficiency and Robustness of End-to-end Urban Autonomous Driving via Semantic Masked World Model
NeurIPS 2022 Deep Reinforcement Learning Workshop;RL;CARLA dataset; Paper
- 2022-MILE: Model-Based Imitation Learning for Urban Driving
NeurIPS 2022;RL;from Wayve; Paper, Code
- 2022-Iso-Dream: Isolating and Leveraging Noncontrollable Visual Dynamics in World Models
NeurIPS 2022; Paper, Code
- 2021-FIERY: Future Instance Prediction in Bird's-Eye View from Surround Monocular Cameras
ICCV 2021;Future Prediction;from Wayve;NuScenes, Lyft datasets; Paper, Code
- 2021-Learning to Drive from a World on Rails
CVPR 2021 Oral;RL; Paper, Project Page, Code
- 2019-Model-Predictive Policy Learning with Uncertainty Regularization for Driving in Dense Traffic
ICLR 2019;Future Prediction;from Yann LeCun's Group; Paper, Code
- 2024-1X World Model Challenge
Challenges; Link
- 2024-CVPR Workshop, Foundation Models for Autonomous Systems, Challenges, Track 4: Predictive World Model
Challenges; Link
- 2025-A Survey of World Models for Autonomous Driving
arxiv; Paper
- 2024-World Models for Autonomous Driving: An Initial Survey
arxiv; Paper
- 2024-Data-Centric Evolution in Autonomous Driving: A Comprehensive Survey of Big Data System, Data Mining, and Closed-Loop Technologies
arxiv; Paper
- 2024-Forging Vision Foundation Models for Autonomous Driving: Challenges, Methodologies, and Opportunities
arxiv; Paper
- 2025-Dreamer 4: Training Agents Inside of Scalable World Models
arxiv; Paper
- 2025-TAWM: Time-Aware World Model for Adaptive Prediction and Control
ICML 2025; Paper, Code
- 2025-What Has a Foundation Model Found? Using Inductive Bias to Probe for World Models
ICML 2025; Paper
- 2025-Critiques of World Models Paper
- 2025-DREAMGEN: Unlocking Generalization in Robot Learning through Video World Models
from Nvidia; Paper, Code
- 2025-V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning
from Meta; Paper, Code
- 2025-UniVLA: Learning to Act Anywhere with Task-centric Latent Actions
arxiv 2025; Paper, Code
- 2025-Learning 3D Persistent Embodied World Models
arxiv 2025; Paper
- 2025-AdaWorld: Learning Adaptable World Models with Latent Actions
ICML 2025; Paper
- 2025-DreamerV3: Mastering diverse control tasks through world models
Nature; Paper, Code
- 2025-PhysTwin: Physics-Informed Reconstruction and Simulation of Deformable Objects from Videos Paper, Code
- 2025-Intuitive physics understanding emerges from self-supervised pretraining on natural videos Paper, Code
- 2025-Do generative video models learn physical principles from watching videos? Paper, Code, Website
- 2024-PreLAR: World Model Pre-training with Learnable Action Representation
ECCV 2024;Pretraining;RL; Paper, Code
- 2024-Understanding Physical Dynamics with Counterfactual World Modeling
ECCV 2024; Paper, Website, Code
- 2024-Genie2: Website
- 2024-WHALE: Towards Generalizable and Scalable World Models for Embodied Decision-making Paper
- 2024-How Far is Video Generation from World Model: A Physical Law Perspective Paper
- 2024-PIVOT-R: Primitive-Driven Waypoint-Aware World Model for Robotic Manipulation
NeurIPS 2024; Paper
- 2024-RoboDreamer: Learning Compositional World Models for Robot Imagination Paper
- 2024-TD-MPC2: Scalable, Robust World Models for Continuous Control
ICLR 2024; Paper
- 2024-Hierarchical World Models as Visual Whole-Body Humanoid Controllers Paper
- 2024-Efficient World Models with Time-Aware and Context-Augmented Tokenization
ICML 2024
- 2024-3D-VLA: A 3D Vision-Language-Action Generative World Model
ICML 2024; Paper
- 2024-Newton from Archetype AI
Website; Link
- 2024-MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
arxiv; Paper, Code
- 2024-IWM: Learning and Leveraging World Models in Visual Representation Learning
arxiv;from Yann LeCun's Group; Paper
- 2024-Video as the New Language for Real-World Decision Making
arxiv;Deepmind; Paper
- 2024-Genie: Generative Interactive Environments
Deepmind; Paper, Website
- 2024-Sora
OpenAI;Generative AI; Link, Technical Report
- 2024-LWM: World Model on Million-Length Video And Language With RingAttention
arxiv;Generative AI; Paper, Code
- 2024-WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens
arxiv;Generative AI; Paper
- 2024-Video prediction models as rewards for reinforcement learning
NeurIPS 2024; Paper, Code
- 2024-V-JEPA: Revisiting Feature Prediction for Learning Visual Representations from Video
from Yann LeCun's Group; Paper, Code
- 2023-STORM: Efficient Stochastic Transformer based World Models for Reinforcement Learning
NeurIPS 2023; Paper, Code
- 2023-Facing Off World Model Backbones: RNNs, Transformers, and S4
NeurIPS 2023; Paper
- 2023-I-JEPA: Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture
CVPR 2023;from Yann LeCun's Group; Paper, Code
- 2023-Temporally Consistent Transformers for Video Generation
ICML 2023; Paper, Code
- 2023-Learning to Model the World with Language
arxiv; Paper, Code
- 2023-Transformers are sample-efficient world models
ICLR 2023;RL; Paper, Code
- 2023-Gradient-based Planning with World Models
arxiv;from Yann LeCun's Group;Planning; Paper
- 2023-World Models via Policy-Guided Trajectory Diffusion
arxiv;RL; Paper
- 2023-DreamerV3: Mastering diverse domains through world models
arxiv;RL; Paper, Code
- 2022-Daydreamer: World models for physical robot learning
CoRL 2022;Robotics; Paper, Code
- 2022-Masked World Models for Visual Control
CoRL 2022;Robotics; Paper, Code
- 2022-A Path Towards Autonomous Machine Intelligence
openreview;from Yann LeCun's Group;General Roadmap for World Models; Paper; Slides1, Slides2, Slides3; Videos
- 2021-LEXA: Discovering and Achieving Goals via World Models
NeurIPS 2021; Paper, Website & Code
- 2021-DreamerV2: Mastering Atari with Discrete World Models
ICLR 2021;RL;from Google & Deepmind; Paper, Code
- 2020-Dreamer: Dream to Control: Learning Behaviors by Latent Imagination
ICLR 2020; Paper, Code
- 2019-Learning Latent Dynamics for Planning from Pixels
ICML 2019; Paper, Code
- 2018-Model-Based Planning with Discrete and Continuous Actions
arxiv;RL, Planning;from Yann LeCun's Group; Paper
- 2018-Recurrent world models facilitate policy evolution
NeurIPS 2018; Paper, Code
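Many of the classic entries above (the Dreamer line, the 2018 recurrent world models paper) share one recipe: encode the observation into a latent state, learn a transition model in that latent space, and roll the model forward to "imagine" futures that a planner or policy can score without touching the real environment. A toy deterministic version of that imagination loop (plain NumPy; the linear dynamics, reward head, and all names are illustrative, not any paper's actual model):

```python
import numpy as np

rng = np.random.default_rng(1)

LATENT, ACTION = 6, 2
# Toy learned components: observation encoder, latent dynamics, reward head.
W_enc = rng.normal(scale=0.3, size=(10, LATENT))   # 10-dim obs -> latent state
A = np.eye(LATENT) * 0.9                           # latent transition (stable)
B = rng.normal(scale=0.3, size=(ACTION, LATENT))   # how actions move the latent
w_reward = rng.normal(size=LATENT)                 # latent -> predicted reward

def imagine(obs, actions):
    """Roll the world model forward in latent space; return predicted rewards."""
    z = obs @ W_enc                                # encode once; never decode
    rewards = []
    for a in actions:
        z = z @ A + a @ B                          # imagined next latent state
        rewards.append(float(z @ w_reward))        # reward predicted from latent
    return rewards

obs = rng.normal(size=10)
plan_a = [np.array([1.0, 0.0])] * 5                # candidate action sequence 1
plan_b = [np.array([0.0, 1.0])] * 5                # candidate action sequence 2

# Model-predictive control in miniature: pick the plan with the higher
# imagined return. Real systems score thousands of sampled plans this way.
best = max([plan_a, plan_b], key=lambda p: sum(imagine(obs, p)))
```

The point of the sketch is the separation of concerns: once the dynamics live in latent space, planning is cheap array arithmetic, which is why papers like Think2Drive and AdaWM above can afford to search over candidate trajectories online.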
- 2023-Occupancy Prediction-Guided Neural Planner for Autonomous Driving
ITSC 2023;Planning;Neural Prediction-Guided Planning;Waymo Open Motion dataset; Paper
Related paper lists: Awesome-World-Model, Awesome-World-Models-for-AD, World models paper list from Shanghai AI lab, and Awesome-Papers-World-Models-Autonomous-Driving.