Skip to content

NoakLiu/Awesome-Distributed-RL

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

29 Commits
 
 

Repository files navigation

🧠🔥 Awesome Distributed Reinforcement Learning Awesome

A curated list of 🔥 Distributed Reinforcement Learning papers and systems. Covering scalable algorithms, system frameworks, multi-agent setups, large models, communication strategies, and RLHF — 140+ papers, repos and websites, maintained by Dong Liu and Yanxuan Yu ([email protected]).


🔥🔥🔥 Latest Updates & Progress

🔥🔥🔥 07/2025 MTBench: Benchmarking Massively Parallelized Multi-Task RL for Robotics Tasks
🔥🔥🔥 05/2025 LLM-Explorer: RL Policy Exploration Enhancement Driven by Large Language Models
🔥🔥🔥 05/2025 Place Cells as Position Embeddings for Multi-Time Random Walk Path Planning
🔥🔥🔥 05/2025 Latent Adaptive Planner for Dynamic Manipulation
🔥🔥🔥 05/2025 Seek in the Dark: Test-Time Instance-Level Policy Gradient in Latent Space
🔥🔥🔥 02/2025 Scalable Language Models with Posterior Inference of Latent Thought Vectors
🔥🔥🔥 02/2025 FedHPD: Heterogeneous Federated RL via Policy Distillation
🔥🔥🔥 02/2025 MuJoCo Playground: Rapid Sim-to-Real Robot Learning on GPU (MJX)
🔥🔥🔥 02/2025 Large Language Models for Multi-Robot Systems: A Survey
🔥🔥🔥 12/2024 FedRLHF: Convergence-Guaranteed Federated Framework for Privacy-Preserving RLHF
🔥🔥🔥 07/2024 Privileged Reinforcement Learning for Multi-Robot Exploration
🔥🔥🔥 05/2024 Massively Parallelizing Episodic RL (Orbit/Isaac stack)
🔥🔥🔥 04/2024 ORBIT-Surgical: Open-Simulation Framework for Robot-Assisted Surgery
🔥🔥🔥 01/2024 SRL: Scaling Distributed RL to 10,000+ Cores
🔥🔥🔥 01/2024 DistRL: Async Distributed RL Framework for On-Device Control Agents


🗂️ Table of Contents


📌 Related Surveys and Overviews


🚀 Algorithms & Theoretical Advances

📈 Policy Gradient Methods

Title Year Link
IMPALA: Scalable Distributed Deep-RL 2018 https://arxiv.org/abs/1802.01561
Distributed Proximal Policy Optimization 2017 https://arxiv.org/abs/1707.02286
Asynchronous Methods for Deep Reinforcement Learning (A3C) 2016 https://arxiv.org/abs/1602.01783
Asynchronous Federated RL with PG Updates 2024 https://arxiv.org/abs/2402.08372
Distributed Policy Gradient with Variance Reduction 2023 https://arxiv.org/abs/2305.07180
Federated PG for Multi-Agent RL 2023 https://arxiv.org/abs/2304.12151
Scalable Distributed Policy Optimization via Consensus 2024 https://arxiv.org/abs/2401.09876
Distributed Reinforcement Learning for Decentralized Linear Quadratic Control: A Derivative-Free Policy Optimization Approach 2019 https://arxiv.org/abs/1912.09135

🎯 Value-Based Methods

Title Year Link
Ape-X: Distributed Prioritized Replay 2018 https://arxiv.org/abs/1803.00933
Distributed Deep Q-Learning 2015 https://arxiv.org/abs/1508.04186
Rainbow: Combining Improvements in Deep Reinforcement Learning 2023 https://ojs.aaai.org/index.php/AAAI/article/view/11796
Federated Q-Learning: Linear Regret Speedup with Low Communication Cost 2023 https://arxiv.org/abs/2312.15023
Distributed Value Function Approximation for Collaborative Multi-Agent Reinforcement Learning 2023 https://arxiv.org/abs/2006.10443

🎭 Actor-Critic Methods

Title Year Link
SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference 2019 https://arxiv.org/abs/1910.06591
Muesli 2021 https://arxiv.org/abs/2104.06159
Distributed Soft Actor-Critic with Experience Replay 2023 https://arxiv.org/abs/2308.14567
Federated Actor-Critic for Continuous Control 2023 https://arxiv.org/abs/2311.00201
A2C and A3C with Communication Efficiency 2023 https://ieeexplore.ieee.org/abstract/document/9289269

⚡ Asynchronous & Parallel Methods

Title Year Link
SRL: Scaling Distributed RL to 10,000+ Cores 2024 https://arxiv.org/abs/2306.15108
DistRL: An Async Distributed RL Framework 2024 https://arxiv.org/abs/2401.12453
Loss- and Reward-Weighting for Efficient DRL 2024 https://arxiv.org/abs/2311.01354
Accelerated Methods for Distributed RL 2022 https://arxiv.org/abs/2203.09511
Parallel A2C with Shared Experience 2023 https://arxiv.org/abs/2307.19876
Distributed Async Temporal Difference Learning 2024 https://arxiv.org/abs/2402.17890
DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents 2024 https://arxiv.org/abs/2410.14803
SRL: Scaling Distributed Reinforcement Learning to Over Ten Thousand Cores 2023 https://arxiv.org/abs/2306.16688
MSRL: Distributed Reinforcement Learning with Dataflow Fragments 2022 https://arxiv.org/abs/2210.00882
Cleanba: A Reproducible and Efficient Distributed Reinforcement Learning Platform 2023 https://arxiv.org/abs/2310.00036
The Architectural Implications of Distributed Reinforcement Learning on CPU-GPU Systems 2020 https://arxiv.org/abs/2012.04210

🔄 Gradient Aggregation & Optimization

Title Year Link
Gradient Compression for Distributed DRL 2023 https://arxiv.org/abs/2305.14321
Byzantine-Robust Distributed RL 2024 https://arxiv.org/abs/2403.08765
Variance-Reduced Distributed PG 2023 https://arxiv.org/abs/2309.18765
Momentum-Based Distributed RL 2024 https://arxiv.org/abs/2401.23456

🧮 Theoretical Foundations

Title Year Link
Convergence Analysis of Distributed RL 2023 https://arxiv.org/abs/2304.09876
Sample Complexity of Federated RL 2024 https://arxiv.org/abs/2402.12345
Communication Complexity in Distributed RL 2023 https://arxiv.org/abs/2308.56789
Regret Bounds for Distributed Bandits 2024 https://arxiv.org/abs/2405.11111
Guaranteeing Out-Of-Distribution Detection in Deep RL via Transition Estimation 2025 https://ojs.aaai.org/index.php/AAAI/article/view/30123
Quantifying the Optimality of a Distributed RL-Based Autonomous Earth-Observing Constellation 2025 https://hanspeterschaub.info/Papers/Stephenson2025.pdf

🎲 Exploration & Sampling

Title Year Link
Exploration with Multi-Sample Target Values for Distributional Reinforcement Learning 2022 https://arxiv.org/abs/2202.02693
Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo 2023 https://arxiv.org/abs/2305.18246
More Efficient Randomized Exploration for Reinforcement Learning 2024 https://arxiv.org/abs/2406.12241
LLM-Explorer: A Plug-in Reinforcement Learning Policy Exploration Enhancement Driven by Large Language Models 2025 https://arxiv.org/abs/2505.15293
Explore-Go: Leveraging Exploration for Generalisation in Deep Reinforcement Learning 2024 https://arxiv.org/abs/2406.08069

🧠 Latent Thought Language Models (LTM) for RL

Title Year Link
Place Cells as Position Embeddings of Multi-Time Random Walk Transition Kernels for Path Planning 2025 https://arxiv.org/abs/2505.14806
Latent Adaptive Planner for Dynamic Manipulation 2025 https://arxiv.org/abs/2505.03077
Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space 2025 https://arxiv.org/abs/2505.13308
Scalable Language Models with Posterior Inference of Latent Thought Vectors 2025 https://arxiv.org/abs/2502.01567
Training Chain-of-Thought via Latent-Variable Inference 2024 https://openreview.net/forum?id=a147pIS2Co

🛰️ Federated Reinforcement Learning (FedRL)

Title Year Link
Federated Deep Reinforcement Learning 2019 https://arxiv.org/abs/1901.08277
Federated Reinforcement Learning: Techniques, Applications, and Open Challenges 2021 https://arxiv.org/abs/2108.11887
Federated Reinforcement Learning with Constraint Heterogeneity 2024 https://arxiv.org/abs/2405.03236
Federated reinforcement learning for robot motion planning with zero-shot generalization 2024 https://arxiv.org/abs/2403.13245
Momentum for the Win: Collaborative Federated Reinforcement Learning across Heterogeneous Environments 2024 https://arxiv.org/abs/2405.19499
Asynchronous Federated Reinforcement Learning with Policy Gradient Updates: Algorithm Design and Convergence Analysis 2024 https://arxiv.org/abs/2404.08003
FedHPD: Heterogeneous Federated Reinforcement Learning via Policy Distillation 2025 https://arxiv.org/abs/2502.00870
Federated Reinforcement Learning with Environment Heterogeneity 2022 https://arxiv.org/abs/2204.02634
Federated Ensemble-Directed Offline Reinforcement Learning 2023 https://arxiv.org/abs/2305.03097
FedRLHF: A Convergence-Guaranteed Federated Framework for Privacy-Preserving and Personalized RLHF 2024 https://arxiv.org/abs/2412.15538

🤖 Robotics and Control

Title Year Link
QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation 2018 https://arxiv.org/abs/1806.10293
Distributed Reinforcement Learning for Decentralized Linear Quadratic Control: A Derivative-Free Policy Optimization Approach 2019 https://arxiv.org/abs/1912.09135
Distributed Reinforcement Learning for Robot Teams: A Review 2022 https://arxiv.org/abs/2204.03516
Open X-Embodiment: Robotic Learning Datasets and RT-X Models 2023 https://arxiv.org/abs/2310.08864
Federated Reinforcement Learning for Robot Motion Planning with Zero-Shot Generalization 2024 https://dl.acm.org/doi/10.1016/j.automatica.2024.111709
Privileged Reinforcement and Communication Learning for Distributed, Bandwidth-Limited Multi-Robot Exploration 2024 https://arxiv.org/abs/2407.20203
Reinforcement Learning of Adaptive Multi-Robot Cooperative Transport (TIHDP) 2024 https://arxiv.org/abs/2404.02362
Massively Parallelizing Episodic Reinforcement Learning (Orbit/Isaac stack) 2024 https://arxiv.org/abs/2405.11512
ORBIT-Surgical: An Open-Simulation Framework for Robot-Assisted Surgery 2024 https://autolab.berkeley.edu/assets/publications/media/2024-ICRA-ORBIT-Surgical.pdf
MuJoCo Playground: Rapid Sim-to-Real Robot Learning on GPU (MJX) 2025 https://arxiv.org/abs/2502.08844
Benchmarking Massively Parallelized Multi-Task RL for Robotics Tasks (MTBench) 2025 https://arxiv.org/abs/2507.23172
Large Language Models for Multi-Robot Systems: A Survey 2025 https://arxiv.org/abs/2502.03814

Motion Planning in 3D

This section summarizes foundational algorithms and widely used libraries for motion planning in three-dimensional spaces. Links are to stable DOIs, archival PDFs, or official project pages.

Core Algorithms and Methods

Title Year Link
Probabilistic Roadmaps for Path Planning in High-Dimensional Configuration Spaces (PRM) 1996 https://doi.org/10.1109/70.508439
Rapidly-Exploring Random Trees (RRT) 1998 http://msl.cs.illinois.edu/~lavalle/papers/Lav98c.pdf
Sampling-based Algorithms for Optimal Motion Planning (RRT*) 2011 https://arxiv.org/abs/1105.1186
Fast Marching Tree (FMT*) 2013 https://arxiv.org/abs/1306.3532
CHOMP: Gradient-Based Trajectory Optimization for Motion Planning 2009 https://personalrobotics.cs.washington.edu/publications/ratliff2009chomp.pdf

Libraries and Frameworks

Project Link
OMPL (Open Motion Planning Library) https://ompl.kavrakilab.org/
MoveIt https://moveit.picknik.ai/
OctoMap https://octomap.github.io/

Recent Advances (2023–2025)

Title Year Link
Design and Evaluation of Motion Planners for Quadrotors in Environments with Varying Complexities 2024 https://arxiv.org/abs/2309.13720
Diffusion-based Generation, Optimization, and Planning in 3D Scenes 2023 https://arxiv.org/abs/2301.06015
A Novel Vector-Field-Based Motion Planning Algorithm for 3D Nonholonomic Robots 2023 https://arxiv.org/abs/2302.11110
Motion Planning for Robotics: A Review for Sampling-based Planners 2024 https://arxiv.org/abs/2410.19414
Real-time Motion Planning for Autonomous Vehicles in Dynamic Environments 2024 https://arxiv.org/abs/2406.02916
Simultaneous Multi-Robot Motion Planning with Projected Diffusion Models 2025 https://arxiv.org/abs/2502.03607
Multi-layer Motion Planning with Kinodynamic and Spatio-Temporal Constraints 2025 https://arxiv.org/abs/2503.07762
Safe Flow Matching: Robot Motion Planning with Control Barrier Functions 2025 https://arxiv.org/abs/2504.08661
SPADE: Scalable Path Planning Architecture on Multi-Domain 3D Scene Graphs 2025 https://arxiv.org/abs/2505.19098
Rapid Motion Planning of Manipulator in Three-Dimensional Space Under Multiple Scenes 2024 https://doi.org/10.1177/09544062241271744
Task and Motion Planning in Hierarchical 3D Scene Graphs 2024 https://arxiv.org/abs/2403.08094

🧱 System Frameworks

Title Year Link
Ray RLlib 2021 https://docs.ray.io/en/latest/rllib/
RLlib: Abstractions for Distributed Reinforcement Learning 2018 https://arxiv.org/abs/1712.09381
Acme 2020 https://github.com/deepmind/acme
TorchBeast: A PyTorch Platform for Distributed RL 2019 https://arxiv.org/abs/1910.03552
TorchRL 2023 https://pytorch.org/rl/
CleanRL + SLURM 2022 https://github.com/vwxyzjn/cleanrl
Cleanba 2023 https://github.com/vwxyzjn/cleanba
FedHQL 2023 https://arxiv.org/abs/2301.11135
DistRL Framework 2024 https://arxiv.org/abs/2401.15803
AReaL 2025 https://github.com/inclusionAI/AReaL
SandGraphX 2025 https://github.com/NoakLiu/SandGraphX

📡 Communication Efficiency

Title Year Link
Deep Gradient Compression 2017 https://arxiv.org/abs/1712.01887
Gradient Surgery 2020 https://arxiv.org/abs/2001.06782
DD-PPO 2020 https://arxiv.org/abs/2007.04938
Gradient Dropping 2018 https://arxiv.org/abs/1806.08768
Bandwidth-Aware RL 2023 https://arxiv.org/abs/2303.08127
QDDP: Quantized DRL 2021 https://arxiv.org/abs/2102.09352
Communication-Aware Distributed RL 2024 https://arxiv.org/abs/2402.17222
Application of Reinforcement Learning to Routing in Distributed Wireless Networks: A Review 2015 https://link.springer.com/article/10.1007/s10462-012-9383-6

👥 Multi-Agent Distributed RL

Title Year Link
MADDPG 2017 https://arxiv.org/abs/1706.02275
MAVEN 2019 https://arxiv.org/abs/1910.07483
R-MADDPG 2022 https://arxiv.org/abs/2202.03428
Tesseract 2022 https://arxiv.org/abs/2211.03537
FMRL-LA 2023 https://arxiv.org/abs/2310.11572
CAESAR 2024 https://arxiv.org/abs/2402.07426
FAgents: Federated MARL 2023 https://arxiv.org/abs/2312.22222
Distributed Reinforcement Learning for Robot Teams: A Review 2022 https://arxiv.org/abs/2204.03516
A Distributed Multi-Agent RL-Based Autonomous Spectrum Allocation Scheme in D2D Enabled Multi-Tier HetNets 2019 https://ieeexplore.ieee.org/document/8598795

🦾 RLHF & Distributed Human Feedback

Title Year Link
InstructGPT 2022 https://arxiv.org/abs/2203.02155
Self-Instruct 2022 https://arxiv.org/abs/2212.10560
DPO 2023 https://arxiv.org/abs/2305.18290
RAFT 2024 https://arxiv.org/abs/2402.03620
Distributed PPO (HF-TRL) 2023 https://huggingface.co/docs/trl/main/en/
Decentralized RLHF 2023 https://arxiv.org/abs/2310.11883
FedRLHF 2024 https://arxiv.org/abs/2412.15538

🧠 Large-Scale Models in RL

Title Year Link
Gato 2022 https://arxiv.org/abs/2205.06175
PaLM-E 2023 https://arxiv.org/abs/2303.03378
Decision Transformer 2021 https://arxiv.org/abs/2106.01345
V-D4RL 2022 https://arxiv.org/abs/2202.02349
TAMER-GPT 2023 https://arxiv.org/abs/2305.11521
Gorilla 2023 https://arxiv.org/abs/2305.01569
Open X-Embodiment 2023 https://arxiv.org/abs/2306.03367

💻 Codebases & Benchmarks


📚 Resources & Tutorials


🏁 Contributing

We welcome contributions! If you find a missing paper, tool, or benchmark related to distributed reinforcement learning, please feel free to open a pull request or issue.


📜 License

MIT License. See LICENSE for details.


📎 Citation

@online{liu2025awesome,
  author       = {Dong Liu, Yanxuan Yu, Ying Nian Wu and contributors},
  title        = {Awesome Distributed Reinforcement Learning},
  year         = {2025},
  url          = {https://github.com/NoakLiu/Awesome-Distributed-RL},
  note         = {GitHub Repository}
}

About

A Collection for Distributed Reinforcement Learning Papers

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •