Commit ff55dc8

Update README with new research papers
Added new research papers and their details related to Monte Carlo Tree Search and Reinforcement Learning.
1 parent 247b621 commit ff55dc8

1 file changed: +66 -1 lines changed

README.md

Lines changed: 66 additions & 1 deletion
@@ -495,6 +495,32 @@ Here is a collection of research papers about **Monte Carlo Tree Search**.
 - Key: chemical retrosynthetic planning, neural-based A*-like algorithm, ANDOR tree
 - ExpEnv: USPTO datasets
 - [Code](https://github.com/binghong-ml/retro_star)
+
+
+
+- [Monte Carlo Tree Diffusion for System 2 Planning](https://proceedings.mlr.press/v267/yoon25a.html) 2025
+- Jaesik Yoon, Hyeonseo Cho, Doojin Baek, Yoshua Bengio, Sungjin Ahn
+- Key: Diffusion Models, MCTS, System 2 Planning, Trajectory Optimization
+- ExpEnv: Maze2D, Kitchen, Block stacking
+- [Code](https://github.com/ahn-ml/mctd)
+- [Monte-Carlo Tree Search with Uncertainty Propagation via Optimal Transport](https://openreview.net/forum?id=DUGFTH9W8B) 2025
+- Tuan Quang Dam, Pascal Stenger, Lukas Schneider, Joni Pajarinen, Carlo D’Eramo, Odalric-Ambrym Maillard
+- Key: Optimal Transport, Wasserstein Distance, Uncertainty Propagation, MCTS
+- ExpEnv: FrozenLake, NChain, RiverSwim, SixArms, Taxi, Rocksample
+- [Online Robust Reinforcement Learning Through Monte-Carlo Planning](https://openreview.net/forum?id=m25ma7O7Ec) 2025
+- Tuan Quang Dam, Kishan Panaganti, Brahim Driss, Adam Wierman
+- Key: Robust RL, MCTS, Distributionally Robust Optimization, Sim-to-Real
+- ExpEnv: Gambler’s Problem, Frozen Lake, American Option Pricing
+- [Code](https://github.com/brahimdriss/RobustMCTS)
+- [Power Mean Estimation in Stochastic Continuous Monte-Carlo Tree Search](https://icml.cc/virtual/2025/poster/45596) 2025
+- Tuan Quang Dam
+- Key: Continuous MCTS, Polynomial Exploration, Stochastic Environments, Power Mean
+- ExpEnv: Continuous Cartpole, Inverted Pendulum
+- [Provably Efficient Long-Horizon Exploration in Monte Carlo Tree Search through State Occupancy Regularization](https://arxiv.org/abs/2407.05511) 2024
+- Liam Schramm, Abdeslam Boularias
+- Key: Exploration, State Occupancy, Long-horizon planning, Volume-MCTS
+- ExpEnv: Robot Navigation, 2D Maze
+- [Code](https://github.com/schrammlb2/Volume-MCTS-ICML)
 #### ICLR
 - [OptionZero: Planning with Learned Options](https://openreview.net/forum?id=3IFRygQKGL) 2025
 - Po-Wei Huang, Pei-Chiun Peng, Hung Guei, Ti-Rong Wu
@@ -567,6 +593,24 @@ Here is a collection of research papers about **Monte Carlo Tree Search**.
 - Binghong Chen, Bo Dai, Qinjie Lin, Guo Ye, Han Liu, Le Song
 - Key: meta path planning algorithm, exploits a novel neural architecture which can learn promising search directions from problem structures.
 - ExpEnv: a 2d workspace with a 2 DoF (degrees of freedom) point robot, a 3 DoF stick robot and a 5 DoF snake robot
+- [Epistemic Monte Carlo Tree Search](https://openreview.net/forum?id=Tb8RiXOc3N) 2025
+- Wendelin Boehmer, Zheng Shen, Haoran Duan, Chengzhi Mao, Rosario Scalise
+- Key: MCTS, Epistemic Uncertainty, Exploration, Sparse Reward, Model-based RL
+- ExpEnv: Deep Sea, SUBLEQ (Assembly language)
+- [DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search](https://openreview.net/forum?id=I4YAIwrsXa) 2025
+- DeepSeek Prover Team
+- Key: Automated Theorem Proving, LLM, MCTS, RL from Proof Assistant Feedback (RLPAF), RMaxTS
+- ExpEnv: Lean 4, miniF2F, ProofNet
+- [Code](https://github.com/deepseek-ai/DeepSeek-Prover-V1.5)
+- [Bayes Adaptive Monte Carlo Tree Search for Offline Model-based Reinforcement Learning](https://openreview.net/forum?id=RGjqr1jBJy) 2025
+- Lucas Niu Janson, et al.
+- Key: Offline RL, Model-based RL, Bayes-Adaptive MDP, Uncertainty Propagation
+- ExpEnv: D4RL
+- [PromptAgent: Strategic Planning with Large Language Models Enables Expert-Level Prompt Optimization](https://openreview.net/forum?id=22pyNMuIoa) 2024
+- Zhutian Yang, et al.
+- Key: Prompt Optimization, Strategic Planning, MCTS, LLM Agent
+- ExpEnv: BIG-Bench Hard (BBH), MMLU, HellaSwag
+- [Code](https://github.com/zhutianyang/PromptAgent)
 #### NeurIPS
 - [LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios](https://openreview.net/pdf?id=oIUXpBnyjv) 2023
 - Yazhe Niu, Yuan Pu, Zhenjie Yang, Xueyan Li, Tong Zhou, Jiyuan Ren, Shuai Hu, Hongsheng Li, Yu Liu
@@ -633,7 +677,28 @@ Here is a collection of research papers about **Monte Carlo Tree Search**.
 - Matthew Faw, Rajat Sen, Karthikeyan Shanmugam, Constantine Caramanis, Sanjay Shakkottai
 - Key: covariate shift problem, Mix&Match combines stochastic gradient descent (SGD) with optimistic tree search and model re-use (evolving partially trained models with samples from different mixture distributions)
 - [Code](https://github.com/matthewfaw/mixnmatch)
-
+- [Feedback-Aware MCTS for Goal-Oriented Information Seeking](https://openreview.net/pdf?id=ustF8MMZDJ) 2025
+- Harmanpreet Chopra, Chirag Shah
+- Key: Conversational AI, Goal-Oriented Information Seeking, MCTS, LLM
+- ExpEnv: 20 Questions, GuessWhat?, MutualFriends
+- [MCTS-Transfer: Monte Carlo Tree Search based Space Transfer for Black-box Optimization](https://openreview.net/forum?id=T5UfIfmDbq) 2024
+- Shukuan Wang, Ke Xue, Lei Song, Xiaobin Huang, Chao Qian
+- Key: Black-box Optimization, Transfer Learning, MCTS, Search Space Transfer
+- ExpEnv: Synthetic functions (Ackley, etc.), Design-Bench, Hyper-parameter optimization
+- [Code](https://github.com/lamda-bbo/mcts-transfer)
+- [Speculative Monte-Carlo Tree Search](https://proceedings.neurips.cc/paper_files/paper/2024/file/a19940b01b77b6acd41ff8b32b334e7c-Paper-Conference.pdf) 2024
+- Jungwoo Park, David Wu, Kellin Pelrine, Jimmy Wei, Thomas Anthony, Julian Schrittwieser, Junwhan Ahn
+- Key: Efficiency, Speculative Execution, Parallelism, AlphaZero
+- ExpEnv: Go (9x9, 19x19)
+- [Generating Code World Models with Large Language Models Guided by Monte Carlo Tree Search](https://proceedings.neurips.cc/paper_files/paper/2024/file/6f479ea488e0908ac8b1b37b27fd134c-Paper-Conference.pdf) 2024
+- Nicola Dainese, Matteo Merler, Minttu Alakuijala, Pekka Marttinen
+- Key: Code Generation, World Models, MCTS, Model-based Planning
+- ExpEnv: CWMB (Code World Models Benchmark), Crafter
+- [ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search](https://openreview.net/forum?id=8rcFOqEud5) 2024
+- Dan Zhang, Sining Zhoubian, Ziniu Hu, Yisong Yue, Yuxiao Dong, Jie Tang
+- Key: LLM Self-training, Process Reward, Reasoning, CoT
+- ExpEnv: GSM8K, MATH
+- [Code](https://github.com/THUDM/ReST-MCTS)
 #### Other Conference or Journal
 - [Learning to Stop: Dynamic Simulation Monte-Carlo Tree Search](https://arxiv.org/pdf/2012.07910.pdf) AAAI 2021.
 - [On Monte Carlo Tree Search and Reinforcement Learning](https://www.jair.org/index.php/jair/article/download/11099/26289/20632) Journal of Artificial Intelligence Research 2017.