- [Power Mean Estimation in Stochastic Continuous Monte-Carlo Tree Search](https://icml.cc/virtual/2025/poster/45596) 2025
- Tuan Quang Dam
- Key: Continuous MCTS, Polynomial Exploration, Stochastic Environments, Power Mean
- ExpEnv: Continuous Cartpole, Inverted Pendulum
- [Provably Efficient Long-Horizon Exploration in Monte Carlo Tree Search through State Occupancy Regularization](https://arxiv.org/abs/2407.05511) 2024
- Liam Schramm, Abdeslam Boularias
- Key: Exploration, State Occupancy, Long-horizon planning, Volume-MCTS
- [DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search](https://openreview.net/forum?id=I4YAIwrsXa) 2025
- [PromptAgent: Strategic Planning with Large Language Models Enables Expert-Level Prompt Optimization](https://openreview.net/forum?id=22pyNMuIoa) 2024
@@ -633,7 +677,28 @@ Here is a collection of research papers about **Monte Carlo Tree Search**.
- Matthew Faw, Rajat Sen, Karthikeyan Shanmugam, Constantine Caramanis, Sanjay Shakkottai
- Key: covariate shift problem, Mix&Match combines stochastic gradient descent (SGD) with optimistic tree search and model re-use (evolving partially trained models with samples from different mixture distributions)
- [Code](https://github.com/matthewfaw/mixnmatch)
- [Feedback-Aware MCTS for Goal-Oriented Information Seeking](https://openreview.net/pdf?id=ustF8MMZDJ) 2025
- Harmanpreet Chopra, Chirag Shah
- Key: Conversational AI, Goal-Oriented Information Seeking, MCTS, LLM
- ExpEnv: 20 Questions, GuessWhat?, MutualFriends
- [MCTS-Transfer: Monte Carlo Tree Search based Space Transfer for Black-box Optimization](https://openreview.net/forum?id=T5UfIfmDbq) 2024
- Shukuan Wang, Ke Xue, Lei Song, Xiaobin Huang, Chao Qian
- Key: Black-box Optimization, Transfer Learning, MCTS, Search Space Transfer
- [Generating Code World Models with Large Language Models Guided by Monte Carlo Tree Search](https://proceedings.neurips.cc/paper_files/paper/2024/file/6f479ea488e0908ac8b1b37b27fd134c-Paper-Conference.pdf) 2024
- Nicola Dainese, Matteo Merler, Minttu Alakuijala, Pekka Marttinen
- Key: Code Generation, World Models, MCTS, Model-based Planning
- ExpEnv: CWMB (Code World Models Benchmark), Crafter
- [ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search](https://openreview.net/forum?id=8rcFOqEud5) 2024
- Dan Zhang, Sining Zhoubian, Ziniu Hu, Yisong Yue, Yuxiao Dong, Jie Tang
- Key: LLM Self-training, Process Reward, Reasoning, CoT
- ExpEnv: GSM8K, MATH
- [Code](https://github.com/THUDM/ReST-MCTS)

#### Other Conference or Journal
- [Learning to Stop: Dynamic Simulation Monte-Carlo Tree Search](https://arxiv.org/pdf/2012.07910.pdf) AAAI 2021.
- [On Monte Carlo Tree Search and Reinforcement Learning](https://www.jair.org/index.php/jair/article/download/11099/26289/20632) Journal of Artificial Intelligence Research 2017.