Releases: opendilab/LightZero
v0.2.0
Environment
- Add Metadrive environment and its configurations (#192)
- Add Sampled MuZero/UniZero and DMC environment with related configurations (#260)
- Polish Chess environment and its render method; add unit tests and configurations (#272)
- Add Jericho environment and its related configurations (#307)
Algorithm
- Add Harmony Dream loss balance in MuZero (#242)
- Adapt AlphaZero to non-zero-sum games (#245)
- Add AlphaZero CTree unittest (#306)
- Add recent MCTS-related papers (#324)
- Introduce RoPE to use the true timestep index as pos_index (#266)
- Add Jericho DDP configuration (#337)
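The RoPE change in #266 rotates features by an angle derived from the true environment timestep rather than the position inside the current context window. A minimal NumPy sketch of the idea — the function name, shapes, and API here are illustrative assumptions, not LightZero's actual implementation:

```python
import numpy as np

def rope(x: np.ndarray, timesteps: np.ndarray, base: float = 10000.0) -> np.ndarray:
    """Apply rotary position embeddings to x of shape (seq_len, dim).

    `timesteps` carries the *true* environment timestep per token, so the
    rotation angle stays consistent across context windows (hypothetical
    signature; LightZero's real code differs).
    """
    seq_len, dim = x.shape
    half = dim // 2
    # Per-channel inverse frequencies, as in the original RoPE formulation.
    inv_freq = base ** (-np.arange(half) / half)        # (half,)
    angles = np.outer(timesteps, inv_freq)              # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # Standard 2D rotation applied pair-wise; preserves each row's norm.
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=1)
```

Because the rotation is norm-preserving, swapping window-relative positions for true timesteps changes only the phases, not the magnitudes, of the embedded features.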
Enhancement
- Add LightZero Sphinx documentation (#237)
- Add Wandb support (#294)
- Add Atari100k metric utilities (#295)
- Add eval_benchmark tests (#296)
- Include save_replay and collect_episode_data options in Jericho (#333)
- Add an MCTS TicTacToe demo in a single file (#315)
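The single-file TicTacToe demo in #315 packages a complete MCTS loop. Below is an independent, minimal UCT sketch of the same idea in pure Python — not the repository's code, just the standard select/expand/simulate/backpropagate cycle:

```python
import math
import random

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def legal_moves(board):
    return [i for i, cell in enumerate(board) if cell == ' ']

class Node:
    def __init__(self, board, player):
        self.board = board      # 9-character string
        self.player = player    # player to move at this node
        self.children = {}      # move index -> Node
        self.visits = 0
        self.value = 0.0        # wins for the player who moved INTO this node

def rollout(board, player):
    """Play uniformly random moves to the end; return the winner or None."""
    board = list(board)
    while winner(board) is None and legal_moves(board):
        board[random.choice(legal_moves(board))] = player
        player = 'O' if player == 'X' else 'X'
    return winner(board)

def mcts(root, n_sim=1000, c=1.4):
    """Run UCT simulations from root and return the most-visited move."""
    for _ in range(n_sim):
        node, path = root, [root]
        # Selection: descend through fully expanded, non-terminal nodes via UCB1.
        while (winner(node.board) is None and legal_moves(node.board)
               and len(node.children) == len(legal_moves(node.board))):
            node = max(node.children.values(),
                       key=lambda ch: ch.value / ch.visits
                       + c * math.sqrt(math.log(node.visits) / ch.visits))
            path.append(node)
        # Expansion: add one untried move if the game is not over.
        if winner(node.board) is None and legal_moves(node.board):
            move = random.choice([m for m in legal_moves(node.board)
                                  if m not in node.children])
            board = node.board[:move] + node.player + node.board[move + 1:]
            child = Node(board, 'O' if node.player == 'X' else 'X')
            node.children[move] = child
            node = child
            path.append(node)
        # Simulation + backpropagation.
        result = winner(node.board) or rollout(node.board, node.player)
        for n in path:
            n.visits += 1
            if result is None:
                n.value += 0.5
            elif result != n.player:  # the OTHER player moved into n and won
                n.value += 1.0
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]
```

On a position where X can win immediately, the most-visited root move converges to the winning square after a modest number of simulations.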
Polish
- Polish efficiency and performance on Atari and DMC (#292)
- Update requirements (#298)
- Optimize reward/value/policy_head_hidden_channels (#314)
- Update configuration and log instructions in tutorials (#330)
Fix
- Fix DownSample issues for different observation shapes (#254)
- Fix the wrong chance values in Stochastic MuZero (#275)
- Use display_frames_as_gif in CartPole (#288)
- Fix the chance encoder in stochastic_muzero_model_mlp.py (#284)
- Correct typo in model/utils.py (#290)
- Fix SMZ compile_args and num_simulations bug in world_model (#297)
- Fix reward type bug in 2048 and OS import issue in CartPole (#304)
- Switch to macos-13 in the CI action (#319)
- Fix SMZ & SEZ config for pixel-based DMC (#322)
- Fix update_per_collect in DDP setting (#321)
- Fix bug with obs_shape tuple in initialize_zeros_batch (#327)
- Fix prepare_obs_stack_for_unizero issue (#328)
- Fix random_policy when len(ready_env_id) < collector_env_num (#335)
- Fix timestep compatibility issues (#339)
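The obs_shape fix in #327 concerns building a zero-filled observation batch when the shape is a tuple (e.g. stacked image frames) rather than a flat int. A simplified stand-in illustrating the distinction — not LightZero's actual `initialize_zeros_batch`:

```python
import numpy as np

def initialize_zeros_batch(obs_shape, batch_size):
    """Return a zero batch of shape (batch_size, *obs_shape).

    obs_shape may be an int (flat vector observations) or a tuple/list
    (e.g. (4, 96, 96) for stacked frames); treating a tuple like an int
    would raise or silently build the wrong shape.
    """
    if isinstance(obs_shape, (tuple, list)):
        shape = (batch_size, *obs_shape)
    else:
        shape = (batch_size, obs_shape)
    return np.zeros(shape, dtype=np.float32)
```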
Full Changelog: v0.1.0...v0.2.0
Contributors: @ruiheng123 @TuTuHuss @HarryXuancy @ShivamKumar2002 @Roland0511 @cmarlin @xiongjyu @PaParaZz1 @puyuan1996
v0.1.0
Environment
- Add SumToThree environment from pooltool (#227)
Enhancement
- Add logging and configuration documentation (#220)
- Polish atari_env_action_space_map and fix test_muzero_game_buffer
- Polish release.yml
Style
- Update Discord link and add a new badge in README (#221)
Full Changelog: v0.0.5...v0.1.0
Contributors: @ekiefl @TuTuHuss @HarryXuancy @PaParaZz1 @puyuan1996
v0.0.5
Algorithm
- Add Gumbel AlphaZero in ctree (#212)
Enhancement
- Add eval_offline option (#188)
- Save the updated searched policy and value to the buffer during reanalyze (#190)
- Add MuZero visualization (#181)
- Add EfficientZero TicTacToe configs (#204)
- Add two MCTS-related ICLR 2024 papers
- Add load-pretrained-model option in test_game_segment (#194)
- Polish _forward_learn() and some data processing operations (#191)
Fix
- Fix sync_gradients and logging in DDP settings (#200)
- Fix channel_last bug
- Fix total_episode_count bug in collector
- Fix memory_lightzero_env return bug
- Fix obs_max_scale bug in memory_env
Style
- Add ZeroPal and Discord link (#209)
- Add unit test for game_buffer_muzero (#186)
- Add customization documentation section in README
Full Changelog: v0.0.4...v0.0.5
Contributors: @karroyan @HarryXuancy @nighood @puyuan1996
v0.0.4
Enhancement
- Add agent configurations and polish replay video saving method (#184)
- Polish comments in worker files
- Polish comments in tree search files (#185)
- Rename mcts_mode to battle_mode_in_simulation_env; add Sampled AlphaZero config for TicTacToe (#179)
- Polish redundant data squeeze operations (#177)
- Polish the continuous action processing in the SEZ model
- Polish BipedalWalker env
Fix
- Fix completed-value inf bug when zeros exist in action_mask in Gumbel MuZero (#178)
- Fix render settings when using gymnasium (#173)
- Fix lstm_hidden_size in sampled_efficientzero_model.py
- Fix action_mask in bipedalwalker_cont_disc_env and device bug in Sampled EfficientZero (#168)
Full Changelog: v0.0.3...v0.0.4
Contributors: @karroyan @HarryXuancy @puyuan1996 @zjowowen
v0.0.3
Enhancement
- Add ctree version of MCTS in AlphaZero (#142)
- Replace the gym dependency with gymnasium (#150)
- Add agent class to support LightZero's HuggingFace Model Zoo (#163)
- Add recent MCTS-related papers in README (#159)
- Add MuZero config for Connect4 (#107)
- Add CONTRIBUTING.md (#119)
- Add .gitpod.yml and .gitpod.Dockerfile (#123)
- Add contributors subsection in README (#132)
- Add CODE_OF_CONDUCT.md (#127)
- Polish comments and render_eval configs for various common envs (#154) (#161)
- Polish action_type and env_type; fix test.yml and unittest (#160)
- Update env and algo tutorial doc (#106)
- Polish Gomoku env (#141)
- Add random_policy support for continuous envs (#118)
- Polish simulation method of ptree_az (#120)
- Polish comments of game_segment_to_array
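The random_policy support for continuous envs (#118) amounts to sampling uniformly within the action space's bounds instead of from a discrete set. A generic sketch for a Box-style space — the helper name and signature are assumptions, not LightZero's API:

```python
import numpy as np

def random_continuous_action(low, high, rng=None):
    """Sample a uniform action within per-dimension [low, high] bounds,
    as a random policy for a continuous (Box-style) action space."""
    if rng is None:
        rng = np.random.default_rng()
    low = np.asarray(low, dtype=np.float32)
    high = np.asarray(high, dtype=np.float32)
    # Generator.uniform broadcasts array-valued bounds element-wise.
    return rng.uniform(low, high).astype(np.float32)
```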
Fix
- Fix render method for various common envs (#154) (#161)
- Fix Gumbel MuZero collector bug and a Gumbel typo (#144)
- Fix assert bug in game_segment.py (#138)
- Fix visit_count_distributions name in muzero_evaluator
- Fix MCTS and alpha-beta bot unittest (#120)
- Fix typos in ptree_mz.py (#113)
- Fix root_sampled_actions_tmp shape bug in SEZ ptree
- Fix policy utils unittest
- Fix typos in README and add a 'back to top' button (#104) (#109) (#111)
Style
- Add NeurIPS 2023 paper link
News
- NeurIPS 2023 Spotlight: LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios
Full Changelog: v0.0.2...v0.0.3
Contributors: @PaParaZz1 @karroyan @nighood @jayyoung0802 @timothijoe @TuTuHuss @HarryXuancy @puyuan1996 @HansBug @mohitd404 @PentesterPriyanshu @0Armaan025 @prajjwalyd @suravshresth @sohamtembhurne @eltociear
v0.0.2
Enhancement
- Polish MCTS and ptree_az (#57) (#61)
- Polish README (#36) (#47) (#51) (#77) (#95) (#96)
- Update paper notes (#89) (#91)
- Polish model and configs (#26) (#27) (#50)
- Add Dockerfile and its usage instructions (#95)
- Add doc on how to customize envs and algos (#78)
- Add PyTorch DDP support (#68)
- Add epsilon-greedy and random-collect options in train_muzero_entry (#54)
- Add Atari visualization option (#40)
- Add log_buffer_memory_usage utils (#30)
Fix
- Fix priority bug in MuZero collector (#74)
Full Changelog: v0.0.1...v0.0.2
Contributors: @PaParaZz1 @karroyan @nighood @jayyoung0802 @timothijoe @TuTuHuss @HarryXuancy @puyuan1996 @HansBug
v0.0.1
Full Changelog: https://github.com/opendilab/LightZero/commits/v0.0.1