Releases: opendilab/LightZero
v0.2.0
Environment
- Add Metadrive environment and its configurations (#192)
- Add Sampled MuZero/UniZero and DMC environment with related configurations (#260)
- Polish Chess environment and its render method; add unit tests and configurations (#272)
- Add Jericho environment and its related configurations (#307)
Algorithm
- Add Harmony Dream loss balance in MuZero (#242)
- Adapt AlphaZero to non-zero-sum games (#245)
- Add AlphaZero CTree unittest (#306)
- Add recent MCTS-related papers (#324)
- Introduce RoPE to use the true timestep index as pos_index (#266)
- Add Jericho DDP configuration (#337)
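The RoPE change in #266 rotates features by an angle derived from the true environment timestep rather than the position inside the current context window. A minimal NumPy sketch of the idea — the function name, shapes, and API here are illustrative assumptions, not LightZero's actual implementation:

```python
import numpy as np

def rope(x: np.ndarray, timesteps: np.ndarray, base: float = 10000.0) -> np.ndarray:
    """Apply rotary position embeddings to x of shape (seq_len, dim).

    `timesteps` carries the *true* environment timestep per token, so the
    rotation angle stays consistent across context windows (hypothetical
    signature; LightZero's real code differs).
    """
    seq_len, dim = x.shape
    half = dim // 2
    # Per-channel inverse frequencies, as in the original RoPE formulation.
    inv_freq = base ** (-np.arange(half) / half)        # (half,)
    angles = np.outer(timesteps, inv_freq)              # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # Standard 2D rotation applied pair-wise; preserves each row's norm.
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=1)
```

Because the rotation is norm-preserving, swapping window-relative positions for true timesteps changes only the phases, not the magnitudes, of the embedded features.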
Enhancement
- Add LightZero Sphinx documentation (#237)
- Add Wandb support (#294)
- Add Atari100k metric utilities (#295)
- Add eval_benchmark tests (#296)
- Include save_replay and collect_episode_data options in Jericho (#333)
- Add an MCTS TicTacToe demo in a single file (#315)
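The single-file TicTacToe demo in #315 packages a complete MCTS loop. Below is an independent, minimal UCT sketch of the same idea in pure Python — not the repository's code, just the standard select/expand/simulate/backpropagate cycle:

```python
import math
import random

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def legal_moves(board):
    return [i for i, cell in enumerate(board) if cell == ' ']

class Node:
    def __init__(self, board, player):
        self.board = board      # 9-character string
        self.player = player    # player to move at this node
        self.children = {}      # move index -> Node
        self.visits = 0
        self.value = 0.0        # wins for the player who moved INTO this node

def rollout(board, player):
    """Play uniformly random moves to the end; return the winner or None."""
    board = list(board)
    while winner(board) is None and legal_moves(board):
        board[random.choice(legal_moves(board))] = player
        player = 'O' if player == 'X' else 'X'
    return winner(board)

def mcts(root, n_sim=1000, c=1.4):
    """Run UCT simulations from root and return the most-visited move."""
    for _ in range(n_sim):
        node, path = root, [root]
        # Selection: descend through fully expanded, non-terminal nodes via UCB1.
        while (winner(node.board) is None and legal_moves(node.board)
               and len(node.children) == len(legal_moves(node.board))):
            node = max(node.children.values(),
                       key=lambda ch: ch.value / ch.visits
                       + c * math.sqrt(math.log(node.visits) / ch.visits))
            path.append(node)
        # Expansion: add one untried move if the game is not over.
        if winner(node.board) is None and legal_moves(node.board):
            move = random.choice([m for m in legal_moves(node.board)
                                  if m not in node.children])
            board = node.board[:move] + node.player + node.board[move + 1:]
            child = Node(board, 'O' if node.player == 'X' else 'X')
            node.children[move] = child
            node = child
            path.append(node)
        # Simulation + backpropagation.
        result = winner(node.board) or rollout(node.board, node.player)
        for n in path:
            n.visits += 1
            if result is None:
                n.value += 0.5
            elif result != n.player:  # the OTHER player moved into n and won
                n.value += 1.0
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]
```

On a position where X can win immediately, the most-visited root move converges to the winning square after a modest number of simulations.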
Polish
- Polish efficiency and performance on Atari and DMC (#292)
- Update requirements (#298)
- Optimize reward/value/policy_head_hidden_channels (#314)
- Update configuration and log instructions in tutorials (#330)
Fix
- Fix DownSample issues for different observation shapes (#254)
- Fix the wrong chance values in Stochastic MuZero (#275)
- Use display_frames_as_gif in CartPole (#288)
- Fix the chance encoder in stochastic_muzero_model_mlp.py (#284)
- Correct typo in model/utils.py (#290)
- Fix SMZ compile_args and num_simulations bug in world_model (#297)
- Fix reward type bug in 2048 and OS import issue in CartPole (#304)
- Switch to macos-13 in the CI action (#319)
- Fix SMZ & SEZ config for pixel-based DMC (#322)
- Fix update_per_collect in DDP setting (#321)
- Fix bug with obs_shape tuple in initialize_zeros_batch (#327)
- Fix prepare_obs_stack_for_unizero issue (#328)
- Fix random_policy when len(ready_env_id) < collector_env_num (#335)
- Fix timestep compatibility issues (#339)
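The obs_shape fix in #327 concerns building a zero-filled observation batch when the shape is a tuple (e.g. stacked image frames) rather than a flat int. A simplified stand-in illustrating the distinction — not LightZero's actual `initialize_zeros_batch`:

```python
import numpy as np

def initialize_zeros_batch(obs_shape, batch_size):
    """Return a zero batch of shape (batch_size, *obs_shape).

    obs_shape may be an int (flat vector observations) or a tuple/list
    (e.g. (4, 96, 96) for stacked frames); treating a tuple like an int
    would raise or silently build the wrong shape.
    """
    if isinstance(obs_shape, (tuple, list)):
        shape = (batch_size, *obs_shape)
    else:
        shape = (batch_size, obs_shape)
    return np.zeros(shape, dtype=np.float32)
```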
Full Changelog: v0.1.0...v0.2.0
Contributors: @ruiheng123 @TuTuHuss @HarryXuancy @ShivamKumar2002 @Roland0511 @cmarlin @xiongjyu @PaParaZz1 @puyuan1996
v0.1.0
Environment
- Add SumToThree environment from pooltool (#227)
Enhancement
- Add logging and configuration documentation (#220)
- Polish atari_env_action_space_map and fix test_muzero_game_buffer
- Polish release.yml
Style
- Update Discord link and add a new badge in README (#221)
Full Changelog: v0.0.5...v0.1.0
Contributors: @ekiefl @TuTuHuss @HarryXuancy @PaParaZz1 @puyuan1996
v0.0.5
Algorithm
- Add Gumbel AlphaZero in ctree (#212)
Enhancement
- Add eval_offline option (#188)
- Save the updated searched policy and value to the buffer during reanalyze (#190)
- Add MuZero visualization (#181)
- Add EfficientZero TicTacToe configs (#204)
- Add two MCTS-related ICLR 2024 papers
- Add load-pretrained-model option in test_game_segment (#194)
- Polish _forward_learn() and some data processing operations (#191)
Fix
- Fix sync_gradients and logging in DDP settings (#200)
- Fix channel_last bug
- Fix total_episode_count bug in collector
- Fix memory_lightzero_env return bug
- Fix obs_max_scale bug in memory_env
Style
- Add ZeroPal and Discord link (#209)
- Add unit test for game_buffer_muzero (#186)
- Add customization documentation section in README
Full Changelog: v0.0.4...v0.0.5
Contributors: @karroyan @HarryXuancy @nighood @puyuan1996
v0.0.4
Enhancement
- Add agent configurations and polish replay video saving method (#184)
- Polish comments in worker files
- Polish comments in tree search files (#185)
- Rename mcts_mode to battle_mode_in_simulation_env; add Sampled AlphaZero config for TicTacToe (#179)
- Polish redundant data squeeze operations (#177)
- Polish the continuous action processing in the SEZ model
- Polish BipedalWalker env
Fix
- Fix completed-value inf bug when zeros exist in action_mask in Gumbel MuZero (#178)
- Fix render settings when using gymnasium (#173)
- Fix lstm_hidden_size in sampled_efficientzero_model.py
- Fix action_mask in bipedalwalker_cont_disc_env and device bug in Sampled EfficientZero (#168)
Full Changelog: v0.0.3...v0.0.4
Contributors: @karroyan @HarryXuancy @puyuan1996 @zjowowen
v0.0.3
Enhancement
- Add ctree version of MCTS in AlphaZero (#142)
- Replace the gym dependency with gymnasium (#150)
- Add agent class to support LightZero's HuggingFace Model Zoo (#163)
- Add recent MCTS-related papers in README (#159)
- Add MuZero config for Connect4 (#107)
- Add CONTRIBUTING.md (#119)
- Add .gitpod.yml and .gitpod.Dockerfile (#123)
- Add contributors subsection in README (#132)
- Add CODE_OF_CONDUCT.md (#127)
- Polish comments and render_eval configs for various common envs (#154) (#161)
- Polish action_type and env_type; fix test.yml and unittest (#160)
- Update env and algo tutorial doc (#106)
- Polish Gomoku env (#141)
- Add random_policy support for continuous envs (#118)
- Polish simulation method of ptree_az (#120)
- Polish comments of game_segment_to_array
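The random_policy support for continuous envs (#118) amounts to sampling uniformly within the action space's bounds instead of from a discrete set. A generic sketch for a Box-style space — the helper name and signature are assumptions, not LightZero's API:

```python
import numpy as np

def random_continuous_action(low, high, rng=None):
    """Sample a uniform action within per-dimension [low, high] bounds,
    as a random policy for a continuous (Box-style) action space."""
    if rng is None:
        rng = np.random.default_rng()
    low = np.asarray(low, dtype=np.float32)
    high = np.asarray(high, dtype=np.float32)
    # Generator.uniform broadcasts array-valued bounds element-wise.
    return rng.uniform(low, high).astype(np.float32)
```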
Fix
- Fix render method for various common envs (#154) (#161)
- Fix Gumbel MuZero collector bug and a Gumbel typo (#144)
- Fix assert bug in game_segment.py (#138)
- Fix visit_count_distributions name in muzero_evaluator
- Fix MCTS and alpha-beta bot unittest (#120)
- Fix typos in ptree_mz.py (#113)
- Fix root_sampled_actions_tmp shape bug in SEZ ptree
- Fix policy utils unittest
- Fix typos in README and add a 'back to top' button (#104) (#109) (#111)
Style
- Add NeurIPS 2023 paper link
News
- NeurIPS 2023 Spotlight: LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios
Full Changelog: v0.0.2...v0.0.3
Contributors: @PaParaZz1 @karroyan @nighood @jayyoung0802 @timothijoe @TuTuHuss @HarryXuancy @puyuan1996 @HansBug @mohitd404 @PentesterPriyanshu @0Armaan025 @prajjwalyd @suravshresth @sohamtembhurne @eltociear
v0.0.2
Enhancement
- Polish MCTS and ptree_az (#57) (#61)
- Polish README (#36) (#47) (#51) (#77) (#95) (#96)
- Update paper notes (#89) (#91)
- Polish model and configs (#26) (#27) (#50)
- Add Dockerfile and its usage instructions (#95)
- Add doc on how to customize envs and algos (#78)
- Add PyTorch DDP support (#68)
- Add epsilon-greedy and random-collect options in train_muzero_entry (#54)
- Add Atari visualization option (#40)
- Add log_buffer_memory_usage utils (#30)
Fix
- Fix priority bug in MuZero collector (#74)
Full Changelog: v0.0.1...v0.0.2
Contributors: @PaParaZz1 @karroyan @nighood @jayyoung0802 @timothijoe @TuTuHuss @HarryXuancy @puyuan1996 @HansBug
v0.0.1
Full Changelog: https://github.com/opendilab/LightZero/commits/v0.0.1