v0.1.2
What's Changed
- feat: use zmq_addr_counter to make zmq_handle non-repeat for each update by @weixiao-huang in #4
- feat: add pre-commit as lint config by @weixiao-huang in #5
- feat: add pre-commit CI workflow by @specture724 in #10
- feat: make ParameterMeta JSON serializable by @weixiao-huang in #9
- feat: rename save_metas_file -> load_metas_file in join method by @weixiao-huang in #11
- chore: set mooncake-transfer-engine>=0.3.5 by @weixiao-huang in #13
- feat: support uds and use httpx instead of requests by @weixiao-huang in #18
- feat: add rank and world_size args in ParameterServer by @weixiao-huang in #20
- feat: use ibv_get_device_list to get rdma devices instead of getting from file by @weixiao-huang in #19
- feat: use torch.cuda.get_device_properties() to get device_uuid instead of nvidia-smi -L by @weixiao-huang in #21
- hotfix: use correct hca selector by @weixiao-huang in #22
- feat: add _TorchTensor type for pydantic type validator by @weixiao-huang in #24
Full Changelog: https://github.com/MoonshotAI/checkpoint-engine/commits/v0.1.2