v0.3.6
What's Changed
- feat(store): add batch get buffer support by @xiaguan in #671
- [TransferEngine] optimization: remove request_list parameter from submitTransferTask by @staryxchen in #565
- feat(transfer_engine_bench): Add multi-GPU support by @staryxchen in #675
- [CI/Build] Mooncake-common/common.cmake: Add link flag of pthread by @weinanliu in #681
- [DOC] fix problem in mooncake-store-preview.md by @SgtPepperr in #685
- [Store] metric: add response struct by @stmatengss in #686
- [Store] fix: add client list metrics by @stmatengss in #693
- refactor(offset-allocator): add memory allocation metrics tracking by @xiaguan in #687
- [TransferEngine] Fix build issues & adapt to latest Mooncake changes by @AscendTransport in #684
- docs: add RDMA memory registration troubleshooting guide by @xiaguan in #694
- add instructions for running on AMD GPU by @lihaofd in #689
- [BugFix] Zero Size RDMA Mem Register by @ykwd in #695
- refactor(store): move client buffer implementation to store module by @xiaguan in #700
- refactor(MasterClient): introduce generic RPC invocation helpers by @xiaguan in #697
- [BugFix] Topology Empty Check Bug by @ykwd in #696
- bugfix(nvlink): Add explicit P2P access enablement and error handling for NvlinkTransport by @staryxchen in #683
- [BugFix] Forbid Register Zero Size Memory by @ykwd in #701
- [TransferEngine] feat: Support CXL shared memory, and provide simple unit tests. by @hemist in #670
- code format & enable code format checking in ci by @doujiang24 in #677
- docs: add troubleshooting steps for QP allocation error by @staryxchen in #707
- [Store] Enhance Master Metrics by @ykwd in #705
- [store] feat: add master config by @201341 in #650
- [Store][bind] add new support data types by @stmatengss in #712
- [TransferEngine] feat: add a universal method to get CXL device size automatically by @StepY1aoZz in #715
- Reimplement VRAM buffering in TCP transport by @alogfans in #702
- [Transfer Engine] fix: maximize the memory resource limit by @stmatengss in #716
- Fix lint error in transfer_engine_validator.cpp by @SCDESPERTATE in #720
- Add a topology dumping tool for ease of use by @SCDESPERTATE in #713
- [TransferEngine] Update to support CANN 8.2.RC1 by @AscendTransport in #714
- [Store] Optimize Offset Allocator by @ykwd in #706
- [Store] Add multi-endpoint etcd support for Transfer Engine metadata plugin by @vladnosiv in #729
- Add ability to do RDMA without nvidia-peermem by @misterwilliam in #704
- feat: add source code of MXA-EP by @UNIDY2002 in #726
- refactor(store): move python binding to pybind_client by @xiaguan in #723
- [Bugfix] invalidation of one replica results in deletion of the key by @vladnosiv in #731
- [TransferEngine] exclude packaging ascend precompiled libraries by @hjchen2 in #737
- [Transfer Engine] Metrics: Add total qp metrics by @stmatengss in #738
- Fix deleting buffer that doesn't belong to us by @SzymonOzog in #739
- fix(store): replace CHECK with error handling by @xiaguan in #735
- feat(client): Add client-side metrics for transfer and RPC operations by @xiaguan in #733
- [Store]feat: Migrate Persistence Metadata from Client to Master Service by @SgtPepperr in #690
- Add interface for fuzz match by @XucSh in #734
- refactor(store_py): Replace function with AutoPortBinder RAII class by @xiaguan in #741
- [Docs] Minor Update: Explain Return Value of batch_put_from by @ykwd in #747
- [CI] Avoid Running Deploy Workflow on Forked Repositories by @ykwd in #746
- Fixed cachelib_memory_allocator dependency. by @karya0 in #750
- [Store] Refine Complicated Constructor Parameters by @ykwd in #748
- Update .typos.toml by @ShangmingCai in #756
- add ascend direct transport by @ascend-direct-dev in #740
- Fix typo CI by @ShangmingCai in #757
- [TransferEngine] Add guide in testing Transfer Engine, and remove confusing output in transfer engine by @alogfans in #754
- [TransferEngine] Ascend supports asymmetric amount of registered memory by @hjchen2 in #758
- [Doc] Fix 3FS plugin file link problem by @SgtPepperr in #762
- [Store] Serialize/Deserialize Offset Allocator by @ykwd in #760
- refactor(store): remove garbage collection implementation by @xiaguan in #763
- [Doc] Allocator Performance by @ykwd in #774
- [TE][EndpointStore]: Fix hand_ assignment after evict by @lizhemingi in #768
- [Doc] update doc for *Regex interface by @XucSh in #776
- [Store] add c++ http metadata server in mooncake master by @stmatengss in #766
- [Store] Add replication guarantees by @vladnosiv in #744
- [Store] Break Circular Dependency Between type.h And Other Files by @ykwd in #771
- [DOC] Add all badges by @stmatengss in #781
- [Doc] Add description for CONTRIBUTING.md by @SgtPepperr in #773
- [TE] Fix adxl error code in ascend-direct-transport by @ascend-direct-dev in #764
- [Bugfix] YAML CPP has inline impl in header file which will cause linking error by @alexnails in #784
- [TE] Fix notifs problems in C wrapper by @alogfans in #779
- fix nixl bench bug by @haobayuxi in #788
- feat(store): Add largest free region filtering for allocation by @xiaguan in #785
- Fix Handshake Daemon Initialization Order by @staryxchen in #765
- [Store] Update stress_cluster_benchmark.py (Multi-thread mooncake store benchmark) by @ChaosD in #791
- chore(deps): bump tracing-subscriber from 0.3.18 to 0.3.20 in /mooncake-transfer-engine/rust by @dependabot[bot] in #794
- GIL release for put_tensor and get_tensor by @jerrychenhf in #783
- Fix: ascend direct transport support host addr type by @ascend-direct-dev in #786
- [coro_rpc] use client pool and enable rdma by @qicosmos in #789
- [TransferEngine] Fix SIEVE eviction algorithm for RDMA endpoint store by @KarmaD7 in #767
- [TE][RDMA Transport]: Simplify Transfer Submission Logic by @staryxchen in #772
- [TransferEngine] heterogeneous_ascend support kv-cache transfer between npu and gpu by @zuochunwei in #759
- fix(store): add mutex locks for thread-safe metrics retrieval by @xiaguan in #804
- feat(build): add CI-specific build option with --ci-build flag by @xiaguan in #808
- [Store] Remove unnecessary register buffer from put_tensor by @jerrychenhf in #803
- [CI] Fix Release build_wheel.sh to make python 3.8 auditwheel happy by @mumupika in #801
- [Chores] Offset Allocator Test Fix & Docs Fix by @ykwd in #806
- [Doc] Update docs for a better quick start by @chestnut-Q in #814
- refactor(AutoPortBinder): remove SO_REUSEADDR setting by @xiaguan in #816
- [RL] Add dummy example of RL training on mooncake store by @Risc-lt in #810
- [p2pstore] allow topologyMatrix to be empty by @lclgo in #807
- fix(transfer_engine): use thread-local CURL handles for thread safety by @xiaguan in #815
- Refactor(store): Enable the transfer engine to autonomously detect by @xiaguan in #817
- [Store] Fix hf3fs_file.cpp log compile problem by @SgtPepperr in #811
- feat(store): add duplicate rpc_meta key check and CI integration by @xiaguan in #818
- chore: bump version to 0.3.6 in pyproject.toml by @ShangmingCai in #819
New Contributors
- @weinanliu made their first contribution in #681
- @lihaofd made their first contribution in #689
- @hemist made their first contribution in #670
- @StepY1aoZz made their first contribution in #715
- @vladnosiv made their first contribution in #729
- @misterwilliam made their first contribution in #704
- @UNIDY2002 made their first contribution in #726
- @hjchen2 made their first contribution in #737
- @SzymonOzog made their first contribution in #739
- @karya0 made their first contribution in #750
- @ascend-direct-dev made their first contribution in #740
- @lizhemingi made their first contribution in #768
- @alexnails made their first contribution in #784
- @ChaosD made their first contribution in #791
- @jerrychenhf made their first contribution in #783
- @KarmaD7 made their first contribution in #767
- @zuochunwei made their first contribution in #759
- @mumupika made their first contribution in #801
- @lclgo made their first contribution in #807
Full Changelog: v0.3.5...v0.3.6