v0.3.6

@ShangmingCai ShangmingCai released this 10 Sep 07:46
· 347 commits to main since this release
be89497

What's Changed

  • feat(store): add batch get buffer support by @xiaguan in #671
  • [TransferEngine] optimization: remove request_list parameter from submitTransferTask by @staryxchen in #565
  • feat(transfer_engine_bench): Add multi-GPU support by @staryxchen in #675
  • [CI/Build] Mooncake-common/common.cmake: Add link flag of pthread by @weinanliu in #681
  • [DOC] fix problem in mooncake-store-preview.md by @SgtPepperr in #685
  • [Store] metric: add response struct by @stmatengss in #686
  • [Store] fix: add client list metrics by @stmatengss in #693
  • refactor(offset-allocator): add memory allocation metrics tracking by @xiaguan in #687
  • [TransferEngine] Fix build issues & adapt to latest Mooncake changes by @AscendTransport in #684
  • docs: add RDMA memory registration troubleshooting guide by @xiaguan in #694
  • add instructions for running on AMD GPU by @lihaofd in #689
  • [BugFix] Zero Size RDMA Mem Register by @ykwd in #695
  • refactor(store): move client buffer implementation to store module by @xiaguan in #700
  • refactor(MasterClient): introduce generic RPC invocation helpers by @xiaguan in #697
  • [BugFix] Topology Empty Check Bug by @ykwd in #696
  • bugfix(nvlink): Add explicit P2P access enablement and error handling for NvlinkTransport by @staryxchen in #683
  • [BugFix] Forbid Register Zero Size Memory by @ykwd in #701
  • [TransferEngine] feat: Support CXL shared memory, and provide simple unit tests. by @hemist in #670
  • code format & enable code format checking in ci by @doujiang24 in #677
  • docs: add troubleshooting steps for QP allocation error by @staryxchen in #707
  • [Store] Enhance Master Metrics by @ykwd in #705
  • [store] feat: add master config by @201341 in #650
  • [Store][bind] add new support data types by @stmatengss in #712
  • [TransferEngine] feat: add a universal method to get CXL device size automatically by @StepY1aoZz in #715
  • Reimplement VRAM buffering in TCP transport by @alogfans in #702
  • [Transfer Engine] fix: maximize the memory resource limit by @stmatengss in #716
  • Fix lint error in transfer_engine_validator.cpp by @SCDESPERTATE in #720
  • Add a topology dumping tool for ease of use by @SCDESPERTATE in #713
  • [TransferEngine] Update to support CANN 8.2.RC1 by @AscendTransport in #714
  • [Store] Optimize Offset Allocator by @ykwd in #706
  • [Store] Add multi-endpoint etcd support for Transfer Engine metadata plugin by @vladnosiv in #729
  • Add ability to do RDMA without nvidia-peermem by @misterwilliam in #704
  • feat: add source code of MXA-EP by @UNIDY2002 in #726
  • refactor(store): move python binding to pybind_client by @xiaguan in #723
  • [Bugfix] invalidation of one replica results in deletion of the key by @vladnosiv in #731
  • [TransferEngine] exclude packaging ascend precompiled libraries by @hjchen2 in #737
  • [Transfer Engine] Metrics: Add total qp metrics by @stmatengss in #738
  • Fix deleting buffer that doesn't belong to us by @SzymonOzog in #739
  • fix(store): replace CHECK with error handling by @xiaguan in #735
  • feat(client): Add client-side metrics for transfer and RPC operations by @xiaguan in #733
  • [Store]feat: Migrate Persistence Metadata from Client to Master Service by @SgtPepperr in #690
  • Add interface for fuzz match by @XucSh in #734
  • refactor(store_py): Replace function with AutoPortBinder RAII class by @xiaguan in #741
  • [Docs] Minor Update: Explain Return Value of batch_put_from by @ykwd in #747
  • [CI] Avoid Running Deploy Workflow on Forked Repositories by @ykwd in #746
  • Fixed cachelib_memory_allocator dependency. by @karya0 in #750
  • [Store] Refine Complicated Constructor Parameters by @ykwd in #748
  • Update .typos.toml by @ShangmingCai in #756
  • add ascend direct transport by @ascend-direct-dev in #740
  • Fix typo CI by @ShangmingCai in #757
  • [TransferEngine] Add guide in testing Transfer Engine, and remove confusing output in transfer engine by @alogfans in #754
  • [TransferEngine] Ascend supports asymmetric amount of registered memory by @hjchen2 in #758
  • [Doc] Fix 3FS plugin file link problem by @SgtPepperr in #762
  • [Store] Serialize/Deserialize Offset Allocator by @ykwd in #760
  • refactor(store): remove garbage collection implementation by @xiaguan in #763
  • [Doc] Allocator Performance by @ykwd in #774
  • [TE][EndpointStore]: Fix hand_ assignment after evict by @lizhemingi in #768
  • [Doc] update doc for *Regex interface by @XucSh in #776
  • [Store] add c++ http metadata server in mooncake master by @stmatengss in #766
  • [Store] Add replication guarantees by @vladnosiv in #744
  • [Store] Break Circular Dependency Between type.h And Other Files by @ykwd in #771
  • [DOC] Add all badges by @stmatengss in #781
  • [Doc] Add description for CONTRIBUTING.md by @SgtPepperr in #773
  • [TE] Fix adxl error code in ascend-direct-transport by @ascend-direct-dev in #764
  • [Bugfix] YAML CPP has inline impl in header file which will cause linking error by @alexnails in #784
  • [TE] Fix notifs problems in C wrapper by @alogfans in #779
  • fix nixl bench bug by @haobayuxi in #788
  • feat(store): Add largest free region filtering for allocation by @xiaguan in #785
  • Fix Handshake Daemon Initialization Order by @staryxchen in #765
  • [Store] Update stress_cluster_benchmark.py (Multi-thread mooncake store benchmark) by @ChaosD in #791
  • chore(deps): bump tracing-subscriber from 0.3.18 to 0.3.20 in /mooncake-transfer-engine/rust by @dependabot[bot] in #794
  • GIL release for put_tensor and get_tensor by @jerrychenhf in #783
  • Fix: ascend direct transport support host addr type by @ascend-direct-dev in #786
  • [coro_rpc] use client pool and enable rdma by @qicosmos in #789
  • [TransferEngine] Fix SIEVE eviction algorithm for RDMA endpoint store by @KarmaD7 in #767
  • [TE][RDMA Transport]: Simplify Transfer Submission Logic by @staryxchen in #772
  • [TransferEngine] heterogeneous_ascend support kv-cache transfer between npu and gpu by @zuochunwei in #759
  • fix(store): add mutex locks for thread-safe metrics retrieval by @xiaguan in #804
  • feat(build): add CI-specific build option with --ci-build flag by @xiaguan in #808
  • [Store] Remove unnecessary register buffer from put_tensor by @jerrychenhf in #803
  • [CI] Fix Release build_wheel.sh to make python 3.8 auditwheel happy by @mumupika in #801
  • [Chores] Offset Allocator Test Fix & Docs Fix by @ykwd in #806
  • [Doc] Update docs for a better quick start by @chestnut-Q in #814
  • refactor(AutoPortBinder): remove SO_REUSEADDR setting by @xiaguan in #816
  • [RL] Add dummy example of RL training on mooncake store by @Risc-lt in #810
  • [p2pstore] allow topologyMatrix to be empty by @lclgo in #807
  • fix(transfer_engine): use thread-local CURL handles for thread safety by @xiaguan in #815
  • Refactor(store): Enable the transfer engine to autonomously detect by @xiaguan in #817
  • [Store] Fix hf3fs_file.cpp log compile problem by @SgtPepperr in #811
  • feat(store): add duplicate rpc_meta key check and CI integration by @xiaguan in #818
  • chore: bump version to 0.3.6 in pyproject.toml by @ShangmingCai in #819

Full Changelog: v0.3.5...v0.3.6