Release v0.3.8 · kvcache-ai/Mooncake

What's Changed

adxl: fix aclrtMemcpyBatch max 4096 limit bug by @ascend-direct-dev in #963
ci: add non-CUDA release workflow and update documentation by @xiaguan in #969
fix(transfer_engine): Add notify callback registration in RPC metadata handling by @iBenzene in #966
[Misc] (Mooncake Backend) Early break if a rank failure can be determined through ping message by @UNIDY2002 in #980
Revise installation instructions for non-cuda mooncake by @ShangmingCai in #985
[Store] feat: store key-value data in buckets by @zhuxinjie-nz in #968
Support AMDGPU (refactor CUDA-alike) by @yeahdongcn in #973
add log by @ascend-direct-dev in #996
Feature: support custom key prefix for issue 957 by @uniqueni in #958
refactor(store): store remove transfer engine internal api usage by @xiaguan in #994
[TransferEngine] Mitigating performance overhead from large cluster and large bulks by @alogfans in #999
Bump version to 0.3.7.post1 in pyproject.toml by @ShangmingCai in #984
[store] feat: add secondary storage usage monitor by @yejj710 in #976
fix(ci): remove nvlink allocator --ci-build flag by @xiaguan in #1003
[DOC] Update fig by @stmatengss in #987
Modify build command for nvlink_allocator by @ShangmingCai in #1001
fix(ci): add id-token permission and unify PyPI token for release by @xiaguan in #1004
[Misc] Lazy import ep in mooncake_ep_buffer.py by @UNIDY2002 in #1014
Bump version to 0.3.7.post2 in pyproject.toml by @ShangmingCai in #1015
[Store] Fix CI bugs & Improve log output & Refactor TE Initialization by @ykwd in #1006
[CI] Fix a CI BUG in PyClientTest:TestSetupExistTransferEngine by @ykwd in #1016
[Misc] Remove EP's duplicated impl of getCudaTopologyJson by @UNIDY2002 in #1009
[Store]: Cleanup processing objects if transferring timedout (#975) by @nickyc975 in #993
[Store] support segment level metrics(fix code format of #1029) by @cocktail828 in #1030
[Store] One Replica Has One Slice by @ykwd in #1032
[DOC] Update Slack link in README.md by @stmatengss in #1042
[Doc] Update SGLang Hicache Docs by @ykwd in #1023
[Store] add choosing endpoint store option by @stmatengss in #1024
[feat]More KVCache metrics in both master/client side by @Liziqi-77 in #1020
change adxl log by @ascend-direct-dev in #1039
Te seperated compilation by @zhaoyongke in #1041
Fix TCP Transport Handshake Daemon Initialization by @staryxchen in #846
add batch [put/get] tensor by @XucSh in #1044
handling cudaMemcpy errors in tcp_transport.cpp by @flying-x in #1057
[Store] fix: honor MC_MS_FILTERS by applying whitelist before TransferEngine init by @wwq2333 in #1051
[Doc] Add license badge to README by @stmatengss in #1063
[DOC] add web api doc by @stmatengss in #1059
[Chore] Add Contributor Covenant Code of Conduct by @stmatengss in #1056
Add pull request template for standardized PR submissions by @Copilot in #1065
[Store|TransferEngine]: use condition-variable based completion instead of busy-polling by @wwq2333 in #1053
docs: update README by @zhyncs in #1079
[store] Add disk eviction feature by @Vincent-Bo-ali in #1028
[Store] MasterMetricManager Returns Zero-Value Variables by @ykwd in #1068
[RDMA] Fix RDMA device selection to prioritize GIDs with network devices by @uniqueni in #1077
[CI] add sglang integration test by @stmatengss in #1089
[EP] Fallback impl of Mooncake EP when IBGDA is unavailable by @UNIDY2002 in #1002
TCP transport support ipv6 by @LCAIZJ in #1067
[Store] add version checking between client and server by @stmatengss in #1061
[TE/Topology] Support device filtering when dumping topology by @popsiclexu in #1087
[BugFix] Adapt mooncake_connector_v1 to latest vllm by @ZeldaHuang in #1080
[CI] Add label event by @XucSh in #1108
Adapt to adxl connection auto release feature by @ascend-direct-dev in #1072
feat[Store]: Add standalone deployment implementation for Client by @YiXR in #1084
feat[accl-barex]: add barex_transport by build with USE_BAREX by @ZechaoZhang-beta in #1045
Update CI by @XucSh in #1111
[TE/Topology] Enhance PCI distance calculation by considering NUMA node affinity by @popsiclexu in #1086
[DEV] add pre-commit by @stmatengss in #1124
[Store] Cancel all negative ret val by @Azure-Tang in #1129
[Store] fix compilation warning in storage backend by @stmatengss in #1134
[EP] Support multiple torch versions by @UNIDY2002 in #1098
feat[Store]: Add multi dummy clients support for real client by @YiXR in #1122
[Store] Add support for static labels (host IP/cluster name) in client metrics by @cocktail828 in #1081
[Misc] Add Codeowners by @ykwd in #1135
fix MC_MAX_EP_PER_CTX doc by @whybeyoung in #1142
[Bug] fixed bug of master not using glog actually by @SpecterCipher in #1075
change cmake by @ascend-direct-dev in #1114
feat[Store]: Refine shm mmap logic and add reconnection for Dummy Client after the Real restarted by @YiXR in #1146
[mooncake-store]: prevent orphaned bucket data files from leaking dis… by @maheshrbapatu in #1140
[store] Fix IPv6 link-local address parsing and add IPv6 tests by @Azure-Tang in #1137
[CI] Install CUDA toolkit on job test-wheel-ubuntu, so that the wheel can be built with USE_CUDA=ON by @UNIDY2002 in #1156
[store] add pybind for get_replica_desc by @yejj710 in #1121
[Store]: Refactor AllocationStrategy implementation for better performance and flexibility by @nickyc975 in #1149
[Store] Optimize master & client binary size by @YiXR in #1166
Add a CI test for Mooncake EP Backend (CPU only) by @UNIDY2002 in #1099
Improve AMD HIP support with hipify-perl by @amd-arozanov in #1154
[DOC] Add MAINTAINERS.md by @alogfans in #1171
[MUSA] Enable USE_MNNVL by @yeahdongcn in #1176
[Store] feat: Add BatchQueryIp API for querying multiple client IPs by @Vincent-Bo-ali in #1162
[Store] pub_tensor for multiple replica by @zxpdemonio in #1148
[Store] feat: Implement a FileStorage component to manage the lifecycle of key-value data by @zhuxinjie-nz in #1031
[Doc] add docs of Mooncake EP integration with SGLang by @UNIDY2002 in #1188
Pr1 coro rpc core by @JasonZhang517 in #1104
[TE] Support rdma traffic class by environmental variable by @yafengio in #1187
[Store] add tp awareness for get_tensor by @XucSh in #1127
feat[Store]: Introduce shm helper for dummy by @YiXR in #1177
[yalantinglibs]set ylt log level with env by @qicosmos in #1190
[TE/Examples] Memory initialization and HIP cleanup fixes by @amd-arozanov in #1179
[TE] AscendDirectTransport: HIXL support IPV6 by @MingYang119 in #1194
[Store] feat: Add BatchReplicaClear API for manual cache cleanup by @Vincent-Bo-ali in #1191
[Store]: Add master_bench for benchmarking QPS of MasterService by @nickyc975 in #1201
[doc]merge doc to docs, and change all internal links to blog. by @Keithwwa in #1153
[docu] Remove duplicate mooncake store by @Keithwwa in #1211
feat: add PCIe Relaxed Ordering (RO) support and RDMA traffic class (… by @1998zxn in #1076
[CI] Add sglang e2e tests by @luketong777 in #1181
feat[Store]: add multi shm support for dummy and real client by @YiXR in #1206
[bugfix] fix bfloat16 for get_tensor by @XucSh in #1216
[TE]: Add HIP transport for AMD GPUs support by @amd-arozanov in #1208
[TE/HIP] Fix HIP Shareable POSIX File Descriptor Handles by @amd-arozanov in #1218
[TE] Memorize batched transfer status by @alogfans in #1205
[TransferEngine]: HIXL support ipv6 when searching for available port by @MingYang119 in #1220
[EP] Fix the tensorSize of the barrier op by @UNIDY2002 in #1222
refactor tensor api and add tests by @XucSh in #1217
[Store] feat: Implement a unified storage interface to simplify integration and extension by @zhangzuo21 in #1185
Fix: add missing include path for cuda_alike.h in mooncake-transfer-engine/nvlink-allocator/build.sh by @yeahdongcn in #1224
feat(metrics): add task completion latency tracking and reporting by @staryxchen in #1130
[CI] fix: don't skip any CI test by @stmatengss in #1229
Fix compilation warnings for missing field initializers by @Copilot in #1232
Add vllm v1 mooncake benchmark and launch guide by @Azure-Tang in #1223
[store] zero copy for get_tensor() and batch_get_tensor() by @zxpdemonio in #1192
make para more clear by @XucSh in #1239
Fix missing error handling for cuPointerGetAttribute call by @fzyzcjy in #1241
Build cuMem based allocator when disabling peermem by @fzyzcjy in #1244
Support cuMem allocator when Fabric is unsupported at runtime by @fzyzcjy in #1245
[Fix] Fix broken link in doc by @Azure-Tang in #1240
[Feature] Support H20 intraNode nvlink by introducing fallback mechanism to leverage cudaIPC by @TTThanos in #1234
feat(rdma): add parallel memory region registration support by @staryxchen in #1238
[Doc] Fix typo in the tutorial by @tianrenz2 in #1247
Fix error when peermem is disabled caused by multi threading by @fzyzcjy in #1246
[EP] Implement elastic scaling up by @UNIDY2002 in #1173
[Doc] update toc item of ep-backend by @UNIDY2002 in #1252
[Misc] Update CODEOWNERS for mooncake-ep by @UNIDY2002 in #1253
[EP] Implement send/recv by @UNIDY2002 in #1236
[CI] Force TCP for Mooncake EP Backend tests by @UNIDY2002 in #1255
[Store] Decouple master from transfer_engine dependencies by @00fish0 in #1233
[TE] Add TENT codebase to main (Phase 1: structural import) by @alogfans in #1213
[CI] Disable EP's test_mooncake_backend_p2p_cpu in CI workflow by @UNIDY2002 in #1256
[Doc] add more mooncake store APIs doc by @stmatengss in #1237
add MC_FORCE_HCA environment variable to force use rdma by @baymaxhuang in #1259
[CI] Disable certain tests in CI configuration by @UNIDY2002 in #1263
[Doc] Add update for RBG + SGLang HiCache integration by @stmatengss in #1264
[doc] Restruct doc about vllm support. by @Azure-Tang in #1275
[CI] Add retry mechanism to handle GitHub API rate limit in test-sglang-integration job by @luketong777 in #1273
[store] Prefer local segment when get_buffer/get_into by @zxpdemonio in #1258
[store] add async api by @XucSh in #1265
[TE] feat:ascend direct transport support async transfer by @ascend-direct-dev in #1274
Bump version to 0.3.8 in pyproject.toml by @ShangmingCai in #1285
[CI] Try to reduce disk usage during release build by @UNIDY2002 in #1287
Remove Python 3.9 from release workflow by @ShangmingCai in #1290

New Contributors

@iBenzene made their first contribution in #966
@zhuxinjie-nz made their first contribution in #968
@yeahdongcn made their first contribution in #973
@yejj710 made their first contribution in #976
@cocktail828 made their first contribution in #1030
@flying-x made their first contribution in #1057
@zhyncs made their first contribution in #1079
@Vincent-Bo-ali made their first contribution in #1028
@ZeldaHuang made their first contribution in #1080
@ZechaoZhang-beta made their first contribution in #1045
@Azure-Tang made their first contribution in #1129
@whybeyoung made their first contribution in #1142
@SpecterCipher made their first contribution in #1075
@maheshrbapatu made their first contribution in #1140
@amd-arozanov made their first contribution in #1154
@zxpdemonio made their first contribution in #1148
@yafengio made their first contribution in #1187
@MingYang119 made their first contribution in #1194
@Keithwwa made their first contribution in #1153
@1998zxn made their first contribution in #1076
@luketong777 made their first contribution in #1181
@zhangzuo21 made their first contribution in #1185
@TTThanos made their first contribution in #1234
@tianrenz2 made their first contribution in #1247
@00fish0 made their first contribution in #1233
@baymaxhuang made their first contribution in #1259

Full Changelog: v0.3.7...v0.3.8
