Skip to content

[data_storage] enhance DataStorageBackend::Create interface#58

Open
wangxiyu191 wants to merge 1 commit intomainfrom
enhance/better-storage-interface
Open

[data_storage] enhance DataStorageBackend::Create interface#58
wangxiyu191 wants to merge 1 commit intomainfrom
enhance/better-storage-interface

Conversation

@wangxiyu191
Copy link
Collaborator

@wangxiyu191 wangxiyu191 commented Mar 16, 2026

背景

当前NFS等backend不同 spec 的 key 会混入同一文件(同size的spec合并分配导致),会让多进程同时写一个文件。为解决该问题做如下重构,将Create分批自由度完全下放给Backend。

核心重构:

新增 SpecBlockKeys、CreateBlocksRequest、SpecCreateResult 数据结构,Create 参数和返回值从一维改为按 spec 维度的二维结构

各后端(Hf3fs、Nfs、Mooncake 等)完成了新接口适配

主要改动点:

Hf3fs / NfsBackend:按 spec_.key_count_per_file() 批大小合并同 spec 内的 block 到同一文件,通过 blkid 区分,保证不同 spec 的 key 不混入同一文件
MooncakeBackend:每个 key 独立生成 URI,不做批文件合并
DataStorageManager:统一在返回 URI 上填充 hostName,供后续 Meta 存储使用
CacheManager::GenWriteLocation:重写创建逻辑,支持按 group 过滤 spec,并在任何 spec 结果异常时对所有已分配 URI 做回滚清理,避免脏数据
单元测试:全面更新,覆盖了单/多 key、单/多 spec、空请求、分批行为及 URI 拼接等场景,包含 MultipleSpecsNotMixed 新测试用例

Background

Currently, for backends such as NFS, keys with different specifications (specs) may be intermingled within the same file (a consequence of merging and allocating specs of identical size), leading to scenarios where multiple processes attempt to write to the same file simultaneously. To resolve this issue, the following refactoring has been implemented: the full discretion regarding the batching of Create operations has been completely delegated to the Backend.

Core Refactoring:

Introduced new data structures: SpecBlockKeys, CreateBlocksRequest, and SpecCreateResult. The parameters and return values ​​for the Create operation have been restructured from a one-dimensional format to a two-dimensional structure organized by spec dimensions.

All backend modules (Hf3fs, Nfs, Mooncake, etc.) have been updated to adapt to the new interfaces.

Key Changes:

Hf3fs / NfsBackend: Blocks belonging to the same spec are batched and consolidated into a single file, based on the batch size defined by spec_.key_count_per_file(). Blocks are distinguished via their blkid, ensuring that keys from different specs are never intermingled within the same file.
MooncakeBackend: A unique URI is generated independently for each individual key; no batch file consolidation is performed.
DataStorageManager: Standardized the process of populating the hostName field within returned URIs, facilitating subsequent usage by the Meta storage layer.
CacheManager::GenWriteLocation: The creation logic has been rewritten to support filtering specs by group. Additionally, if an error occurs during the processing of any spec, a rollback and cleanup operation is triggered to revoke all previously allocated URIs, thereby preventing the generation of "dirty data."
Unit Tests: Comprehensively updated to cover various scenarios, including single/multiple keys, single/multiple specs, empty requests, batching behaviors, and URI construction. This update includes a new test case specifically designed to verify that MultipleSpecsNotMixed.

Copy link

@qoderai qoderai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👋 Review Summary

整体改造思路清晰:通过 CreateBlocksRequest/SpecBlockKeys/SpecCreateResult 把原来一维的 Create 接口升级为按 spec 维度的二维结构,Hf3fs/Nfs/Mooncake 以及 DataStorageManager / CacheManager 的适配也基本做到了一致、可维护。

🛡️ Key Risks & Issues

目前没有在实现中发现明显的逻辑错误或数据一致性问题:

  • CacheManager::GenWriteLocation 侧构造 CreateBlocksRequest 时,block_keys 与 original_key_indices 的映射保持了一致,位置与数量在当前实现下是自洽的(kv_cache_manager/manager/cache_manager.cc:701–727)。
  • DataStorageBackend::Create 返回值(vector)在 Hf3fs/Nfs/Mooncake 的实现中都遵守了“按 spec 顺序一一对应、结果长度与 block_keys 长度一致”的约定,CacheManager 也有防御性的 size 校验和失败回滚逻辑(hf3fs_backend.cc:64–108, nfs_backend.cc:49–89, mooncake_backend.cc:84–105, cache_manager.cc:751–812)。
  • DataStorageManager::Create 统一在 EC_OK 的 URI 上填充 hostName,并在 metrics 中统计总 key 数,行为与旧接口相比更规范(kv_cache_manager/data_storage/data_storage_manager.cc:189–210)。

整体来看,本次改造在接口设计和数据流上是协调一致的,没有发现会直接导致“脏数据”或 URI 错配的明显缺陷。

🧪 Verification Advice

建议在后续压测 / 联调阶段重点关注:

  • 多 spec + location_spec_group_names 过滤场景下,某些 key 只落到部分 spec 的组合路径,验证生成的 CreateBlocksRequest 与最终 CacheLocation 的 spec 集合是否符合预期。
  • 人为注入后端 Create 层面的部分失败(某个 spec 某几个 key 返回非 EC_OK),确认 CacheManager 能正确触发回滚,并且 DataStorageManager::Delete 收到的 URI 集合与已成功创建的完全一致。
  • 针对 MooncakeBackend::Create 增补单元测试或联调用例,覆盖单 spec / 多 spec / 空请求等情况,确保 key、size 参数格式与上层调用方预期一致。

💡 Thoughts & Suggestions

  • 当前通过 SpecBlockKeys + original_key_indices 做 index 映射的方式较为通用,后续如果有更多 group/filter 逻辑也可以在这层扩展,但建议在接口注释中明确 block_keys.size() 与 original_key_indices.size() 必须相等的约束,以避免未来后端实现者误用。
  • CacheManager::GenWriteLocation 中对 size 不匹配和 EC 非 OK 的处理已经比较防御性,但可以考虑在日志中增加 instance_id/spec_name 维度的信息,方便线上排查某个 spec 或后端行为异常的问题。

🤖 Generated by QoderView workflow run

@github-actions github-actions bot added the ai reviewed AI has reviewed this PR label Mar 16, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ai reviewed AI has reviewed this PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant