[Bugfix] Guard cache hugepage allocation in Kubernetes by dante159753 · Pull Request #965 · ModelEngine-Group/unified-cache-management

dante159753 · 2026-05-20T02:26:17Z

What changed

Added a cache_use_hugepage CacheStore option, defaulting to false.
Kept io_direct independent from explicit hugepage allocation: direct I/O host buffers no longer try MAP_HUGETLB unless cache_use_hugepage: true is set.
When hugepage allocation is disabled, Ascend direct-I/O host buffers use anonymous mmap with MADV_NOHUGEPAGE; when enabled, they keep the existing fallback chain: 1GiB hugetlb, then 2MiB hugetlb, then anonymous mmap with hugepage advice.
Added startup diagnostics around cache buffer allocation and documented the new option in the example config and the PipelineStore guide.

Why

In Kubernetes deployments, hugetlb resources can be visible inside a pod without being usable by that pod, either because the pod cgroup has no usable hugepage quota or because the runtime has not mounted or configured hugepages correctly. In that state, UCM's previous default direct-I/O path attempted explicit hugetlb allocation during service startup, which could fail outright or continue into an unstable allocation path and abort vLLM engine initialization.

The safer default is to avoid explicit hugepage allocation unless the operator opts in after confirming the pod has usable hugepage resources.
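
As a rough illustration of the "visible but not usable" gap, the sketch below reads the hugepage counters that /proc/meminfo exposes. HugepageInfo and ParseMeminfo are hypothetical names and are not part of this PR. Non-zero counters only show that hugepages are visible from the pod; they do not prove that the pod cgroup may actually use them, so this is a first check rather than a confirmation.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper; field names follow the keys printed in /proc/meminfo.
struct HugepageInfo {
    long total = -1;       // HugePages_Total, -1 if the key was not found
    long free = -1;        // HugePages_Free
    long pageSizeKb = -1;  // Hugepagesize in kB (default hugepage size only)
};

// Parse "Key:   value [unit]" lines and pick out the hugepage counters.
HugepageInfo ParseMeminfo(std::istream& in)
{
    HugepageInfo info;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string key;
        long value = 0;
        if (!(fields >> key >> value)) { continue; }  // skip malformed lines
        if (key == "HugePages_Total:") {
            info.total = value;
        } else if (key == "HugePages_Free:") {
            info.free = value;
        } else if (key == "Hugepagesize:") {
            info.pageSizeKb = value;
        }
    }
    return info;
}
```

A caller would typically open /proc/meminfo with std::ifstream and pass the stream in; taking a std::istream keeps the parser testable without a real procfs.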

Impact

Existing configs keep working without adding the new option.
io_direct: true still enables direct I/O, but does not imply explicit hugetlb allocation by default.
Operators who want the old hugepage behavior can set cache_use_hugepage: true.
Startup logs now make cache buffer allocation size, direct-I/O mode, hugepage mode, and allocation failures easier to diagnose.
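
To make the last point concrete, a startup diagnostic could carry the four pieces of information listed above in a single line. The sketch below is only an illustration: FormatAllocDiag and the message layout are hypothetical and do not reproduce the log format added by this PR.

```cpp
#include <cstddef>
#include <cstring>
#include <sstream>
#include <string>

// Hypothetical formatter: size, direct-I/O mode, hugepage mode, and outcome.
// err is 0 on success, otherwise the errno captured right after the failed call.
std::string FormatAllocDiag(std::size_t bytes, bool ioDirect, bool useHugepage, int err)
{
    std::ostringstream out;
    out << "cache buffer alloc: size=" << bytes << "B"
        << " io_direct=" << (ioDirect ? "true" : "false")
        << " cache_use_hugepage=" << (useHugepage ? "true" : "false");
    if (err == 0) {
        out << " result=ok";
    } else {
        out << " result=failed errno=" << err << " (" << std::strerror(err) << ")";
    }
    return out.str();
}
```

Logging the requested mode next to the errno lets an operator tell a hugepage quota problem apart from a plain out-of-memory condition without attaching a debugger.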

Verification

git diff --check
codespell via pre-commit during commit
C++ build/tests were not run locally because cmake is not installed in this workspace.

dante159753 added 2 commits May 15, 2026 16:05

[Bugfix] Add cache buffer allocation diagnostics

606f8ea

[Bugfix] Disable cache hugepage allocation by default

3bfae37

dante159753 wants to merge 2 commits into ModelEngine-Group:develop from dante159753:cache-memory-logs
