Commit 1a8903c

rapsealk and claude committed
refactor(agent): Trim sysfs-first cleanup nits
- Extend changes/11223.enhance.md with a short note that block-I/O readings may also step down on cgroup v1 hosts (the sysfs path reads blkio.throttle.io_service_bytes; the API path reads io_service_bytes_recursive).
- Drop TestWarnCgroupFallbackOnce::test_evicts_beyond_limit: it was re-verifying cachetools.LRUCache eviction, not project logic. test_deduplicates_per_container already covers the project contract.

Refs #11220
Refs #11223

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 99d5145 commit 1a8903c

2 files changed

Lines changed: 2 additions & 23 deletions

changes/11223.enhance.md

Lines changed: 2 additions & 1 deletion
@@ -1,3 +1,4 @@
 Default to cgroup (sysfs) stat collection on native Linux hosts, falling back to the Docker API only on linuxkit or read failure.
 Note: reported container memory usage may step down on hosts previously using `stats-type: docker`, because sysfs excludes inactive file cache (matching `docker stats`).
-Dashboards or autoscaling thresholds tuned to the old (higher) values should be re-evaluated after upgrade.
+Block-I/O readings may also shift on cgroup v1 hosts, because the sysfs path reads `blkio.throttle.io_service_bytes` while the Docker API path reads `blkio_stats.io_service_bytes_recursive` (which sums across nested cgroups).
+Dashboards or autoscaling thresholds tuned to the old values should be re-evaluated after upgrade.
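
To illustrate why the two block-I/O sources can disagree, here is a minimal sketch (not the agent's actual code) that totals read and write bytes from each shape of input. It assumes the usual cgroup v1 text layout of `major:minor Op bytes` lines followed by a trailing `Total N` line, and the Docker stats JSON layout of `{"major", "minor", "op", "value"}` entries. The function names `sum_sysfs_blkio` and `sum_api_blkio` are hypothetical.

```python
from __future__ import annotations

from typing import Any, Iterable, Mapping, Optional


def sum_sysfs_blkio(text: str) -> dict[str, int]:
    """Sum Read/Write bytes from blkio.throttle.io_service_bytes content.

    Assumes lines shaped like "8:0 Read 4096"; anything else, including the
    trailing two-field "Total N" line, is skipped.
    """
    totals = {"Read": 0, "Write": 0}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue
        _device, op, value = parts
        if op in totals:
            totals[op] += int(value)
    return totals


def sum_api_blkio(entries: Optional[Iterable[Mapping[str, Any]]]) -> dict[str, int]:
    """Sum Read/Write bytes from an io_service_bytes_recursive-style list.

    The op name is normalized because its letter case may vary by host.
    """
    totals = {"Read": 0, "Write": 0}
    for entry in entries or []:
        op = str(entry.get("op", "")).capitalize()
        if op in totals:
            totals[op] += int(entry.get("value", 0))
    return totals


# Made-up sample data: the recursive view also counts I/O from a nested cgroup.
sysfs_text = "8:0 Read 4096\n8:0 Write 8192\n8:0 Total 12288\nTotal 12288\n"
api_entries = [
    {"major": 8, "minor": 0, "op": "Read", "value": 6144},
    {"major": 8, "minor": 0, "op": "Write", "value": 8192},
]
sysfs_totals = sum_sysfs_blkio(sysfs_text)
api_totals = sum_api_blkio(api_entries)
```

With sample values like these, a threshold tuned against the recursive figure would see a lower reading once the sysfs path takes over, which is the shift the changelog note warns about.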

tests/unit/agent/test_docker_intrinsic.py

Lines changed: 0 additions & 22 deletions
@@ -352,28 +352,6 @@ def test_deduplicates_per_container(
         warn_records = [r for r in caplog.records if r.levelname == "WARNING"]
         assert len(warn_records) == 1
 
-    def test_evicts_beyond_limit(
-        self,
-        caplog: pytest.LogCaptureFixture,
-    ) -> None:
-        """When the bounded cache overflows, the oldest entry is evicted and a
-        previously-seen container may warn again."""
-        cap = intrinsic._CGROUP_FALLBACK_WARN_CACHE_SIZE
-        first_cid = "first_container"
-
-        with caplog.at_level("WARNING", logger="ai.backend.agent.docker.intrinsic"):
-            # First warn for `first_cid`.
-            _warn_cgroup_fallback_once("CPUPlugin", first_cid)
-            # Fill the cache with `cap` distinct new entries to evict `first_cid`.
-            for i in range(cap):
-                _warn_cgroup_fallback_once("CPUPlugin", f"filler_{i}")
-            # `first_cid` should have been evicted and now warn again.
-            _warn_cgroup_fallback_once("CPUPlugin", first_cid)
-
-        warn_records = [r for r in caplog.records if r.levelname == "WARNING"]
-        # 1 (first) + cap (fillers) + 1 (re-warn of first) = cap + 2
-        assert len(warn_records) == cap + 2
-
 
 class TestMemoryPluginContainerPidValidation(BaseDockerIntrinsicTest):
     """Tests for container PID validation before reading /proc/[pid]/net/dev."""
