Skip to content

Fix get available memory to walk cgroup v2 ancestor chain#1824

Merged
preinlein merged 1 commit intomainfrom
paul.reinlein/fix-get-available-memory
Mar 19, 2026
Merged

Fix get available memory to walk cgroup v2 ancestor chain#1824
preinlein merged 1 commit intomainfrom
paul.reinlein/fix-get-available-memory

Conversation

@preinlein
Copy link
Contributor

@preinlein preinlein commented Mar 11, 2026

What does this PR do?

Rewrites get_available_memory() to walk the cgroup v2 ancestor chain from the process-specific path upward, returning the tightest memory.max limit. The previous implementation only checked the cgroup root, which returns "max" when the container has its own cgroup namespace — causing lading to use u64::MAX instead of the actual container memory limit.

Motivation

In cgroup v2 environments where the container has its own cgroup namespace (common in Docker/Kubernetes), /sys/fs/cgroup/memory.max at the root reads "max" even though a real limit exists at the process-specific cgroup path. This caused lading to believe it had unlimited memory.

Related issues

N/A

Additional Notes

  • Pins CI runners from ubuntu-latest to ubuntu-24.04 to speed up CI times.
  • Introduces an injectable file reader (get_available_memory_with) for deterministic unit testing of cgroup parsing logic.
  • Walks the full ancestor chain and takes the minimum limit, matching how the kernel enforces hierarchical memory limits.

Copy link
Contributor Author

preinlein commented Mar 11, 2026

@preinlein preinlein changed the title Fix get available memory to work with cgroup v1 (DO NOT REVIEW) Fix get available memory to work with cgroup v1 Mar 11, 2026
@preinlein preinlein force-pushed the paul.reinlein/fix-get-available-memory branch 2 times, most recently from d08954d to 808f116 Compare March 11, 2026 18:00
@preinlein preinlein force-pushed the paul.reinlein/add-smp-cli-as-dependency branch from ae17f70 to 9fd4e7c Compare March 16, 2026 16:59
@preinlein preinlein force-pushed the paul.reinlein/fix-get-available-memory branch from 808f116 to 3d782ef Compare March 16, 2026 16:59
copy-to-ecr:
name: Copy ${{ matrix.arch }} to ECR
needs: build
runs-on: ubuntu-latest
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: This reduced the wait time for GH runners from ~15 min to seconds.

None
}

fn get_available_memory_with<F>(read_file: F) -> Byte
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could see an argument for an info log in this function for what it resolved from where. Let me know if that's something you find desirable.

@preinlein preinlein changed the title (DO NOT REVIEW) Fix get available memory to work with cgroup v1 Fix get available memory to work with cgroup v1 Mar 16, 2026
@preinlein
Copy link
Contributor Author

Container build errors are fixed in #1827

@preinlein preinlein marked this pull request as ready for review March 16, 2026 17:18
@preinlein preinlein requested a review from a team as a code owner March 16, 2026 17:18
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3d782ef017

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@preinlein preinlein force-pushed the paul.reinlein/add-smp-cli-as-dependency branch from 9fd4e7c to c81ae74 Compare March 18, 2026 11:26
@preinlein preinlein force-pushed the paul.reinlein/fix-get-available-memory branch from 3d782ef to cc851e7 Compare March 18, 2026 11:26
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cc851e7dbd

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@preinlein preinlein force-pushed the paul.reinlein/add-smp-cli-as-dependency branch from c81ae74 to cc4fe22 Compare March 18, 2026 14:01
@preinlein preinlein force-pushed the paul.reinlein/fix-get-available-memory branch from cc851e7 to a479460 Compare March 18, 2026 14:01
@preinlein
Copy link
Contributor Author

I need to make a fresh build and test this further. Codex kept finding cgroup edge cases so I want to at least test our current setup (lading as is and lading as a target).

@preinlein preinlein force-pushed the paul.reinlein/fix-get-available-memory branch from a479460 to db73f9e Compare March 18, 2026 14:40
@GeorgeHahn
Copy link
Contributor

I think the code looks good, but I'm hesitant to take on the complexity of cgroupv1 support. What environment needs this? Can we switch it to cgroupv2 instead?

@preinlein
Copy link
Contributor Author

I think the code looks good, but I'm hesitant to take on the complexity of cgroupv1 support. What environment needs this? Can we switch it to cgroupv2 instead?

Let me run a test. lading as target was misreporting memory but that shouldn't be a cgroupv1 issue. Claude just detected it as a possibility. I'm with you that if we don't need it, we shouldn't have it.

I'll drop it and run a test.

@preinlein
Copy link
Contributor Author

I think the code looks good, but I'm hesitant to take on the complexity of cgroupv1 support. What environment needs this? Can we switch it to cgroupv2 instead?

Can confirm, it was just our cgroupv2 implementation was wrong.

Here are the logs with a fresh lading without any of the cgroupv1 stuff but with the changes to the cgroupv2: https://app.datadoghq.com/logs?query=job_id%3Af798fb29-9ba6-4bc7-9062-9248e739ff56%20%22amount%20of%20memory%22&agg_m=count&agg_m_source=base&agg_t=count&cols=host%2Cexperiment&messageDisplay=inline&refresh_mode=paused&storage=hot&stream_sort=time%2Cdesc&viz=stream&from_ts=1773919188864&to_ts=1773922788864&live=false

We can see that the memory reported is exactly what we expect.

@preinlein preinlein force-pushed the paul.reinlein/fix-get-available-memory branch from db73f9e to 2619756 Compare March 19, 2026 12:23
@preinlein preinlein changed the title Fix get available memory to work with cgroup v1 Fix get available memory to walk cgroup v2 ancestor chain Mar 19, 2026
@preinlein preinlein force-pushed the paul.reinlein/fix-get-available-memory branch from 2619756 to b573bc9 Compare March 19, 2026 12:26
@preinlein preinlein force-pushed the paul.reinlein/add-smp-cli-as-dependency branch from cc4fe22 to 6927271 Compare March 19, 2026 12:26
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b573bc91e4

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@preinlein preinlein force-pushed the paul.reinlein/fix-get-available-memory branch from b573bc9 to 00cc251 Compare March 19, 2026 12:48
Copy link
Contributor Author

preinlein commented Mar 19, 2026

Merge activity

  • Mar 19, 1:25 PM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Mar 19, 1:27 PM UTC: Graphite rebased this pull request as part of a merge.
  • Mar 19, 2:03 PM UTC: @preinlein merged this pull request with Graphite.

@preinlein preinlein changed the base branch from paul.reinlein/add-smp-cli-as-dependency to graphite-base/1824 March 19, 2026 13:25
@preinlein preinlein changed the base branch from graphite-base/1824 to main March 19, 2026 13:25
@preinlein preinlein force-pushed the paul.reinlein/fix-get-available-memory branch from 00cc251 to 2886073 Compare March 19, 2026 13:26
@preinlein preinlein force-pushed the paul.reinlein/fix-get-available-memory branch from 2886073 to f24286f Compare March 19, 2026 13:55
@preinlein preinlein merged commit 1085887 into main Mar 19, 2026
35 of 36 checks passed
@preinlein preinlein deleted the paul.reinlein/fix-get-available-memory branch March 19, 2026 14:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants