fix(cli): configure v8 isolate with cgroups-constrained memory limit #29078
Conversation
This change configures V8 isolates to respect memory limits imposed by cgroups on Linux. It adds support for detecting both cgroups v1 and v2 memory limits, enabling Deno to properly adapt to containerized environments with memory constraints. When cgroups information is unavailable or not applicable, it falls back to using the system's total memory as before.
8ed9d12 to 895d941
Pull Request Overview
This PR configures V8 isolates to honor memory limits imposed by cgroups on Linux by detecting cgroup v1 and v2 configurations. Key changes include:
- Adding a new Linux-specific logic branch in create_isolate_create_params to use cgroup memory limits (see the sketch after this list).
- Introducing a helper module with functions to parse /proc/self/cgroup and determine the correct memory limit.
- Including tests to validate the cgroup parsing function for both v1 and v2 (and hybrid mode).
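For orientation, the Linux-specific branch might look roughly like the following minimal sketch. The helper names cgroup_memory_limit and system_total_memory are assumptions for illustration, as is the use of the v8 crate's CreateParams::heap_limits_from_system_memory builder; the PR's actual code may differ.

```rust
use deno_core::v8;

// Hypothetical helpers for illustration: the cgroup probe added by this PR
// and the pre-existing total-memory lookup. Real names/signatures may differ.
fn cgroup_memory_limit() -> Option<u64> {
  // On Linux this would parse /proc/self/cgroup and read the v1/v2 limit
  // file, as shown in the manual implementation later in this thread.
  None
}

fn system_total_memory() -> Option<u64> {
  Some(8 * 1024 * 1024 * 1024) // placeholder: query the OS in real code
}

fn create_isolate_create_params() -> Option<v8::CreateParams> {
  // Prefer the cgroup-imposed limit on Linux; otherwise fall back to the
  // system's total memory, matching the behavior described above.
  #[cfg(target_os = "linux")]
  let total = cgroup_memory_limit().or_else(system_total_memory)?;
  #[cfg(not(target_os = "linux"))]
  let total = system_total_memory()?;
  Some(v8::CreateParams::default().heap_limits_from_system_memory(total, 0))
}
```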
Co-authored-by: Copilot <[email protected]>
Signed-off-by: Yusuke Tanaka <[email protected]>
Generally seems good. I am curious, though, whether there's some crate for parsing the cgroups instead of implementing it manually.
Is there any impact on startup perf? Can this be feature-flagged at compile time or runtime instead?
That's a good point. It won't be used in 99% of cases, so maybe enable it only via an env var.
Looks like this crate https://docs.rs/cgroups-rs/0.3.4/cgroups_rs/index.html provides what we want. I'll replace the manual implementation with it.
OK, let's do that.
Will this fix be a possible solution for quarto-dev/quarto-cli#4197?
@pat-s If the linked issue is about heap size limit (like |
I started working on replacing the manual cgroup file parsing with existing libraries (procfs + cgroups-rs), but while reading the docs it occurred to me that parsing with these crates would introduce unnecessary overhead: their code doesn't really seem to be written with performance in mind, and they do more work than we need. So here's a quick benchmark, where we have two binaries, one built from the manual implementation (manual_impl.rs) and one built on procfs + cgroups-rs (cgroups_rs.rs),
and compared the performance with hyperfine.

manual_impl.rs:

```rust
fn get_memory_limit() -> Option<u64> {
let Ok(self_cgroup) = std::fs::read_to_string("/proc/self/cgroup") else {
return None;
};
match parse_self_cgroup(&self_cgroup) {
CgroupVersion::V1 { cgroup_relpath } => {
let limit_path = std::path::Path::new("/sys/fs/cgroup/memory")
.join(cgroup_relpath)
.join("memory.limit_in_bytes");
std::fs::read_to_string(limit_path)
.ok()
.and_then(|s| s.trim().parse::<u64>().ok())
}
CgroupVersion::V2 { cgroup_relpath } => {
let limit_path = std::path::Path::new("/sys/fs/cgroup")
.join(cgroup_relpath)
.join("memory.max");
std::fs::read_to_string(limit_path)
.ok()
.and_then(|s| s.trim().parse::<u64>().ok())
}
CgroupVersion::None => None,
}
}
enum CgroupVersion<'a> {
V1 { cgroup_relpath: &'a str },
V2 { cgroup_relpath: &'a str },
None,
}
fn parse_self_cgroup(self_cgroup_content: &str) -> CgroupVersion<'_> {
// Initialize the cgroup version as None. This will be updated based on the parsed lines.
let mut cgroup_version = CgroupVersion::None;
// Iterate through each line in the cgroup content. Each line represents a cgroup entry.
for line in self_cgroup_content.lines() {
// Split the line into parts using ":" as the delimiter. The format is typically:
// "<hierarchy_id>:<subsystems>:<cgroup_path>"
let split = line.split(":").collect::<Vec<_>>();
match &split[..] {
// If the line specifies "memory" as the subsystem, it indicates cgroup v1 is used
// for memory management. Extract the relative path and update the cgroup version.
[_, "memory", cgroup_v1_relpath] => {
cgroup_version = CgroupVersion::V1 {
cgroup_relpath: cgroup_v1_relpath
.strip_prefix("/")
.unwrap_or(cgroup_v1_relpath),
};
// Break early since v1 explicitly manages memory, and no further checks are needed.
break;
}
// If the line starts with "0::", it indicates cgroup v2 is used. However, in hybrid
// mode, memory might still be managed by v1. Continue checking other lines to confirm.
["0", "", cgroup_v2_relpath] => {
cgroup_version = CgroupVersion::V2 {
cgroup_relpath: cgroup_v2_relpath
.strip_prefix("/")
.unwrap_or(cgroup_v2_relpath),
};
}
_ => {}
}
}
cgroup_version
}
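
// A minimal sketch of how parse_self_cgroup could be validated for the
// v1, v2, and hybrid layouts mentioned in the review summary. Assumption:
// these are illustrative tests, not necessarily the ones shipped in the PR.
#[cfg(test)]
mod tests {
  use super::*;

  #[test]
  fn detects_cgroup_v2() {
    let content = "0::/user.slice/user-1000.slice/session-1.scope\n";
    assert!(matches!(
      parse_self_cgroup(content),
      CgroupVersion::V2 {
        cgroup_relpath: "user.slice/user-1000.slice/session-1.scope"
      }
    ));
  }

  #[test]
  fn memory_controller_wins_in_hybrid_mode() {
    // In hybrid mode both a "0::" (v2) line and a v1 memory line appear;
    // the v1 memory controller must take precedence.
    let content = "0::/init.scope\n4:memory:/mygroup\n";
    assert!(matches!(
      parse_self_cgroup(content),
      CgroupVersion::V1 { cgroup_relpath: "mygroup" }
    ));
  }
}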
fn main() {
for _ in 0..10_000 {
let limit = get_memory_limit();
println!("Memory limit: {:?}", limit);
}
}
```

cgroups_rs.rs:

```rust
use cgroups_rs::{hierarchies, MaxValue};
use cgroups_rs::{memory::MemController, Cgroup};
use procfs::process::Process;
fn get_memory_limit() -> Option<MaxValue> {
let myself = Process::myself().unwrap();
let cgroups_entries = myself.cgroups().unwrap().0;
let mut maybe_v2_limit = None;
for entry in cgroups_entries {
// V2
if entry.controllers.is_empty() {
let relpath = entry.pathname.trim_start_matches('/');
let hier_v2 = hierarchies::V2::new();
let cg = Cgroup::load(Box::new(hier_v2), relpath);
let mem_controller = cg.controller_of::<MemController>().unwrap();
maybe_v2_limit = mem_controller.get_mem().unwrap().max;
} else if entry.controllers.contains(&"memory".to_string()) {
let relpath = entry.pathname.trim_start_matches('/');
let hier_v1 = hierarchies::V1::new();
let cg = Cgroup::load(Box::new(hier_v1), relpath);
let mem_controller = cg.controller_of::<MemController>().unwrap();
return mem_controller.get_mem().unwrap().max;
}
}
maybe_v2_limit
}
fn main() {
for _ in 0..10_000 {
let limit = get_memory_limit();
println!("Memory limit: {:?}", limit);
}
}
```

Result
So, as I had guessed, parsing with the crates brought unacceptable overhead. Let's keep the manual implementation as is?
@magurotuna that seems reasonable, though I would still suggest only parsing cgroups if an env var is present (eg.
@bartlomieju Yes, for sure. I am working on that part now.
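A minimal sketch of that gating (the variable name below is hypothetical; the PR may pick a different one, and get_memory_limit refers to the manual implementation shown earlier):

```rust
// Gate the cgroup probe behind an opt-in env var so the default startup
// path never touches /proc or /sys. The variable name is hypothetical.
fn cgroup_aware_memory_limit() -> Option<u64> {
  if std::env::var_os("DENO_USE_CGROUPS").is_none() {
    return None;
  }
  // Reuse the manual implementation benchmarked above.
  get_memory_limit()
}
```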
This is super useful in Docker, Kubernetes, etc. I think it's a bummer having it behind an environment variable. Can't some quicker lookup be done to decide whether cgroups should be looked into or not, like checking for the presence of some file?
Closes #29077
Test
For testing, I created an Ubuntu VM with 1Gi of memory. Within this VM, I set up a cgroup with a 512Mi memory limit, then ran the following script to see what heap size limit the V8 isolate got.
Ubuntu 20.04
In this version of Ubuntu, cgroup hybrid mode is enabled by default.
Create a new cgroup with a 512Mi memory limit and run the above script in this cgroup:
This indicates that the isolate was informed of the cgroup-constrained memory limit (512Mi) and hence got a ~270M heap limit.
Ubuntu 22.04
In this version of Ubuntu, cgroup v2 is used.
Run the above script using systemd-run:

Again, the isolate properly got a ~270M heap limit.
Note that it should have had a bigger heap limit if the entire system memory, i.e. 1Gi, had been passed to V8. In fact, if we run the same script outside the cgroup, it does display a larger heap_size_limit, like below: