libct: setns process recreate cmd object before calling the first start #5066
Hmm, the code to create a copy of the cmd is kinda fragile, and I guess what Alan suggested is a better way, i.e.
One more alternative is to check whether CLONE_INTO_CGROUP is available, but I'm afraid the check is going to be too expensive, and caching its result is not very reliable either. |
Force-pushed from 070b6ed to 4cdc0d6.
Hello! Something like 4cdc0d6? I have made this a Container method. |
|
@everzakov Have you verified whether this issue actually affects
In fact, runc doesn’t call
Regarding the Go PR (728642), I think the change might be overly strict for users and could potentially be a breaking change. |
|
In fact, I changed the code to:
p.cmd.SysProcAttr.CgroupFD = 100 //int(fd.Fd())
lifubang@acmcoder /opt/ubuntu $ sudo ./runc exec test echo hello
hello
lifubang@acmcoder /opt/ubuntu $ sudo ./runc --debug exec test echo hello
...
DEBU[0000]libcontainer/process_linux.go:359 libcontainer.(*setnsProcess).prepareCgroupFD() using CLONE_INTO_CGROUP "/sys/fs/cgroup/user.slice/user-1000.slice/test"
... libcontainer.(*setnsProcess).startWithCgroupFD() exec with CLONE_INTO_CGROUP failed: fork/exec /proc/self/fd/6: bad file descriptor; retrying without
...
hello |
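For readers who want to reproduce the failing first attempt outside runc, here is a minimal Go sketch. It is not runc's code: `startIntoCgroup` is a hypothetical helper, and the descriptor -1 is an assumption used as a stand-in for the hard-coded 100 in the experiment above, chosen because it can never be a valid descriptor. It only relies on the `UseCgroupFD` and `CgroupFD` fields of `syscall.SysProcAttr` (Linux only, Go 1.20 or newer).

```go
//go:build linux

package main

import (
	"fmt"
	"os/exec"
	"syscall"
)

// startIntoCgroup (hypothetical helper) starts "echo hello" and asks the
// kernel to place the child into the cgroup referred to by cgroupFD,
// i.e. clone3 with CLONE_INTO_CGROUP.
func startIntoCgroup(cgroupFD int) error {
	cmd := exec.Command("echo", "hello")
	cmd.SysProcAttr = &syscall.SysProcAttr{
		UseCgroupFD: true,
		CgroupFD:    cgroupFD,
	}
	if err := cmd.Start(); err != nil {
		return err
	}
	return cmd.Wait()
}

func main() {
	// Assumption: -1 is never a valid descriptor, so Start should fail
	// before any child process is created.
	if err := startIntoCgroup(-1); err != nil {
		fmt.Println("start with bogus CgroupFD failed:", err)
	}
}
```

The exact error text depends on the kernel and on the descriptor used, so none is shown here.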
Actually, there are some cmd.Wait() calls: https://github.com/opencontainers/runc/blob/v1.4.0/libcontainer/process_linux.go#L152 and https://github.com/opencontainers/runc/blob/v1.4.0/libcontainer/process_linux.go#L524 .
Yeah, the main obstacle to catching this fault at runtime is that the errors from all cmd.Wait calls are ignored. Also, the call is often preceded by killing the process. So this error can only be found in tests. You can try running the TestEnter test with your change; the error should be the same as in the issue.
I have proposed another Go PR, 728660. With it, it will be possible to reuse the same os/exec Cmd object if the first attempt was unsuccessful, and Cmd.Wait will not return the error. However, I think 728642 will be merged. |
|
You're right, we shouldn’t reuse the
Just for discussion: a simple fix would be to move the retry logic into |
|
Additionally, adding a test case to ensure the patch works correctly would be appreciated. |
Signed-off-by: Efim Verzakov <[email protected]>
Force-pushed from 4cdc0d6 to 1df1ec2.
Hello! Well, I'm not very familiar with the cgroups library. I think the test should pass a non-cgroup path so that the check fails in the kernel. However, this path must exist. There are a lot of funcs that clean paths and check prefixes. In Create we pass nil paths to the cgroup manager, so the default /sys/fs/cgroup path will be used. I think it is impossible to create a non-cgroup child directory in the /sys/fs/cgroup tree. But we can use the factory Load method to get a container. Cgroup paths are passed in that method, so we can set our non-cgroup path there. So the idea is to patch the dumped container state, load it with the changed cgroup paths, and turn off the function that checks the directory's cgroup magic. The first call will fail because we pass a non-cgroup dir. The second call doesn't return an error, and the setns pid will be added successfully to the "cgroup". The test logs: |
Currently there is a Go issue, golang/go#76746.
If cmd.Start is called twice (after the first call failed), then
Cmd.Wait can return an error (because the goroutines' pipes were closed in the first attempt),
even though the second cmd.Start can return nil.
Once https://go-review.googlesource.com/c/go/+/728642 is merged,
the second call will return an error. So we need to recreate the Cmd object.
We cannot simply copy it; see
https://go-review.googlesource.com/c/go/+/728642/comments/a837fe92_a5fc5f06 .
Fixes #5060