gcs: do not trigger container shutdown when signaling init process #2538
Conversation
pkg/securitypolicy/framework.rego (outdated diff):
signal_ok(signals) {
    count(signals) == 0
    input.isInitProcess
    input.signal in [9, 15]
I think this is up for debate.
Of course 😄. It's just that when running tests I realized we don't have any "default" kill signals, and my change broke stopping containers/pods.
We can have an api_version-based check: if the policy is of an old version, we do this default allow (if it's a SIGTERM/SIGKILL, matching the old behaviour); otherwise, for a new policy, we expect the tooling to put in the correct allowed signals list.
internal/guest/runtime/hcsv2/uvm.go (outdated diff):
return err
}

// Don't allow signalProcessV2 to route around container shutdown policy.
Maybe I am missing something, but I don't think this path gates anything. The only difference EnforceShutdownContainerPolicy passing or failing makes is the error message: if it passes it will not do a shutdown, and if it fails it will not do a shutdown either.
I will claim that if the signal is allowed by the user and that signal is going to cause a shutdown, then they are fine with the container being shut down. Maybe we could treat SIGKILL as a shutdown. TBD with some experiments.
I agree. The original code didn't quite make sense to me either, since the shutdown policy doesn't enforce much anyway: it just removes the container from the metadata store. Are you suggesting to just not call the shutdown policy here, and simply return on L695?
Tested with signals_test.cpp: with this fix, a child process of the init process no longer gets a SIGTERM, so this fix is good. Under the old behaviour, the child got SIGTERM and was killed.
Force-pushed from a048664 to 8ffbc6f.
pkg/securitypolicy/framework.rego (outdated diff):
signal_ok(signals) {
    count(signals) == 0
    input.isInitProcess
    semver.compare(data.api.version, "0.11.0") < 0
data.api.version would always be the version shipped with the GCS, I think; the policy API version would be policy_api_version.
However, that still doesn't account for fragments having a different version. Unfortunately, by the time we evaluate the rules we wouldn't know which fragment the container came from (and thus whether it is "old" or not). I think the way to do this is probably to update check_container so that, instead of using raw_container.signals straightaway, it delegates to another rule that adds the two "default" signals into the array if the framework_version (rather than the api_version) is old (fragments have no api_version, only framework_version; the main policy has both). A sketch of the intended semantics follows below.
Also, since we're not bumping the version in this PR, this probably needs to be a <= instead of a <.
@matajoh do you agree?
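To make the proposed delegation concrete, here is a minimal Go rendition of the intended semantics. This is a sketch only: the real change would live in framework.rego, the function and package names are hypothetical, and the "0.4.1" cut-off is taken from the commit message later in this thread.

```go
package policysketch

import "golang.org/x/mod/semver"

// defaultInitSignals are the signals that old tooling implicitly relied on:
// SIGKILL (9) and SIGTERM (15).
var defaultInitSignals = []int{9, 15}

// effectiveSignals sketches the rule that check_container would delegate to:
// instead of reading raw_container.signals directly, inject the two default
// signals when the policy (or fragment) was generated with an old
// framework_version and left the signals list empty.
func effectiveSignals(rawSignals []int, isInitProcess bool, frameworkVersion string) []int {
	// golang.org/x/mod/semver expects a leading "v" on version strings.
	isOldPolicy := semver.Compare("v"+frameworkVersion, "v0.4.1") <= 0
	if isOldPolicy && isInitProcess && len(rawSignals) == 0 {
		return defaultInitSignals
	}
	return rawSignals
}
```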
After discussion with @KenGordon and @matajoh, I've made the suggested implementation:
micromaomao@5309018
There is also another commit to fix a bug in the error reasoning:
micromaomao@63f144a
I also tested that the behaviour of signaling an exec process is not changed. This can be triggered with crictl exec -s -t <timeout>:
PS C:\lcow_signals> crictl exec --timeout 1 -s 3873bfc939e24 /bin/sleep infinity
time="2025-10-15T13:52:29Z" level=fatal msg="execing command in container synchronously: rpc error: code = Unknown desc = failed to exec in container: failed to kill exec \"91d6297211fcf000d755a94cda2c7c2e7b86d46ccb0f26cf062038384f902eee\": guest RPC failure: policyDecision< {"decision":"deny","input":{"argList":["/bin/sleep","infinity"],"containerID":"3873bfc939e2415892b5b74a7b1dbade0f7222e266df43df85968ddda59be56e","isInitProcess":false,"rule":"signal_container_process","signal":9},"reason":{"errors":["target isn't allowed to receive the signal"]}} >policyDecision: unknown"
(denial message base64 decoded)
We will need to update the tooling to bump the framework_version and add the default signals to generated policies (only for the container, not for exec processes; trying to kill an exec process is currently already denied unless the policy allows it).
@anmaxvl Please feel free to review and take these 2 changes :)
They make sense; feel free to author them in a separate PR. I removed the rego changes to implement the immediate fix.
With this change I think we still have two (or three?) ways to kill a specific, individual container: via signalProcess, killContainerV2, or shutdownContainerV2. The latter two don't check the signal policy (I could not find a host-side command that sends kill/shutdownContainerV2, but in principle the host could send them).
… in old policies

We used to allow SIGTERM/SIGKILL to be sent to the container init process even if the container's signals list is empty, due to a bug fixed in microsoft#2538. However, because our tooling has been generating policies with an empty signals list, we need to special-case this for old policies to maintain backwards compatibility.

Update framework.rego to have SIGTERM and SIGKILL as default kill signals for the init process for framework API versions "0.4.1" and below. Newer policies must explicitly have these signals present, otherwise sending the signal will be denied.

Signed-off-by: Tingmao Wang <[email protected]>
Co-authored-by: Maksim An <[email protected]>
Good now for the non-confidential case. Tingmao has a PR coming for the confidential case.
We need to remove the "all" functionality from Kill; it is not supported in runc and we should not support it either.
When implementing the signal container process enforcement policy we introduced a bug where, instead of signalling just the container init process, we ended up sending signals (SIGTERM or SIGKILL) to all processes running inside a container (by invoking runc kill --all). This results in unpleasant behavior where the init process could be handling (e.g. ignoring) SIGTERM while other processes inside the container don't. With this change, container.Kill no longer sends signals to all container processes.

This PR changes the order in which the signal container policy is enforced (see the sketch below):
- always call EnforceSignalContainerProcessPolicy before sending any signals. Otherwise this looks like a bug, since we would never call EnforceSignalContainerProcessPolicy with signalingInitProcess == true for SIGTERM and SIGKILL, potentially bypassing policies which do not allow SIGTERM or SIGKILL to be sent to the init process.
- no longer call ShutdownContainer; instead, revert back to calling process.Kill.

Signed-off-by: Maksim An <[email protected]>
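The fixed ordering, as a hedged sketch (the interface names here are hypothetical stand-ins for the real hcsshim types; only the call order is the point):

```go
package hcsv2sketch

import (
	"context"
	"syscall"
)

// signalEnforcer and initProcess stand in for the real hcsshim interfaces.
type signalEnforcer interface {
	EnforceSignalContainerProcessPolicy(containerID string, signal syscall.Signal, isInitProcess bool) error
}

type initProcess interface {
	Kill(ctx context.Context, signal syscall.Signal) error
}

// signalInitProcess enforces the signal policy before anything is sent, then
// signals only the init process (no ShutdownContainer, no runc kill --all).
func signalInitProcess(ctx context.Context, id string, enforcer signalEnforcer, init initProcess, signal syscall.Signal) error {
	if err := enforcer.EnforceSignalContainerProcessPolicy(id, signal, true); err != nil {
		// A policy that forbids SIGTERM/SIGKILL to the init process now
		// denies the request instead of being routed around.
		return err
	}
	return init.Kill(ctx, signal)
}
```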
We spoke offline. This change is OK for now. We should do a follow-up change where we modify the Kill opts on the Task/Cow interfaces to forward the "all" intent to the guest over the bridge, and use that to determine whether to set the --all flag in the runc case, along the lines of the sketch below.
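A hypothetical shape for that follow-up (the KillOpts name and fields are assumptions, not existing hcsshim API):

```go
package cowsketch

import "syscall"

// KillOpts sketches extended kill options for the Task/Cow interfaces.
// The host sets All to say "signal every process in the container"; the
// intent is forwarded over the bridge, and the guest maps it to runc's
// --all flag instead of inferring it from the signal value.
type KillOpts struct {
	Signal syscall.Signal
	All    bool
}
```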