Return usernsmode=private for autons from inspect#27998

Draft
vmsh0 wants to merge 3 commits into containers:main from vmsh0:usernsmode_private_for_autons

Conversation

@vmsh0
Contributor

@vmsh0 vmsh0 commented Jan 30, 2026

When a container is created (internal state configured) using podman create or the /create API endpoint, and the container is configured with userns=auto, the Spec structure is not populated with the future user namespace, as the auto case is not handled in specgen.SetupUserNS.

Instead, the auto user namespace is handled later, while shared namespaces are added during the actual OCI initialization.

This patch introduces an additional check in the inspect operation, which will return the value 'private' for 'usernsmode' in this situation, to reflect that when the container is started it will have a private user namespace.
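
The check described above can be sketched roughly as follows. This is a minimal, self-contained model and not the actual Podman code: the `namespace` and `ctrView` types, the `usernsMode` function, and the `ns:<path>` format for a joined namespace are all assumptions made for illustration.

```go
package main

import "fmt"

// namespace is a simplified stand-in for a Linux namespace entry in an OCI spec.
type namespace struct {
	Type string // for example "user" or "network"
	Path string // empty means a new namespace is created for the container
}

// ctrView is a hypothetical, trimmed-down view of what inspect can see.
type ctrView struct {
	SpecNamespaces []namespace // namespaces recorded in the spec at create time
	AutoUserNS     bool        // true when the container was created with userns=auto
}

// usernsMode returns the value this sketch would report as UsernsMode.
func usernsMode(c ctrView) string {
	for _, ns := range c.SpecNamespaces {
		if ns.Type != "user" {
			continue
		}
		if ns.Path != "" {
			// Assumed format for joining an existing namespace.
			return "ns:" + ns.Path
		}
		return "private"
	}
	// No user namespace in the spec yet. With userns=auto one will be
	// created at start, so report "private" instead of an empty value.
	if c.AutoUserNS {
		return "private"
	}
	return ""
}

func main() {
	created := ctrView{AutoUserNS: true} // created but never started
	fmt.Printf("auto, not started: %q\n", usernsMode(created))
	fmt.Printf("no userns option:  %q\n", usernsMode(ctrView{}))
}
```

The fallback sits after the spec lookup so that a namespace already present in the spec always takes precedence over the auto flag.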

Please let me know if this change makes sense to you. Needs some additional thoughts for testing & documentation.

Checklist

Ensure you have completed the following checklist for your pull request to be reviewed:

  • Certify you wrote the patch or otherwise have the right to pass it on as an open-source patch by signing all commits (git commit -s; if needed, use git commit -s --amend). The author email must match the sign-off email address. See CONTRIBUTING.md for more information.
  • Referenced issues using Fixes: #00000 in commit message (if applicable)
  • Tests have been added/updated (or no tests are needed)
  • Documentation has been updated (or no documentation changes are needed)
  • All commits pass make validatepr (format/lint checks)
  • Release note entered in the section below (or None if no user-facing changes)

Does this PR introduce a user-facing change?

return usernsmode=private when inspecting configured (podman create) containers with userns=auto

@vmsh0 vmsh0 force-pushed the usernsmode_private_for_autons branch from 9672bf9 to 41437b4 Compare January 30, 2026 17:56
@packit-as-a-service

[NON-BLOCKING] Packit jobs failed. @containers/packit-build please check. Everyone else, feel free to ignore.

@TomSweeneyRedHat
Member

The change LGTM (less docs and tests), but I would lean heavily on @mheon's or @Luap99's thoughts.

@vmsh0
Contributor Author

vmsh0 commented Feb 13, 2026

Don't mean to rush anyone, just wondering whether any further input is required from me to bring this forward

Member

@giuseppe giuseppe left a comment

LGTM

@giuseppe
Member

do we need a similar change for keep-id and nomap?

@vmsh0
Contributor Author

vmsh0 commented Feb 14, 2026

> do we need a similar change for keep-id and nomap?

Both keep-id and nomap result in a private user namespace being appended to the OCI Spec structure during the create operation. auto is different, in that it doesn't generate a namespace specification until the container is started. This is why auto needs this special case.

E.g.:

➜ pm create --userns=keep-id docker.io/library/busybox:latest bash
f90ce67f1a410552fdd6941def46d9eb7f843ae65e46415c4832e1ec63b19058

➜ pm inspect f9 | jq ".[0].HostConfig.UsernsMode"
"private"  # ok!

➜ pm create --userns=nomap docker.io/library/busybox:latest bash
88790dc83a49872ebbd100c38bfd777e5a06492093510743bfe39d88ceee8f0a

➜ pm inspect 88 | jq ".[0].HostConfig.UsernsMode"
"private"  # ok!

(pm is just an alias to a podman daemon attached to the debugger)
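
The timing difference described in this comment can be modeled in a few lines. This is only an illustrative sketch under stated assumptions: `specAtCreate`, `specAtStart`, and `hasUser` are hypothetical helpers, and the "mount" and "pid" entries are placeholders rather than what Podman actually writes.

```go
package main

import "fmt"

// specAtCreate returns the namespace types present in a simplified spec
// right after create, for a given --userns mode.
func specAtCreate(mode string) []string {
	ns := []string{"mount", "pid"} // placeholder entries unrelated to userns
	if mode == "keep-id" || mode == "nomap" {
		ns = append(ns, "user") // appended during the create operation
	}
	return ns
}

// specAtStart adds what, in this model, is only decided at start time.
func specAtStart(mode string) []string {
	ns := specAtCreate(mode)
	if mode == "auto" {
		ns = append(ns, "user") // auto defers the namespace until start
	}
	return ns
}

// hasUser reports whether a user namespace entry is in the list.
func hasUser(ns []string) bool {
	for _, t := range ns {
		if t == "user" {
			return true
		}
	}
	return false
}

func main() {
	for _, mode := range []string{"keep-id", "nomap", "auto"} {
		fmt.Printf("%-8s userns at create=%v, at start=%v\n",
			mode, hasUser(specAtCreate(mode)), hasUser(specAtStart(mode)))
	}
}
```

In this model only auto has no user namespace entry between create and start, which is the window the inspect special case is meant to cover.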

@giuseppe
Member

thanks, could you add a test?

@vmsh0 vmsh0 force-pushed the usernsmode_private_for_autons branch 2 times, most recently from ff22bd7 to 4c1d7bf Compare February 16, 2026 14:26
@vmsh0
Contributor Author

vmsh0 commented Feb 16, 2026

Not sure why those two tests are failing, might be flakes? They are passing on my machine

Member

@Honny1 Honny1 left a comment

LGTM! I checked your test, and it passed. Can you please rebase on main?

@vmsh0 vmsh0 force-pushed the usernsmode_private_for_autons branch from 4c1d7bf to 65668af Compare February 19, 2026 13:27
@github-actions github-actions Bot added the machine, kind/api-change, and governance labels Feb 19, 2026
@mheon
Member

mheon commented Feb 19, 2026

I think you messed up your rebase a bit, got a lot of extra commits in.

Changes LGTM for reference.

@vmsh0
Contributor Author

vmsh0 commented Feb 19, 2026

It looks like I did, let me check what happened :)

@vmsh0 vmsh0 force-pushed the usernsmode_private_for_autons branch from 65668af to 75affa5 Compare February 19, 2026 13:43
@vmsh0
Contributor Author

vmsh0 commented Feb 19, 2026

Ok, fixed it. I think some tags got added to the PR as a result of my mistake, I'm guessing they should be removed. Sorry!

@Honny1 Honny1 removed the machine, kind/api-change, and governance labels Feb 19, 2026
@Honny1
Member

Honny1 commented Feb 19, 2026

I was able to track the source of the error down to the go-systemd library. The GetHandle function tries to get a handle to a shared library (.so) by attempting to access the names specified in libs, returning the first one that successfully opens. Is it possible that the APIv2 test image is missing something? @lsm5

@vmsh0
Contributor Author

vmsh0 commented Feb 25, 2026

Let me know when/if you need another rebase :)

@mheon
Member

mheon commented Feb 25, 2026

I think the APIv2 failures are flakes, restarted. I'll enable auto-merge.

@mheon mheon enabled auto-merge February 25, 2026 16:06
@Honny1
Member

Honny1 commented Mar 11, 2026

Rerun failed tests.

auto-merge was automatically disabled March 14, 2026 15:11

Head branch was pushed to by a user without write access

@vmsh0 vmsh0 force-pushed the usernsmode_private_for_autons branch 2 times, most recently from 968fd10 to d019f4b Compare April 30, 2026 10:32
@Honny1
Member

Honny1 commented Apr 30, 2026

@vmsh0 You don't need to push dummy commits. You can rerun tests from the Cirrus UI. Also, failures of Packit jobs are usually not caused by your change. Please drop the commit, and then we can merge.

@vmsh0 vmsh0 force-pushed the usernsmode_private_for_autons branch from d019f4b to d4c557b Compare April 30, 2026 12:18
@vmsh0
Contributor Author

vmsh0 commented Apr 30, 2026

> @vmsh0 You don't need to push dummy commits. You can rerun tests from the Cirrus UI. Also, failures of Packit jobs are usually not caused by your change. Please drop the commit, and then we can merge.

Sorry, I didn't realize I had the necessary permissions to rerun tests directly. Let's wait for everything to rerun once again and let's see if the API tests pass. I will try to rerun any failing tests until we see full success.

@Honny1
Member

Honny1 commented Apr 30, 2026

I am not sure about the failure: https://cirrus-ci.com/task/4799819752931328, but it appears unrelated.

PTAL @containers/podman-maintainers @mheon @timcoding1988 @lsm5

@vmsh0
Contributor Author

vmsh0 commented Apr 30, 2026

> I am not sure about the failure: https://cirrus-ci.com/task/4799819752931328, but it appears unrelated.
>
> PTAL @containers/podman-maintainers @mheon @timcoding1988 @lsm5

That failure has been there since day 1 of this PR. It does look unrelated to me as well, but the fact is that when I pushed the dummy commit earlier today the API tests passed. On the other hand, I really don't see how my changes are affecting those tests. Could it be my new tests in 20-containers.at that leave behind some kind of bad state, as opposed to the change itself?

It's especially unclear to me how my change could cause the library to fail to load. The error in the API tests is "unable to open a handle to the library".

@Honny1
Member

Honny1 commented Apr 30, 2026

@vmsh0 I agree. Let's wait for what other maintainers think.

@mtrmac
Contributor

mtrmac commented Apr 30, 2026

unable to open a handle to the library (should be improved to include the name of the library and) seems to refer to a dlopen failing to find a .so library, and that is somewhere in go-systemd.

I agree that it seems unrelated to this PR just at a first glance, but I don’t think that’s the primary thing to consider here. We should not be in the habit of ignoring deterministic failures (Ideally we wouldn’t even accept existence of flakes, but that’s a higher bar).

  • If we have universally broken CI, that’s a high-priority situation that needs to be fixed; we should never get into the habit of ignoring test failures.
  • … but, looking at recent commit history, and recent open PRs, the test seems to be passing there if I’m checking correctly?! That would indicate that there is indeed something about this PR that does cause it, or at least exposes a pre-existing failure.
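
The lookup discussed in this thread (try each candidate name and return the first one that opens) together with the suggested improvement of naming the libraries in the error can be sketched as below. This is not the go-systemd implementation: `firstHandle`, `opener`, and `fakeOpener` are hypothetical names, and a plain function stands in for the real dlopen call so the example runs without cgo. Only the error text is taken from the thread.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errNotFound carries the message quoted in this thread; the variable
// name belongs to this sketch only.
var errNotFound = errors.New("unable to open a handle to the library")

// opener abstracts the dlopen call (hypothetical, for illustration).
type opener func(name string) (handle string, err error)

// firstHandle tries each candidate in order and returns the first handle
// that opens. On failure the error lists every name that was tried.
func firstHandle(libs []string, open opener) (string, error) {
	for _, name := range libs {
		if h, err := open(name); err == nil {
			return h, nil
		}
	}
	return "", fmt.Errorf("%w (tried: %s)", errNotFound, strings.Join(libs, ", "))
}

// fakeOpener simulates a system where only the given names can be opened.
func fakeOpener(installed ...string) opener {
	have := make(map[string]bool, len(installed))
	for _, n := range installed {
		have[n] = true
	}
	return func(name string) (string, error) {
		if have[name] {
			return "handle:" + name, nil
		}
		return "", errors.New("cannot open " + name)
	}
}

func main() {
	open := fakeOpener("libexample.so.0") // made-up library name
	h, err := firstHandle([]string{"libexample.so", "libexample.so.0"}, open)
	fmt.Println(h, err)

	_, err = firstHandle([]string{"libmissing.so"}, open)
	fmt.Println(err)
}
```

Wrapping with `%w` keeps `errors.Is(err, errNotFound)` working for callers that already match on the sentinel, while the appended list shows which names were attempted.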

@vmsh0
Contributor Author

vmsh0 commented Apr 30, 2026

> unable to open a handle to the library (should be improved to include the name of the library and) seems to refer to a dlopen failing to find a .so library, and that is somewhere in go-systemd.
>
> I agree that it seems unrelated to this PR just at a first glance, but I don’t think that’s the primary thing to consider here. We should not be in the habit of ignoring deterministic failures (Ideally we wouldn’t even accept existence of flakes, but that’s a higher bar).
>
>   • If we have universally broken CI, that’s a high-priority situation that needs to be fixed; we should never get into the habit of ignoring test failures.
>   • … but, looking at recent commit history, and recent open PRs, the test seems to be passing there if I’m checking correctly?! That would indicate that there is indeed something about this PR that does cause it, or at least exposes a pre-existing failure.

What's interesting is that we have the same in #27988, which is a different (but somewhat related) change. Can I play around with the tests? E.g. I would like to force-push a version of this change without the additional tests.

@mtrmac
Contributor

mtrmac commented Apr 30, 2026

Sure, adding code to help diagnose it would be welcome. Just mark the PR as draft first, please.

@vmsh0 vmsh0 marked this pull request as draft April 30, 2026 16:32
@vmsh0 vmsh0 force-pushed the usernsmode_private_for_autons branch from 7302dcf to 4ba2629 Compare April 30, 2026 16:49
@packit-as-a-service

Cockpit tests failed for commit de48be9. @martinpitt, @jelly, @mvollmer please check.

@packit-as-a-service

Cockpit tests failed for commit 6949f84. @martinpitt, @jelly, @mvollmer please check.

@packit-as-a-service

[NON-BLOCKING] Packit jobs failed. @containers/packit-build please check. Everyone else, feel free to ignore.

@packit-as-a-service

Cockpit tests failed for commit 14904a8. @martinpitt, @jelly, @mvollmer please check.

@packit-as-a-service

[NON-BLOCKING] Packit jobs failed. @containers/packit-build please check. Everyone else, feel free to ignore.

@vmsh0 vmsh0 force-pushed the usernsmode_private_for_autons branch from 14904a8 to c04f7e0 Compare May 1, 2026 11:40
@packit-as-a-service

Cockpit tests failed for commit c04f7e0. @martinpitt, @jelly, @mvollmer please check.

@packit-as-a-service

[NON-BLOCKING] Packit jobs failed. @containers/packit-build please check. Everyone else, feel free to ignore.

@vmsh0
Contributor Author

vmsh0 commented May 1, 2026

> Sure, adding code to help diagnose it would be welcome. Just mark the PR as draft first, please.

My testing shows that

  1. APIv2 test group 26 fails only when I introduce the additional test in group 20 -- the create operation in 20 is sufficient to trigger the failure in 26
  2. changing the test introduced in group 20 to a simple create, without autons, results in 26 passing
  3. if I remove the additional test completely from the commit, and only introduce the code change, 26 passes

At first glance, these findings suggest that the test setting up the rootless container in group 20 changes something permanently in the testing environment. Perhaps a bug was uncovered here. I will try to investigate by reproducing the same environment in my lab and testing manually, as using your CI to run these experiments doesn't scale very well.

Any ideas of what we might be looking at?

@packit-as-a-service

Cockpit tests failed for commit d262b7c. @martinpitt, @jelly, @mvollmer please check.

@mheon
Member

mheon commented May 1, 2026

Does this reproduce with the Podman CLI, instead of the API?

@packit-as-a-service

[NON-BLOCKING] Packit jobs failed. @containers/packit-build please check. Everyone else, feel free to ignore.

@vmsh0
Contributor Author

vmsh0 commented May 1, 2026

Tried to repro on my system with:

IMAGE=quay.io/libpod/testimage:20241011
CTR="WaitTestingCtr"

podman create --userns=auto $IMAGE

podman rm -a -f &>/dev/null
podman create --name "${CTR}" "${IMAGE}" sh -c "exit 3"
cid=$(podman inspect --format '{{.Id}}' "${CTR}")

podman wait --condition stopped "${CTR}"

(sleep 1; podman start "${CTR}") & child_pid=$!
sleep 2
podman wait --condition exited "${CTR}"
echo "Wait returned: $?"
echo "Container exit code: $(podman inspect --format '{{.State.ExitCode}}' "${CTR}")"

wait "${child_pid}"

But it works just fine.

Might not be very accurate tho, as the command line doesn't expose the API 1:1. I'll try to reproduce this on a clean system similar to the CI rig next week.

@Honny1
Member

Honny1 commented May 12, 2026

@vmsh0, do you have an opportunity to test that?
