Return usernsmode=private for autons from inspect #27998

Draft
vmsh0 wants to merge 1 commit into podman-container-tools:main from vmsh0:usernsmode_private_for_autons

@vmsh0 vmsh0 commented Jan 30, 2026

Contributor

When a container is created (internal state configured) using podman create or the /create API endpoint, and the container is configured with userns=auto, the Spec structure is not populated with the future user namespace, as the auto case is not handled in specgen.SetupUserNS.

Instead, auto user ns is handled while adding shared namespaces when actual OCI initialization happens.

This patch introduces an additional check in the inspect operation, which will return the value 'private' for 'usernsmode' in this situation, to reflect that when the container is started it will have a private user namespace.
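A minimal Go sketch of the kind of fallback described above, not the actual patch. The `ctrView` type, its fields, and the `usernsMode` function are hypothetical stand-ins for illustration, and the empty-string default for "no user namespace" is an assumption.

```go
package main

import "fmt"

// ctrView is a hypothetical, simplified view of what inspect can see for a
// container; it is not a libpod type.
type ctrView struct {
	specNamespaces []string // namespace types present in the OCI spec, e.g. "user"
	autoUserNS     bool     // true if the container was created with userns=auto
}

// usernsMode sketches the value reported for 'usernsmode'.
func usernsMode(c ctrView) string {
	// A user namespace already written to the spec is reported as "private".
	for _, ns := range c.specNamespaces {
		if ns == "user" {
			return "private"
		}
	}
	// Additional check: with userns=auto the spec has no user namespace
	// until the container starts, so report "private" up front.
	if c.autoUserNS {
		return "private"
	}
	// Assumption: no user namespace configured is reported as an empty string.
	return ""
}

func main() {
	// A created (not yet started) container configured with userns=auto.
	created := ctrView{specNamespaces: []string{"mount", "pid"}, autoUserNS: true}
	fmt.Printf("%q\n", usernsMode(created))
}
```

The point of the sketch is only the ordering: the spec is consulted first, and the auto flag is a fallback for the window between create and start.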

Please let me know if this change makes sense to you. Needs some additional thoughts for testing & documentation.

Checklist

Ensure you have completed the following checklist for your pull request to be reviewed:

  • Certify you wrote the patch or otherwise have the right to pass it on as an open-source patch by signing all commits (git commit -s). If needed, use git commit -s --amend. The author email must match the sign-off email address. See CONTRIBUTING.md for more information.
  • Referenced issues using Fixes: #00000 in commit message (if applicable)
  • Tests have been added/updated (or no tests are needed)
  • Documentation has been updated (or no documentation changes are needed)
  • All commits pass make validatepr (format/lint checks)
  • Release note entered in the section below (or None if no user-facing changes)

Does this PR introduce a user-facing change?

return usernsmode=private when inspecting configured (podman create) containers with userns=auto

@vmsh0 vmsh0 force-pushed the usernsmode_private_for_autons branch from 9672bf9 to 41437b4 January 30, 2026 17:56
@packit-as-a-service

[NON-BLOCKING] Packit jobs failed. @containers/packit-build please check. Everyone else, feel free to ignore.

@TomSweeneyRedHat

Contributor

The change LGTM (less docs and tests), but would lean heavily on @mheon's or @Luap99's thoughts.

@vmsh0

vmsh0 commented Feb 13, 2026

Contributor Author

Don't mean to rush anyone, just wondering whether any further input is required from me to bring this forward

@giuseppe giuseppe left a comment

Contributor

LGTM

@giuseppe

Contributor

do we need a similar change for keep-id and nomap?

@vmsh0

vmsh0 commented Feb 14, 2026

Contributor Author

> do we need a similar change for keep-id and nomap?

Both keep-id and nomap result in a private user namespace being appended to the OCI Spec structure during the create operation. auto is different, in that it doesn't generate a namespace specification until the container is started. This is why auto needs this special case.

E.g.:

➜ pm create --userns=keep-id docker.io/library/busybox:latest bash
f90ce67f1a410552fdd6941def46d9eb7f843ae65e46415c4832e1ec63b19058

➜ pm inspect f9 | jq ".[0].HostConfig.UsernsMode"
"private"  # ok!

➜ pm create --userns=nomap docker.io/library/busybox:latest bash
88790dc83a49872ebbd100c38bfd777e5a06492093510743bfe39d88ceee8f0a

➜ pm inspect 88 | jq ".[0].HostConfig.UsernsMode"
"private"  # ok!

(pm is just an alias to a podman daemon attached to the debugger)

@giuseppe

Contributor

thanks, could you add a test?

@vmsh0 vmsh0 force-pushed the usernsmode_private_for_autons branch 2 times, most recently from ff22bd7 to 4c1d7bf February 16, 2026 14:26
@vmsh0

vmsh0 commented Feb 16, 2026

Contributor Author

Not sure why those two tests are failing, might be flakes? They are passing on my machine

@Honny1 Honny1 left a comment

Contributor

LGTM! I checked your test, and it passed. Can you please rebase on main?

@vmsh0 vmsh0 force-pushed the usernsmode_private_for_autons branch from 4c1d7bf to 65668af February 19, 2026 13:27
@github-actions github-actions bot added the machine, kind/api-change, and governance labels Feb 19, 2026
@mheon

mheon commented Feb 19, 2026

Contributor

I think you messed up your rebase a bit, got a lot of extra commits in.

Changes LGTM for reference.

@vmsh0

vmsh0 commented Feb 19, 2026

Contributor Author

It looks like I did, let me check what happened :)

@vmsh0 vmsh0 force-pushed the usernsmode_private_for_autons branch from 65668af to 75affa5 February 19, 2026 13:43
@vmsh0

vmsh0 commented Feb 19, 2026

Contributor Author

Ok, fixed it. I think some tags got added to the PR as a result of my mistake, I'm guessing they should be removed. Sorry!

@Honny1 Honny1 removed the machine, kind/api-change, and governance labels Feb 19, 2026
@Honny1

Honny1 commented Feb 19, 2026

Contributor

I was able to track the source of the error down to the go-systemd library. The GetHandle function tries to get a handle to a shared library (.so) by attempting to access the names specified in libs, returning the first one that successfully opens. Is it possible that the APIv2 test image is missing something? @lsm5
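The lookup behaviour described above (try each candidate name, return the first that opens) can be sketched in Go as follows. This is an illustration under stated assumptions, not go-systemd's code: `firstHandle`, `openFunc`, and the library names are hypothetical, and the opener is injected so the sketch runs without cgo or a real dlopen.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoHandle reuses the message seen in the failing API tests.
var errNoHandle = errors.New("unable to open a handle to the library")

// openFunc stands in for dlopen so the sketch runs without cgo (hypothetical).
type openFunc func(name string) (handle string, ok bool)

// firstHandle tries each candidate name in order and returns the first
// handle that opens; it fails only if none of the names can be opened.
func firstHandle(libs []string, open openFunc) (string, error) {
	for _, name := range libs {
		if h, ok := open(name); ok {
			return h, nil
		}
	}
	return "", errNoHandle
}

func main() {
	// Fake opener: only one of the candidate names is "installed".
	installed := map[string]bool{"libexample.so.0": true}
	open := func(name string) (string, bool) {
		return "handle:" + name, installed[name]
	}

	h, err := firstHandle([]string{"libexample.so.1", "libexample.so.0"}, open)
	fmt.Println(h, err)

	_, err = firstHandle([]string{"libmissing.so"}, open)
	fmt.Println(err)
}
```

Under this model the error only appears when every candidate name fails to open, which is why a missing library in the test image is a plausible suspect.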

@vmsh0

vmsh0 commented Feb 25, 2026

Contributor Author

Let me know when/if you need another rebase :)

@mheon

mheon commented Feb 25, 2026

Contributor

I think the APIv2 failures are flakes, restarted. I'll enable auto-merge.

@mheon mheon enabled auto-merge February 25, 2026 16:06
@Honny1

Honny1 commented Mar 11, 2026

Contributor

Rerun failed tests.

auto-merge was automatically disabled March 14, 2026 15:11

Head branch was pushed to by a user without write access

@vmsh0

vmsh0 commented Apr 30, 2026

Contributor Author

> I am not sure about the failure: https://cirrus-ci.com/task/4799819752931328, but it appears unrelated.
>
> PTAL @containers/podman-maintainers @mheon @timcoding1988 @lsm5

That failure has been there since day 1 of this PR. It does look unrelated to me as well, but the fact is that when I pushed the dummy commit earlier today the API tests passed. On the other hand, I really don't see how my changes are affecting those tests. Could it be my new tests in 20-containers.at that leave behind some kind of bad state, as opposed to the change itself?

It's especially unclear to me how my change could cause the library to fail to load. The error in the API tests is "unable to open a handle to the library".

@Honny1

Honny1 commented Apr 30, 2026

Contributor

@vmsh0 I agree. Let's wait for what other maintainers think.

@mtrmac

mtrmac commented Apr 30, 2026

Contributor

unable to open a handle to the library (should be improved to include the name of the library and) seems to refer to a dlopen failing to find a .so library, and that is somewhere in go-systemd.

I agree that it seems unrelated to this PR just at a first glance, but I don’t think that’s the primary thing to consider here. We should not be in the habit of ignoring deterministic failures (Ideally we wouldn’t even accept existence of flakes, but that’s a higher bar).

  • If we have universally broken CI, that’s a high-priority situation that needs to be fixed; we should never get into the habit of ignoring test failures.
  • … but, looking at recent commit history, and recent open PRs, the test seems to be passing there if I’m checking correctly?! That would indicate that there is indeed something about this PR that does cause it, or at least exposes a pre-existing failure.

@vmsh0

vmsh0 commented Apr 30, 2026

Contributor Author

> unable to open a handle to the library (should be improved to include the name of the library and) seems to refer to a dlopen failing to find a .so library, and that is somewhere in go-systemd.
>
> I agree that it seems unrelated to this PR just at a first glance, but I don’t think that’s the primary thing to consider here. We should not be in the habit of ignoring deterministic failures (Ideally we wouldn’t even accept existence of flakes, but that’s a higher bar).
>
>   • If we have universally broken CI, that’s a high-priority situation that needs to be fixed; we should never get into the habit of ignoring test failures.
>   • … but, looking at recent commit history, and recent open PRs, the test seems to be passing there if I’m checking correctly?! That would indicate that there is indeed something about this PR that does cause it, or at least exposes a pre-existing failure.

What's interesting is that we have the same in #27988, which is a different (but somewhat related) change. Can I play around with the tests? E.g. I would like to force-push a version of this change without the additional tests.

@mtrmac

mtrmac commented Apr 30, 2026

Contributor

Sure, adding code to help diagnose it would be welcome. Just mark the PR as draft first, please.

@vmsh0 vmsh0 marked this pull request as draft April 30, 2026 16:32
@vmsh0 vmsh0 force-pushed the usernsmode_private_for_autons branch from 7302dcf to 4ba2629 April 30, 2026 16:49
@packit-as-a-service

Cockpit tests failed for commit de48be9. @martinpitt, @jelly, @mvollmer please check.

@packit-as-a-service

Cockpit tests failed for commit 6949f84. @martinpitt, @jelly, @mvollmer please check.

@packit-as-a-service

[NON-BLOCKING] Packit jobs failed. @containers/packit-build please check. Everyone else, feel free to ignore.

@packit-as-a-service

Cockpit tests failed for commit 14904a8. @martinpitt, @jelly, @mvollmer please check.

@packit-as-a-service

[NON-BLOCKING] Packit jobs failed. @containers/packit-build please check. Everyone else, feel free to ignore.

@vmsh0 vmsh0 force-pushed the usernsmode_private_for_autons branch from 14904a8 to c04f7e0 May 1, 2026 11:40
@packit-as-a-service

Cockpit tests failed for commit c04f7e0. @martinpitt, @jelly, @mvollmer please check.

@packit-as-a-service

[NON-BLOCKING] Packit jobs failed. @containers/packit-build please check. Everyone else, feel free to ignore.

@vmsh0

vmsh0 commented May 1, 2026

Contributor Author

> Sure, adding code to help diagnose it would be welcome. Just mark the PR as draft first, please.

My testing shows that

  1. APIv2 test group 26 fails only when I introduce the additional test in group 20 -- the create operation in 20 is sufficient to trigger the failure in 26
  2. changing the test introduced in group 20 to a simple create, without autons, results in 26 passing
  3. if I remove the additional test completely from the commit, and only introduce the code change, 26 passes

At first glance, these findings mean that the test setting up the rootless container in group 20 permanently changes something in the testing environment. Perhaps a bug was uncovered here. I will try to investigate by reproducing the same environment in my lab and testing manually, as using your CI to run these experiments doesn't scale very well.

Any ideas of what we might be looking at?

@packit-as-a-service

Cockpit tests failed for commit d262b7c. @martinpitt, @jelly, @mvollmer please check.

@mheon

mheon commented May 1, 2026

Contributor

Does this reproduce with the Podman CLI, instead of the API?

@packit-as-a-service

[NON-BLOCKING] Packit jobs failed. @containers/packit-build please check. Everyone else, feel free to ignore.

@vmsh0

vmsh0 commented May 1, 2026

Contributor Author

Tried to repro on my system with:

IMAGE=quay.io/libpod/testimage:20241011
CTR="WaitTestingCtr"

podman create --userns=auto $IMAGE

podman rm -a -f &>/dev/null
podman create --name "${CTR}" "${IMAGE}" sh -c "exit 3"
cid=$(podman inspect --format '{{.Id}}' "${CTR}")

podman wait --condition stopped "${CTR}"

(sleep 1; podman start "${CTR}") & child_pid=$!
sleep 2
podman wait --condition exited "${CTR}"
echo "Wait returned: $?"
echo "Container exit code: $(podman inspect --format '{{.State.ExitCode}}' "${CTR}")"

wait "${child_pid}"

But it works just fine.

Might not be very accurate tho, as the command line doesn't expose the API 1:1. I'll try to reproduce this on a clean system similar to the CI rig next week.

@Honny1

Honny1 commented May 12, 2026

Contributor

@vmsh0, do you have an opportunity to test that?

@vmsh0

vmsh0 commented Jun 10, 2026

Contributor Author

> @vmsh0, do you have an opportunity to test that?

Sorry for the delay, busy period. I made a couple more attempts but I haven't figured out anything useful yet.

@Honny1

Honny1 commented Jun 10, 2026

Contributor

> > @vmsh0, do you have an opportunity to test that?
>
> Sorry for the delay, busy period. I made a couple more attempts but I haven't figured out anything useful yet.

Thanks, can you rebase on main? We have a new CI, maybe it will be solved.

@vmsh0 vmsh0 force-pushed the usernsmode_private_for_autons branch from d262b7c to 3d5ba71 June 10, 2026 10:44
@vmsh0

vmsh0 commented Jun 10, 2026

Contributor Author

> > > @vmsh0, do you have an opportunity to test that?
> >
> > Sorry for the delay, busy period. I made a couple more attempts but I haven't figured out anything useful yet.
>
> Thanks, can you rebase on main? We have a new CI, maybe it will be solved.

I think the new failure we have (static check failing) is unrelated to my changes, I see the exact same failure in other recent PRs.

@Honny1

Honny1 commented Jun 10, 2026

Contributor

Yes, I reran the tests.

@Honny1 Honny1 left a comment

Contributor

LGTM. Failure seems to be unrelated. Can you try rebasing on upstream main to resolve that? I spotted lint false positives for the last two days. It is strange. I will need to investigate this.

PTAL @podman-container-tools/podman-reviewers @podman-container-tools/podman-maintainers

Signed-off-by: Riccardo Paolo Bestetti <pbl@bestov.io>
@vmsh0 vmsh0 force-pushed the usernsmode_private_for_autons branch from 3d5ba71 to ea942ca June 10, 2026 16:55
@vmsh0

vmsh0 commented Jun 11, 2026

Contributor Author

> LGTM. Failure seems to be unrelated. Can you try rebasing on upstream main to resolve that? I spotted lint false positives for the last two days. It is strange. I will need to investigate this.
>
> PTAL @podman-container-tools/podman-reviewers @podman-container-tools/podman-maintainers

Please let me know when to rebase again
