Fix healthcheck failing silently with `--transient-store` by Honny1 · Pull Request #28498 · podman-container-tools/podman

Honny1 · 2026-04-13T16:35:35Z

The systemd timer created for health checks did not pass --transient-store to the podman subprocess, causing it to look up the container in the default store instead of the volatile one.

Fixes: #28483

Checklist

Ensure you have completed the following checklist for your pull request to be reviewed:

- Certify you wrote the patch or otherwise have the right to pass it on as an open-source patch by signing all commits (`git commit -s`). If needed, use `git commit -s --amend`. The author email must match the sign-off email address. See CONTRIBUTING.md for more information.
- Referenced issues using `Fixes: #00000` in commit message (if applicable)
- Tests have been added/updated (or no tests are needed)
- Documentation has been updated (or no documentation changes are needed)
- All commits pass `make validatepr` (format/lint checks)
- Release note entered in the section below (or `None` if no user-facing changes)

Does this PR introduce a user-facing change?

Fixed health checks silently failing for containers started with `--transient-store`

packit-as-a-service · 2026-04-13T16:38:49Z

[NON-BLOCKING] Packit jobs failed. @containers/packit-build please check. Everyone else, feel free to ignore.

Luap99

Hmm, while this is correct, if I look further, aren't most of the arguments wrong? See CreateExitCommandArgs() for a proper list of the arguments we must pass through.

So I think it would be worth the effort to consolidate that further and share the same code.

Honny1 · 2026-04-13T17:39:52Z

> Hmm, while this is correct, if I look further, aren't most of the arguments wrong? See CreateExitCommandArgs() for a proper list of the arguments we must pass through.
>
> So I think it would be worth the effort to consolidate that further and share the same code.

Sure

mheon · 2026-04-13T19:14:00Z

I agree with the comment on consolidation. For what it's worth, the test LGTM, though it sucks to add more waits taking up time in the tests.

Luap99 · 2026-04-14T11:39:48Z

+    run_podman --transient-store inspect $ctr --format "{{.State.Health.Status}} {{.State.Health.FailingStreak}}"
+    assert "$output" == "healthy 0" "health status and failing streak"
+
+    run_podman --transient-store rm -f -t0 $ctr


Just one problem I noticed: if the test case fails, the container is leaked, and the regular teardown has no way to know it exists because of the --transient-store option.

I don't really want to add --transient-store handling to the general teardown, as it would slow things down a lot if we had to double all the commands there...

I guess the best I can think of would be to move this to a 221-healthcheck-transient.bats and define a custom teardown there that has the right --transient-store call to remove the container even on errors.

Luap99 · 2026-04-14T11:44:44Z

not ok 222 |220| podman healthcheck --transient-store in 974ms
         # tags: ci:parallel
         # (from function `bail-now' in file test/system/[helpers.bash, line 230](https://github.com/containers/podman/blob/efb2e0d9b521033b0f5d468245a6d7ef8238d76a/test/system/helpers.bash#L230),
         #  from function `die' in file test/system/[helpers.bash, line 967](https://github.com/containers/podman/blob/efb2e0d9b521033b0f5d468245a6d7ef8238d76a/test/system/helpers.bash#L967),
         #  from function `run_podman' in file test/system/[helpers.bash, line 608](https://github.com/containers/podman/blob/efb2e0d9b521033b0f5d468245a6d7ef8238d76a/test/system/helpers.bash#L608),
         #  in test file test/system/[220-healthcheck.bats, line 526](https://github.com/containers/podman/blob/efb2e0d9b521033b0f5d468245a6d7ef8238d76a/test/system/220-healthcheck.bats#L526))
         #   `run_podman run -d --name $ctr --transient-store \' failed
         #
<+     > # # podman  run -d --name c-h-t222-ksrgjc1q --transient-store --health-cmd /home/podman/healthcheck --health-interval 1s --health-retries 3 quay.io/libpod/testimage:20241011 /home/podman/pause
<+747ms> # Error: crun: executable file `/home/podman/pause` not found: No such file or directory: OCI runtime attempted to invoke a command that was not found
<+005ms> # [ rc=127 (** EXPECTED 0 **) ]
         # #/vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
         # #| FAIL: exit code is 127; expected 0
         # #\^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
         # # [teardown]

The test is also flaking a lot, so this cannot be merged like that. The image of course has the file, so I suspect it is a race condition around the mount point somehow? Maybe it is best to put it in an extra file so that it is not marked as parallel safe.

The systemd timer created for health checks did not pass global podman flags to the subprocess, causing it to use default storage settings instead of matching the parent process. This is most visible with --transient-store, where the healthcheck looks up the container in the default store instead of the volatile one.

Extract GlobalPodmanArgs() from CreateExitCommandArgs so both the exit command and healthcheck timer share the same set of global flags (--root, --runroot, --transient-store, --storage-driver, etc.).

Fixes: podman-container-tools#28483

Signed-off-by: Jan Rodák <hony.com@seznam.cz>

Luap99

LGTM

Honny1 · 2026-04-14T13:07:17Z

> The test is also flaking a lot, so this cannot be merged like that. The image of course has the file, so I suspect it is a race condition around the mount point somehow? Maybe it is best to put it in an extra file so that it is not marked as parallel safe.

It seems that helped. Thanks, this would have taken me some time. I restarted only the failed Docker-py Compat job. The Packit f44 jobs seem to be unrelated.

Honny1 · 2026-04-15T12:26:38Z

PTAL @containers/podman-maintainers

ashley-cui · 2026-04-15T13:59:09Z

LGTM

Honny1 marked this pull request as ready for review April 13, 2026 16:41

Luap99 reviewed Apr 13, 2026

Honny1 force-pushed the hc-transient-store branch from 6788638 to efb2e0d April 14, 2026 09:12

Luap99 reviewed Apr 14, 2026

Honny1 force-pushed the hc-transient-store branch from efb2e0d to 9598b30 April 14, 2026 12:15

Luap99 approved these changes Apr 14, 2026

Luap99 merged commit e4776a2 into podman-container-tools:main Apr 15, 2026
80 of 83 checks passed

Luap99 mentioned this pull request May 21, 2026

Healthcheck fails/hangs if containers started through a different storage configuration #28750

Open