Fix podman network backend fallback on RHEL 9 (HMS-8783) #1365

Draft
thozza wants to merge 7 commits into osbuild:main from thozza:el9-podman-network_backend-fix

@thozza
Member

@thozza thozza commented Mar 27, 2025

RHEL 9 Podman falls back to the legacy 'cni' network backend when it finds pre-existing container images in storage, interpreting them as a migration from an older version. Since the 'cni' packages are not installed by default, this silently breaks networking for images with embedded containers. Fix this by writing a defaultNetworkBackend file into container storage during image build, forcing Podman to use 'netavark'. Add host-config checks to detect this inconsistency and verify container embedding at boot time.

Architectural Changes

Introduce a container.NetworkBackend type and a GenDefaultNetworkBackendFile helper in pkg/container to generate the /var/lib/containers/storage/defaultNetworkBackend file. This file is a Podman-native mechanism that overrides backend auto-detection. The helper accepts a custom storage path to support OSTree images, which relocate container storage to /usr/share/containers/storage. The new PodmanDefaultNetBackend field on ImageConfig drives this behavior declaratively from the distro definition YAML.
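The helper described above could look roughly like the following. This is a minimal sketch, not the code from this PR: the lowercase names `genDefaultNetworkBackendFile` and `backendFile` are hypothetical stand-ins, and the sketch assumes the override file holds nothing but the backend name.

```go
package main

import (
	"fmt"
	"path"
)

// NetworkBackend names a Podman network backend.
type NetworkBackend string

const (
	NetworkBackendCNI      NetworkBackend = "cni"
	NetworkBackendNetavark NetworkBackend = "netavark"
)

// defaultStoragePath is the default rootful container storage location.
const defaultStoragePath = "/var/lib/containers/storage"

// backendFile is a hypothetical stand-in for whatever file type the
// real helper returns; it only carries a path and the file content.
type backendFile struct {
	Path string
	Data []byte
}

// genDefaultNetworkBackendFile builds the defaultNetworkBackend file for
// the given storage path. An empty storagePath selects the default
// location; OSTree images would pass their relocated storage path.
// Assumption: the file content is just the backend name.
func genDefaultNetworkBackendFile(storagePath string, backend NetworkBackend) (*backendFile, error) {
	switch backend {
	case NetworkBackendCNI, NetworkBackendNetavark:
		// known backend, continue
	default:
		return nil, fmt.Errorf("unknown network backend: %q", backend)
	}
	if storagePath == "" {
		storagePath = defaultStoragePath
	}
	return &backendFile{
		Path: path.Join(storagePath, "defaultNetworkBackend"),
		Data: []byte(backend),
	}, nil
}
```

Calling the sketch with `"/usr/share/containers/storage"` as the storage path covers the OSTree case mentioned above; the `path` package is used instead of `path/filepath` because the result is a path inside the image, not on the build host.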

Key Changes

  • Add GenDefaultNetworkBackendFile in pkg/container/podman.go to produce the backend override file, supporting both default and relocated storage paths
  • Add PodmanDefaultNetBackend option to ImageConfig and wire it into osCustomizations, gated on both the option being set and containers being present
  • Set podman_default_net_backend: "netavark" in the RHEL 9 distro definition; Fedora and RHEL 10 are intentionally left unset since they don't have the fallback bug
  • Add container-embedding host-config check that verifies blueprint containers are present in booted image podman storage
  • Add podman-network-backend host-config check that detects rootful/rootless backend mismatches
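The gating described for osCustomizations (write the file only when the option is set and containers are present) can be sketched as below. The `imageConfig` struct and `needsBackendFile` function are hypothetical, trimmed-down illustrations; the real ImageConfig has many more fields, and the pointer type is an assumption based on the tests describing the unset value as nil.

```go
package main

// imageConfig is a hypothetical, trimmed-down stand-in for ImageConfig.
type imageConfig struct {
	// PodmanDefaultNetBackend is nil when the distro definition
	// leaves podman_default_net_backend unset.
	PodmanDefaultNetBackend *string
}

// needsBackendFile reports whether the defaultNetworkBackend file should
// be written: the option must be set AND the image must embed at least
// one container.
func needsBackendFile(cfg imageConfig, containers []string) bool {
	return cfg.PodmanDefaultNetBackend != nil && len(containers) > 0
}
```

With this shape, Fedora and RHEL 10 definitions that leave the option unset never produce the file, and RHEL 9 images without embedded containers are unchanged.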

Breaking Changes

This PR is fully backward compatible.

Testing

  • Table-driven unit tests for GenDefaultNetworkBackendFile covering netavark/cni backends and default/custom storage paths
  • osCustomizations tests verifying the backend file is generated only when both containers and the option are present, including the OSTree relocated-path case
  • Distro-level cross-checks asserting RHEL 9 has the backend set to netavark while Fedora and RHEL 10 leave it nil
  • Host-config checks tested with mock exec, covering container name matching (including podman's localhost/ normalization), backend consistency, and error paths
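One way to handle the `localhost/` normalization mentioned in the last item is to strip that prefix from both sides before comparing names. The sketch below is an assumption about how such matching could work, not the PR's implementation; `normalizeImageName` and `missingContainers` are hypothetical names.

```go
package main

import "strings"

// normalizeImageName strips the "localhost/" prefix that podman may add
// to image names without a registry, so both forms compare equal.
func normalizeImageName(name string) string {
	return strings.TrimPrefix(name, "localhost/")
}

// missingContainers returns the wanted image names that do not appear
// in the list of names present in storage, comparing normalized names.
func missingContainers(wanted, present []string) []string {
	have := make(map[string]bool, len(present))
	for _, name := range present {
		have[normalizeImageName(name)] = true
	}
	var missing []string
	for _, name := range wanted {
		if !have[normalizeImageName(name)] {
			missing = append(missing, name)
		}
	}
	return missing
}
```

A check built on this would pass when `missingContainers` returns an empty slice and fail otherwise, reporting the missing names.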

@thozza thozza requested review from a team and achilleas-k as code owners March 27, 2025 13:13
@thozza thozza requested review from mvo5 and schuellerf March 27, 2025 13:13
@thozza thozza force-pushed the el9-podman-network_backend-fix branch from 0cf9369 to 551bc50 Compare March 27, 2025 13:18
achilleas-k
achilleas-k previously approved these changes Mar 27, 2025
Member

@achilleas-k achilleas-k left a comment

Really nice! Thanks.

The last commit message is a bit too long. Can you make it shorter?

@thozza thozza marked this pull request as draft March 27, 2025 13:38
@thozza
Member Author

thozza commented Mar 27, 2025

> Really nice! Thanks.
>
> The last commit message is a bit too long. Can you make it shorter?

Will do.

I had hoped that GH would mark this PR as a draft. 😇 My plan is to first see where the consistency check fails before pushing any fix for RHEL 9.

@thozza thozza force-pushed the el9-podman-network_backend-fix branch 2 times, most recently from d27bc35 to 4b9ecd5 Compare March 28, 2025 15:44
@thozza thozza force-pushed the el9-podman-network_backend-fix branch 2 times, most recently from d647bc5 to 341bda0 Compare April 8, 2025 11:31
@github-actions

github-actions Bot commented May 9, 2025

This PR is stale because it has been open 30 days with no activity. Remove "Stale" label or comment or this will be closed in 7 days.

@github-actions github-actions Bot added the Stale label May 9, 2025
@thozza thozza removed the Stale label May 12, 2025
@achilleas-k
Member

achilleas-k commented May 12, 2025

Do we still want this?

EDIT: Nvm, noticed you un-staled it.

@github-actions github-actions Bot added the Stale label Jun 12, 2025
@thozza thozza removed the Stale label Jun 16, 2025
@thozza thozza changed the title from "Draft: Check and fix for podman network_backend consistency when embedding containers" to "Draft: Check and fix for podman network_backend consistency when embedding containers (HMS-8783)" Jul 3, 2025
@github-actions github-actions Bot added the Stale label Aug 3, 2025
@achilleas-k
Member

boop

@github-actions github-actions Bot removed the Stale label Aug 7, 2025
@github-actions github-actions Bot added the Stale label Oct 12, 2025
@github-actions github-actions Bot added the Stale label Jan 19, 2026
@thozza thozza removed the Stale label Jan 19, 2026
@github-actions

This PR is stale because it had no activity for the past 30 days. Remove the "Stale" label or add a comment, otherwise this PR will be closed in 7 days.

@github-actions github-actions Bot added the Stale label Feb 19, 2026
@thozza thozza removed the Stale label Feb 23, 2026
@thozza thozza force-pushed the el9-podman-network_backend-fix branch from 341bda0 to 219ac1e Compare March 17, 2026 13:50
@thozza
Member Author

thozza commented Mar 17, 2026

I just pushed a reworked WIP I had locally, since I won't have time to finish it in the upcoming weeks. That said, I didn't want to keep it just locally...

@github-actions github-actions Bot added the Stale label Apr 17, 2026
@thozza thozza removed the Stale label Apr 17, 2026
@github-actions github-actions Bot added the Stale label May 17, 2026
@thozza thozza removed the Stale label May 18, 2026
@thozza thozza force-pushed the el9-podman-network_backend-fix branch 6 times, most recently from 8d4cf99 to 8631167 Compare May 28, 2026 17:51
@thozza thozza changed the title from "Draft: Check and fix for podman network_backend consistency when embedding containers (HMS-8783)" to "Fix podman network backend fallback on RHEL 9 (HMS-8783)" May 28, 2026
thozza added 7 commits May 28, 2026 21:40
Verify that containers listed in the blueprint are actually present in
the booted image's podman storage.

Signed-off-by: Tomáš Hozza <thozza@redhat.com>
Verify that rootful and rootless podman report the same network
backend. When containers are embedded as root into the image (the
default behavior), some podman versions interpret the existing storage
as a migration and fall back to 'cni' for rootful only, leaving
rootless on 'netavark'. The desired behavior is that podman uses the
same network backend regardless of whether the image embeds a
container.

Signed-off-by: Tomáš Hozza <thozza@redhat.com>
Certain versions of Podman (notably on RHEL 9) fall back to the legacy
'cni' network backend when they find existing container images in the
system storage, assuming a migration from an older version. This is
problematic for disk images that embed containers as a customization,
because the legacy backend packages are not installed by default.

Add a NetworkBackend type and a helper to generate the
/var/lib/containers/storage/defaultNetworkBackend file, which tells
Podman which backend to use and prevents the unwanted fallback.

Signed-off-by: Tomáš Hozza <thozza@redhat.com>
Add a new ImageConfig option that specifies the default network backend
for Podman. When set and the image embeds container images, the value is
written to /var/lib/containers/storage/defaultNetworkBackend during
image build.

This prevents Podman from falling back to the legacy 'cni' backend when
it finds pre-existing container images in storage, which it interprets
as a system migration.

Signed-off-by: Tomáš Hozza <thozza@redhat.com>
RHEL 9 Podman falls back to the legacy 'cni' network backend when it
finds pre-existing container images in storage, but the 'cni' packages
are not installed by default. Force 'netavark' so that images with
embedded containers work out of the box.

This only affects RHEL 9 / CentOS Stream 9; newer distros (and newer
podman versions) don't have this fallback logic.

Regenerate test manifests. All el9 / c9s manifests that embed
containers now get the podman default network backend set.

Signed-off-by: Tomáš Hozza <thozza@redhat.com>
Verify the podman default network backend behavior at multiple levels:
- YAML loading: fake distro YAML with podman_default_net_backend loads
  correctly into ImageConfig
- osCustomizations: the backend file is generated only when both
  containers are present and the option is set
- Real distro cross-check: RHEL 9 has it set to netavark, while
  Fedora and RHEL 10 leave it unset

Signed-off-by: Tomáš Hozza <thozza@redhat.com>
Make sure that the newly added checks are run on all images.

Signed-off-by: Tomáš Hozza <thozza@redhat.com>
@thozza thozza force-pushed the el9-podman-network_backend-fix branch from 8631167 to 723cbd6 Compare May 28, 2026 19:42
},
},
wantErr: check.ErrCheckFailed,
},
Contributor

This should also test the empty backend string / unknown cases.
Also, what's the expected result if both are unknown? Should that be a pass or a fail?

@brlane-rht
Contributor

I know this is still a draft, and I'm curious to see what Schutzbot thinks of it. It looks pretty nice to me.

3 participants