
Conversation

glehmann
Member

Based on #35

I had to move the user logic to the image entrypoint to make the images usable by any user, even by users without the usual 1000:1000 ids.

The logic for the repository creation was moved to the Dockerfiles to make the images buildable with the standard github action docker/build-push-action.

Hopefully this will help with the repository selection issue in #35.

@glehmann glehmann requested review from psafont and ydirson July 24, 2025 15:14

@psafont psafont left a comment


🎉🎉

(I've only reviewed the last 3 commits)

@glehmann glehmann force-pushed the gln/build-env-improvements-lzqx branch 6 times, most recently from 9e73f03 to 87724c2 Compare July 25, 2025 08:14
@glehmann
Member Author

It works with docker but not with podman: with --userns=keep-id we can't be root at container startup, but we need --userns=keep-id to keep the ownership of the files in the shared volume.

I'll propose something else.

@glehmann glehmann force-pushed the gln/build-env-improvements-lzqx branch 3 times, most recently from 78407fa to 7998f39 Compare July 25, 2025 13:57
@glehmann
Member Author

It should be good now.

I've used podman's --userns=keep-id:uid=1000,gid=1000 option to map the builder user to the user on the system, but I'm not sure whether that option appeared only recently.
@ydirson could you test it with your podman version?
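To illustrate the option being discussed, here is a minimal sketch of how the podman argument list could be assembled; the helper name is hypothetical and run.py's actual code may differ:

```python
def podman_userns_args(builder_uid: int = 1000, builder_gid: int = 1000) -> list[str]:
    """Map the in-container builder user (uid/gid 1000 in the image)
    onto the invoking host user via podman's keep-id mode."""
    return [
        # keep-id preserves the host uid/gid inside the user namespace,
        # remapped so the container's builder user matches it
        f"--userns=keep-id:uid={builder_uid},gid={builder_gid}",
        # disable SELinux labeling for the shared volume, as run.py does
        "--security-opt", "label=disable",
    ]

print(podman_userns_args()[0])
```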

@glehmann
Member Author

glehmann commented Jul 25, 2025

I've also added 88d5bec to reduce the image sizes.

@glehmann glehmann force-pushed the gln/build-env-improvements-lzqx branch from 7998f39 to 88d5bec Compare July 25, 2025 14:22
@ydirson

ydirson commented Jul 30, 2025

I've also added 88d5bec to reduce the image sizes.

There is an idiom to squash everything into a single layer, but I also see docker build --squash.

context: .
file: ./Dockerfile-8.x
push: true
tags: ghcr.io/${{ github.repository }}:8.2

We likely want those official floating tags to be set only when run on master; maybe we should set particular tags for PRs?
Also, timestamped tags, as is commonly done, may be interesting to have.

Member Author


The workflow is only configured on master.
Other tags may be useful, as well as building for PRs, but we must consider cleaning up the old images.


The workflow is only configured on master.

Actually it seems to be configured for main instead :)


Also, there would be a reason for allowing it to run not just on master: detecting pipeline errors before they reach master

Member Author


Yes, that would be nice. Maybe only push to the registry when on the master branch, then.


Also, there would be a reason for allowing it to run not just on master: detecting pipeline errors before they reach master

This comes at the cost of extra complexity: all steps now have a branch check.

I'm surprised that the push action doesn't support logging in, or using protected deployments

Member Author


It's a bit more complex not to upload, but we spare ourselves the complexity of cleaning up temporary images.
It seems like a good compromise to me; your opinion may vary, of course

@glehmann glehmann force-pushed the gln/build-env-improvements-lzqx branch 2 times, most recently from e6311bc to 248ec8f Compare August 1, 2025 14:12
@glehmann glehmann marked this pull request as ready for review August 1, 2025 14:12
@glehmann glehmann requested a review from ydirson August 1, 2025 14:13
run.py Outdated
Comment on lines 109 to 110
docker_args += ["--userns=keep-id", "--security-opt", "label=disable"]
# With podman we use the `--userns` option to map the builder user to
# the user on the system.
docker_args += [f"--userns=keep-id:uid=1000,gid=1000", "--security-opt", "label=disable"]

The fact that the uid is preserved, so the entrypoint is entered as builder and not as root as in the Docker case, is likely worth documenting. But is it really a good idea to have such a big difference between the 2 runners? (especially as we want to move more root actions into the entrypoint in the future)

Member Author


done

run.py Outdated
Comment on lines 112 to 115
# With docker, we modify the builder user in the entrypoint to match the
# uid:gid of the user launching the container, and then continue with
# the builder user thanks to gosu.
docker_args += ["-e", f'BUILDER_UID={os.getuid()}', "-e", f'BUILDER_GID={os.getgid()}']

a reader might ask "why gosu and not just su"

Member Author


Then they will quickly find out why on the internet :-)

I usually try to only document what is specific to the project.
Do you think there is something specific about gosu in that context that should be commented?


Finding out about gosu did not teach me the answer. I had to go to the docker doc to find out why in a docker context they recommend using gosu. We need at least a pointer to that docker doc.

Member Author


I've added a note in entrypoint.sh

@glehmann glehmann force-pushed the gln/build-env-improvements-lzqx branch 2 times, most recently from 57683dd to e21ff80 Compare August 1, 2025 17:33

@ydirson ydirson left a comment


This setup will store in separate places the container image and the script using it, but they are very much tied by an interface that was kept private until now. And recently we have started changing quite some of this interface.
Soon users will run a build and find out the hard way that they need to update their xcp-build-env script. Some safeguards would be useful here.

A simple solution would be some BUILDENV_INTERFACE_VERSION variable recorded in the image, which the init script would compare with a value passed at runtime. We would just need to be careful to bump this version on every interface change (envvars, bindmounts, ...)
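A sketch of the safeguard suggested above, assuming the image records the version in an env var and the launcher script compares it at startup (all names here are hypothetical, not the project's actual API):

```python
def check_interface_version(image_version: str, expected_version: str) -> None:
    # Hypothetical safeguard: the image records BUILDENV_INTERFACE_VERSION
    # at build time; the launcher script compares it with the version it
    # was written against and refuses to run on a mismatch.
    if image_version != expected_version:
        raise RuntimeError(
            f"build-env image interface v{image_version} does not match "
            f"the script's expected v{expected_version}; "
            "please update your xcp-build-env script or the image"
        )

check_interface_version("3", "3")  # matching versions: no error
```

The version would need to be bumped on every interface change (envvars, bindmounts, ...), as noted above.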

@glehmann glehmann force-pushed the gln/build-env-improvements-lzqx branch from e21ff80 to da09de3 Compare September 5, 2025 09:32
@glehmann
Member Author

glehmann commented Sep 5, 2025

This setup will store in separate places the container image and the script using it, but they are very much tied by an interface that was kept private until now. And recently we have started changing quite some of this interface. Soon users will run a build and find out the hard way that they need to update their xcp-build-env script. Some safeguards would be useful here.

A simple solution would be some BUILDENV_INTERFACE_VERSION variable recorded in the image, which the init script would compare with a value passed at runtime. We would just need to be careful to bump this version on every interface change (envvars, bindmounts, ...)

That's a good point.
I'd rather use a docker tag for that, so that the image is automatically downloaded on the dev's computer when they update their xcp-ng-dev installation.

@glehmann glehmann force-pushed the gln/build-env-improvements-lzqx branch from da09de3 to b7eb0a5 Compare September 5, 2025 11:26
@glehmann glehmann force-pushed the gln/build-env-improvements-lzqx branch 4 times, most recently from 8cb51c2 to 644d3ed Compare September 5, 2025 12:19
@glehmann glehmann requested a review from ydirson September 5, 2025 12:21
@glehmann glehmann force-pushed the gln/build-env-improvements-lzqx branch 2 times, most recently from 223c7b1 to 3d7b5ae Compare September 6, 2025 12:14
Comment on lines 28 to 29
push: ${{ github.ref == 'refs/heads/master' }}
tags: ghcr.io/${{ github.repository }}:8.2-${{ env.VERSION }}

Is it possible to push it to a different tag (git-describe-based) so the test job can be reworked to test the generated container, instead of rebuilding its own?
I'm thinking about some equivalent of that gitlab pipeline

Member Author


Sure, that should be possible. Could we keep it for another PR?

@glehmann glehmann force-pushed the gln/build-env-improvements-lzqx branch 2 times, most recently from b26e293 to 30f8a45 Compare September 8, 2025 16:41
@@ -0,0 +1,83 @@
name: Build and Push Docker Image to GHCR

on: push

@psafont psafont Sep 9, 2025


Does the workflow do anything if it's not on the master branch? I would rather limit the branches where it's run here rather than doing it per-step; now there are 4 places where this branch limitation needs to be placed (6 when 9.0 is enabled), which is ripe for errors

Member Author

@glehmann glehmann Sep 10, 2025


It builds the image on all the branches, but only uploads when on master

file: ./src/xcp_ng_dev/files/Dockerfile-8.x
push: ${{ github.ref == 'refs/heads/master' }}
tags: ghcr.io/${{ github.repository }}:8.2
tags: ghcr.io/${{ github.repository }}:8.2-${{ env.VERSION }}

@psafont psafont Sep 9, 2025


so we can change the interface used to communicate with the container.

In all the years I've used containers I've never come across the need for such a thing.

Are you sure this is a good idea? It forces all users to become aware of this concept (which I think is very rare) and complicates using the tool. Now, which version does my installed tool support? Is the container outdated, is my tool outdated? What if I need to build from an old container, how do I get a compatible tool version?

This means at the very least that, in order to be introduced, users need to be educated about it: it needs to be documented in the readme and the help output.

I think this has quite a few downsides, and we should think long about what we're trying to solve with this solution. Are you sure this can't be replaced by something that users are already used to?
I would try to have frequent tagged releases for the tools, and use date versioning in the tags, on top of having X.X tags for the latest of each version of the containers.

It might be worth making the releases of the containers happen every Thursday, or even every day, to decouple them from pushing changes to the tool.

Member Author


I don't see the image as being targeted to the end user as is. It's made to be used with xcp-ng-dev.
Do you have some intended usage for the image outside of xcp-ng-dev? If yes, I agree that's not the best tag naming.


I don't see the image as being targeted to the end user as is.

Imagine the case where the version is changed, and the user hasn't updated the cli tool. This will show an error to the user: which error will it be, and how can the user know how to solve the issue?

Another edge case, which should be rarer, but still useful, IMO: the user wants to make a build on top of a previous container image because they are developing a feature, a newer version of the packages got released, and the code between both is incompatible. Now they can rebase their work, but they might not have time to do that because of some deadline for a prototype. How do they know what container tag to use? With dates in tags it becomes easier, whereas with this version scheme only hashes are available, and then they have to fish for the version of the tool that's compatible with that container

Member Author


I don't see the image as being targeted to the end user as is.

Imagine the case where the version is changed, and the user hasn't updated the cli tool. This will show an error to the user: which error will it be, and how can the user know how to solve the issue?

That's the point of adding this version in the image tag, so the user can continue using the cli tool at the same version, without being disturbed by a non-backward compatible change.
When we switch to protocol-version 2, the image 8.3-1 will still be there, alongside the newer, incompatible, 8.3-2.

Another edge case, which should be rarer, but still useful, IMO: the user wants to make a build on top of a previous container image because they are developing a feature, a newer version of the packages got released, and the code between both is incompatible. Now they can rebase their work, but they might not have time to do that because of some deadline for a prototype. How do they know what container tag to use? With dates in tags it becomes easier, whereas with this version scheme only hashes are available, and then they have to fish for the version of the tool that's compatible with that container

They can just stay on their version of the cli tool, and they would download the image version expected by the tool, and thus compatible with the tool.

Just to make sure we are talking about the same thing: the cli tool is packaged with a protocol-version.txt containing the version number it expects. The image run by the cli tool must have the same protocol-version as the cli tool.
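The scheme described here could be sketched like this (assuming a protocol-version.txt shipped alongside the cli tool; the file layout and helper name are illustrative):

```python
from pathlib import Path

def image_tag(xcp_version: str, protocol_version_file: Path) -> str:
    # The cli tool reads the protocol version it was packaged with and
    # always pulls the matching image tag, e.g. "8.3-1": an older tool
    # keeps pulling the older, compatible image.
    protocol = protocol_version_file.read_text().strip()
    return f"{xcp_version}-{protocol}"
```

With this, publishing an incompatible image under 8.3-2 leaves 8.3-1 untouched for users who have not yet upgraded the tool.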


and that's all transparent to the user, who just has to specify the XCP-ng version


That's the point of adding this version in the image tag, so the user can continue using the cli tool at the same version, without being disturbed by a non-backward compatible change.

But then the container will stop being upgraded, and that might cause issues: like a user developing a package with the old container successfully, while it ends up incompatible with an up-to-date container. This means users need to be aware of this, or might fall into this trap.

They can just stay on their version of the cli tool, and they would download the image version expected by the tool, and thus compatible with the tool.

Will we keep generating images for both the old and new versions to ensure the first version is not outdated?

Just to make sure we are talking about the same thing: the cli tool is packaged with a protocol-version.txt containing the version number it expects. The image run by the cli tool must have the same protocol-version as the cli tool.

Yes, this is exactly what we're talking about. I've never seen this pattern in any Dockerfile I've come across in the 10+ years I've used containers, and I see it creates issues that are not obvious to detect, which I find alarming.

and that's all transparent to the user, who just has to specify the XCP-ng version

On the surface level it looks like it's transparent, but looking into it in more detail, it's really not.

Member Author


I'll move that commit to another PR so we can move forward


But then the container will stop being upgraded, and that might cause issues: like a user developing a package with the old container successfully, while it ends up incompatible with an up-to-date container. This means users need to be aware of this, or might fall into this trap.

I'll move that commit to another PR so we can move forward

Well, on one hand, when a newer version of the container gets published and a user has an old one, the new one is not automatically pulled either. On the other hand, a user upgrading the build script to an incompatible version will get an error instead of the new image being automatically downloaded.

Will we keep generating images for both the old and new versions to ensure the first version is not outdated?

No, we rather talked about issuing a warning when the build script is outdated.

Member Author


We can continue the discussion in #51

@glehmann glehmann force-pushed the gln/build-env-improvements-lzqx branch 3 times, most recently from f07d40a to c4112b4 Compare September 10, 2025 12:45
Comment on lines 8 to 23
if [ "${BUILDER_UID}" ]; then
# BUILDER_UID is defined, update the builder ids, and continue with the builder user
if [ "${BUILDER_GID}" != "1000" ]; then
groupmod -g "${BUILDER_GID}" builder
fi
if [ "${BUILDER_UID}" != "1000" ]; then
usermod -u "${BUILDER_UID}" -g "${BUILDER_GID}" builder
fi
find ~builder -maxdepth 1 -type f | xargs chown builder:builder
exec /usr/local/bin/gosu builder "$@"
else
# no BUILDER_ID, just continue as the current user
exec "$@"
fi

Suggested change
if [ "${BUILDER_UID}" ]; then
# BUILDER_UID is defined, update the builder ids, and continue with the builder user
if [ "${BUILDER_GID}" != "1000" ]; then
groupmod -g "${BUILDER_GID}" builder
fi
if [ "${BUILDER_UID}" != "1000" ]; then
usermod -u "${BUILDER_UID}" -g "${BUILDER_GID}" builder
fi
find ~builder -maxdepth 1 -type f | xargs chown builder:builder
exec /usr/local/bin/gosu builder "$@"
else
# no BUILDER_ID, just continue as the current user
exec "$@"
fi
if [ "${BUILDER_UID}" ]; then
# BUILDER_UID is defined (i.e. this is Docker, and we execute this as root),
# update the builder ids, and continue with the builder user
if [ "${BUILDER_GID}" != "1000" ]; then
groupmod -g "${BUILDER_GID}" builder
fi
if [ "${BUILDER_UID}" != "1000" ]; then
usermod -u "${BUILDER_UID}" -g "${BUILDER_GID}" builder
fi
find ~builder -maxdepth 1 -type f | xargs chown builder:builder
exec /usr/local/bin/gosu builder "$@"
else
# no BUILDER_ID (i.e. this is Podman), just continue as the current user
exec "$@"
fi


But actually, can't we just avoid this with userns-remapping in Docker?

Member Author


IIRC, it's a daemon-level configuration, and thus might not be easy for our developers to set up, may impact their other containers, and may not be possible in the CI


Damned, you're right. I wonder if we could make something similar to podman work with rootless docker, at least that should be user-only configuration (and we may want to encourage/enforce this to push for better security)

Member Author


Others may disagree, but if I want a rootless docker, I use podman 🙂


I see you'll soon agree to drop support for docker 😉

Member Author


I would, if the others would use podman.
I want to keep docker only because I'm quite sure it will be the most used container engine for a few more years…


I reported a couple of bugs for rootless podman when using it for the internal xenserver tooling, back in 2019.

I thought the whole linux world had moved away from docker already because of its weird packaging choices, not to mention its rootful architecture

Signed-off-by: Gaëtan Lehmann <[email protected]>
This way the image is usable by everybody, independently of their ids.

With docker, we modify the builder user in the entrypoint to match the
uid:gid of the user launching the container, and then continue with
the builder user thanks to gosu.

With podman we use the `--userns` option to map the builder user to
the user on the system. I haven't found a way with podman to use the
same mechanism as in docker, and vice versa.

Signed-off-by: Gaëtan Lehmann <[email protected]>
The images are always built, but only pushed to the registry when
running on the master branch.

Signed-off-by: Gaëtan Lehmann <[email protected]>
for a significantly smaller image size.

Before:

ghcr.io/xcp-ng/xcp-ng-build-env             8.2            e985e704c252  10 minutes ago  1.03 GB
ghcr.io/xcp-ng/xcp-ng-build-env             8.3            b557dcb6d541  11 minutes ago  1.14 GB
ghcr.io/xcp-ng/xcp-ng-build-env             9.0            c738df7d5577  11 minutes ago  857 MB

After:

ghcr.io/xcp-ng/xcp-ng-build-env             8.2            a9a91fa03db8  7 seconds ago       491 MB
ghcr.io/xcp-ng/xcp-ng-build-env             8.3            a36b1399ac01  About a minute ago  557 MB
ghcr.io/xcp-ng/xcp-ng-build-env             9.0            78011dcbd6f3  11 minutes ago      708 MB

Signed-off-by: Gaëtan Lehmann <[email protected]>
@glehmann glehmann force-pushed the gln/build-env-improvements-lzqx branch from c4112b4 to 1c18d26 Compare September 10, 2025 15:33
@glehmann glehmann requested a review from ydirson September 10, 2025 15:35

@ydirson ydirson left a comment


This will do. I'm still not happy having a complicated behavior just to get compatibility with Docker: IMHO the container engine should indeed be seen as an implementation detail, and we should rather tell devs to install Podman. But the code is well delimited, and we can remove it when we decide to.

@ydirson ydirson requested a review from stormi September 11, 2025 07:55
@glehmann glehmann merged commit e9cafe6 into master Sep 11, 2025
6 checks passed
@glehmann glehmann deleted the gln/build-env-improvements-lzqx branch September 11, 2025 08:02