Add RHEL Docker images #3

Merged
bthomee merged 32 commits into main from bthomee/rhel on Jul 1, 2025

@bthomee (Contributor) commented Jun 17, 2025

This PR adds Docker images for Red Hat Enterprise Linux (RHEL) 9.6 with supported GCC or Clang compilers.

  • The images are based on the Universal Binary Images (UBI) from Red Hat, which can be used without having to pay for a license.
  • To pull an image, a free Red Hat developer account is needed.
    • I created such an account, and then a service account with the permissions to pull packages from their registry; finally I created a token for that service account and added it to the repo as a secret.
    • I did not pursue seeing if we can get an organizational account to avoid the dependence on myself, but since it's trivial to create a token for free and we won't rebuild these images very often, it doesn't seem like a big deal.
  • The UBI images use the dnf package manager with a very restricted set of repos and thus available packages to install.
    • Only a single version of Clang is available.
    • A follow-up PR can investigate whether we can add other repos (e.g. from Fedora) and have a wider range of Clang versions to install.
  • The UBI for RHEL 9.4 is no longer available, while the UBI for RHEL 10.0 does not yet contain any GCC toolset package.

Small improvements are made to the Debian and Ubuntu images as well for consistency.

@bthomee bthomee requested review from Bronek and legleux June 18, 2025 00:42
- os: 9.6
  gcc: 13
- os: 9.6
  gcc: 14

Collaborator

Shouldn't we also support 9.4? There's GCC Toolset 13 for this release.

Contributor Author

There's no UBI base/tool image for 9.4, unfortunately. I can recreate it by looking at the RHEL UBI dockerfile and pull the same untagged image they use by its SHA, but I'm not so sure it's worth it given that we can support version 10 soon & supporting the last 2 versions should be more than adequate.

@Bronek (Collaborator) commented Jun 30, 2025

I can see https://catalog.redhat.com/software/containers/ubi9/ubi/615bcf606feffc5384e8452e?image=671a4613385185df173cf395&architecture=amd64 , but it's moot if there's no gcc version more recent than 11 (I do not know if there is)

@bthomee (Contributor Author) commented Jun 30, 2025

Oh yeah I saw that one, but for some reason it didn't want to pull down with the 9.4 tag. Let me try again.

I'm using the s2i-base image - I didn't see one with the 9.4 tag. I recall there's one with a SHA but that made referencing it harder & it seems iffy if it doesn't have a proper tag.

@Bronek (Collaborator) commented Jun 27, 2025

Instead of switching to name GITHUB_REGISTRY we could instead use CONTAINER_REGISTRY which, surprise, works just as well for any container registry. Same about the credentials used.

@bthomee bthomee requested a review from mathbunnyru June 27, 2025 12:57
@bthomee (Contributor Author) commented Jun 27, 2025

Instead of switching to name GITHUB_REGISTRY we could instead use CONTAINER_REGISTRY which, surprise, works just as well for any container registry. Same about the credentials used.

Done for CONTAINER_REGISTRY although since the credentials are still specific to a certain registry there's now a bit of a mismatch between knowing which token is used for what. In any case, when you say "Same about the credentials used", what would you like to see? I don't think renaming GITHUB_TOKEN to CONTAINER_TOKEN or something along those lines is very helpful.

@bthomee (Contributor Author) commented Jun 27, 2025

@mathbunnyru sent me the following requirements for the images to conform to, so they can also be used in clio:

  • hadolint checks are fine with the following config: https://github.com/XRPLF/clio/blob/develop/.hadolint.yml
    • => Can you please provide an example for what you exactly would like to see in these Docker images to support hadolint?
  • Everything is written in one style
    • => That's the case here, but if you have specific requests let me know
  • Minimal (if some package isn't needed in the image, it shouldn't be there)
    • => That's the case here, see comments in the section where the pkgs are installed.
  • Carefully written multi-stage builds (with moving all the dependencies needed to build some tool we need out of final image)
    • => Pipx would be one of those, but removing it deletes Python too, which we want to keep. Installing Conan via Python was problematic due to how poorly it handles multi-user support (e.g. call as the non-root user).
    • => If you have specific packages you think should be removed, let me know and I can do so.
  • No hardcode of versions (ARGs)
    • => Done. I had to disable a warning check to stop Docker from screaming at me; I also had to explicitly provide an (invalid) argument for GCC_VERSION in Debian as otherwise a FROM image isn't valid, even if it isn't used when building the Clang image.
  • Build times are reasonable
    • => Building the docker images is fast.
  • We have to support ubuntu:20.04 for now
    • => I'll add this in a follow-up PR; here I'm adding RHEL images and updating the Debian and Ubuntu Dockerfiles based on new insights.
  • We have build tools image, where some tools whose binaries are not easily installable are downloaded/built
    • => Can you please point me to the binaries that should be included?
  • We have to build custom GCC compiler
    • => Please provide more details. I'm sure @Bronek will have an opinion too.
  • To make it build fast we build separately for x86_64 and aarch64 and then merge it
    • => Can you please point me to an example where you do this?
  • Conan configuration is done nicely, including sanitizers
    • => Can you please point me to an example where you do this?
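
On the "No hardcode of versions (ARGs)" item above, the placeholder-argument workaround can be illustrated with a minimal sketch. It does not reproduce the Dockerfiles in this PR: the stage names, the base images, and every default value are assumptions chosen for illustration. The idea is that an ARG declared before the first FROM is expanded in every FROM line when the Dockerfile is parsed, so a stage that is never built still needs a syntactically valid image reference.

```dockerfile
# Hypothetical sketch: image names, stage names and defaults are placeholders,
# not taken from this PR.
ARG DEBIAN_VERSION=bookworm
# Placeholder default: with an empty value the reference below would expand to
# "gcc:", which is not a valid image name, even when only the clang stage is
# the build target.
ARG GCC_VERSION=invalid

FROM gcc:${GCC_VERSION} AS gcc-src

FROM debian:${DEBIAN_VERSION} AS base
RUN apt-get update \
    && apt-get install -y --no-install-recommends ca-certificates \
    && rm -rf /var/lib/apt/lists/*

FROM base AS gcc
# Copy the prebuilt compiler from the official image instead of building it.
COPY --from=gcc-src /usr/local/ /usr/local/

FROM base AS clang
# ARGs used inside a stage must be declared inside that stage.
ARG CLANG_VERSION=16
RUN apt-get update \
    && apt-get install -y --no-install-recommends clang-${CLANG_VERSION} \
    && rm -rf /var/lib/apt/lists/*
```

With BuildKit, stages that the requested target does not depend on are skipped, so building with `--target clang` should not try to pull the placeholder `gcc` tag, while building with `--target gcc` needs a real `--build-arg GCC_VERSION=...`.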

@@ -0,0 +1,93 @@
name: RHEL

Collaborator

I think we should create a reusable workflow and just call it from these workflows, instead of copying files with minor changes

Collaborator

I suggest a separate PR after this one; also if you could prepare it that would be great.

@Bronek (Collaborator) commented Jun 30, 2025

... also, I am not convinced that the complexity would be justified (we have a different set of compilers for each distro). This repo does nothing other than keep CI container images, and I think having a separate workflow for each supported Linux distro is a nice readability feature.

I wonder if it would be also simpler. I thought maybe not (hence the older version of this comment, crossed out above) but on second thought ... maybe yes ? Please show us 🙂

Comment on lines +31 to +33
# Install Conan.
ARG CONAN_VERSION
RUN pip install conan==${CONAN_VERSION}

Collaborator

Probably can be installed for a ci user only

Contributor Author

Unfortunately I still can't get it to work.

Even something like this:

USER ${NONROOT_USER}
WORKDIR /home/${NONROOT_USER}
ARG CONAN_VERSION
RUN pip install conan==${CONAN_VERSION}
USER root

keeps trying to install it in a root-owned directory:

 > [base 8/8] RUN pip install conan==2.17.0:
0.189 WARNING: The directory '/opt/app-root/src/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you should use sudo's -H flag.

ERROR: Could not install packages due to an OSError: [Errno 13] Permission denied: '/opt/app-root/src/.local'

Should pip itself be installed differently?
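
One possible explanation, which is an assumption and is not confirmed anywhere in this thread, is that the base image exports `HOME=/opt/app-root/src` through an `ENV` instruction. The `USER` instruction changes the user that runs later commands but leaves `HOME` untouched, so pip would keep resolving `~/.cache` and `~/.local` under the old directory. A minimal sketch of a per-user install under that assumption follows; the base image reference, the `ci` user name, and the default versions are placeholders, and Python and pip are assumed to be installed already.

```dockerfile
# Hypothetical sketch: the base image, user name and versions are placeholders.
ARG BASE_IMAGE=registry.access.redhat.com/ubi9/s2i-base:latest
FROM ${BASE_IMAGE} AS base

ARG NONROOT_USER=ci
ARG CONAN_VERSION=2.17.0

# Create the non-root user with its own home directory.
USER root
RUN useradd --create-home ${NONROOT_USER}

USER ${NONROOT_USER}
WORKDIR /home/${NONROOT_USER}
# Assumption: HOME is inherited from the base image and still points at
# /opt/app-root/src, so it is overridden here for the non-root user.
ENV HOME=/home/${NONROOT_USER}
# Per-user installs place console scripts such as "conan" in ~/.local/bin.
ENV PATH=/home/${NONROOT_USER}/.local/bin:${PATH}
RUN pip install --user --no-cache-dir conan==${CONAN_VERSION}
```

If the assumption holds, Conan would also default to `~/.conan2` under the new `HOME`, which might make a separate `CONAN_HOME` override unnecessary.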

Comment on lines +50 to +52
# Fix the Conan user home directory as it otherwise will point to the
# /opt/app-root/src/.conan2 directory.
ENV CONAN_HOME=/home/${NONROOT_USER}/.conan2

Collaborator

Who sets it to /opt/app-root/src/.conan2?
Probably should remove it, instead of patching it here

Contributor Author

I was surprised too that the home directory didn't update after switching to the non-root user. At this point I don't know why it retains the root user's directory - I'm not very familiar with how the Red Hat / Fedora family does things.

Collaborator

I don't see how it is set at all?

Collaborator

@bthomee I encourage you to install conan python package for non-root user, and it's likely you won't have to change this at all.

Comment on lines +74 to +76
# Fix the Conan user home directory as it otherwise will point to the
# /opt/app-root/src/.conan2 directory.
ENV CONAN_HOME=/home/${NONROOT_USER}/.conan2

Collaborator

This is done for both gcc/clang, if you can't remove it, at least I think we should move it to base

Collaborator

Good idea.

Contributor Author

It feels awkward to me to set an environment variable to the home directory of a different user (i.e. the non-root user) than the one that is currently executing the commands (i.e. the root user). If you don't think it matters, I can move this up.

Collaborator

But you can switch to non-root user in base as well

@mathbunnyru (Collaborator) commented

@mathbunnyru sent me the following requirements for the images to conform to, so they can also be used in clio:

hadolint is not something a Docker image does or doesn't support.
It's a linter that you run on a Dockerfile to check that it follows best practices.

  • Everything is written in one style

    • => That's the case here, but if you have specific requests let me know

I've checked and EOF statements are not always in the same style.

  • Minimal (if some package isn't needed in the image, it shouldn't be there)

    • => That's the case here, see comments in the section where the pkgs are installed.

build-essential is probably hiding what you need and installs lots of tools you don't need.

  • Carefully written multi-stage builds (with moving all the dependencies needed to build some tool we need out of final image)

    • => Pipx would be one of those, but removing it deletes Python too, which we want to keep. Installing Conan via Python was problematic due to how poorly it handles multi-user support (e.g. call as the non-root user).
    • => If you have specific packages you think should be removed, let me know and I can do so.

I don't know which packages you use in rippled, so I can't help here

  • No hardcode of versions (ARGs)

    • => Done. I had to disable a warning check to stop Docker from screaming at me; I also had to explicitly provide an (invalid) argument for GCC_VERSION in Debian as otherwise a FROM image isn't valid, even if it isn't used when building the Clang image.

👌

  • Build times are reasonable

    • => Building the docker images is fast.

I think that's because you don't build aarch64 images and we do.
And we build them natively when it's slow.

  • We have to support ubuntu:20.04 for now

    • => I'll add this in a follow-up PR; here I'm adding RHEL images and updating the Debian and Ubuntu Dockerfiles based on new insights.

👌

  • We have build tools image, where some tools whose binaries are not easily installable are downloaded/built

    • => Can you please point me to the binaries that should be included?

https://github.com/XRPLF/clio/blob/develop/docker/tools/Dockerfile

  • We have to build custom GCC compiler

    • => Please provide more details. I'm sure @Bronek will have an opinion too.

This is how we build our GCC: https://github.com/XRPLF/clio/tree/develop/docker/compilers/gcc

  • To make it build fast we build separately for x86_64 and aarch64 and then merge it

    • => Can you please point me to an example where you do this?

https://github.com/XRPLF/clio/blob/develop/.github/workflows/update_docker_ci.yml#L103

  • Conan configuration is done nicely, including sanitizers

    • => Can you please point me to an example where you do this?

https://github.com/XRPLF/clio/blob/develop/.github/workflows/update_docker_ci.yml#L103

@Bronek (Collaborator) commented Jun 30, 2025

  • Minimal (if some package isn't needed in the image, it shouldn't be there)

    • => That's the case here, see comments in the section where the pkgs are installed.

build-essential is probably hiding what you need and installs lots of tools you don't need.

I do not see build-essential in this branch.

  • Build times are reasonable

    • => Building the docker images is fast.

I think that's because you don't build aarch64 images and we do. And we build them natively when it's slow.

I think this means we need to build them as well, if we want clio to use these images. Future PR welcome.

  • We have to support ubuntu:20.04 for now

    • => I'll add this in a follow-up PR; here I'm adding RHEL images and updating the Debian and Ubuntu Dockerfiles based on new insights.

👌

This old version won't be supported by future releases of libxrpl, so maybe the clio team need not bother?

  • We have to build custom GCC compiler

    • => Please provide more details. I'm sure @Bronek will have an opinion too.

This is how we build our GCC: https://github.com/XRPLF/clio/tree/develop/docker/compilers/gcc

So this is similar to how we add gcc to our Debian images. However, instead of building it, we rely on the GCC team to do this for us, and just copy it from the official image. I see why you do it (there's no gcc-12 for Ubuntu Focal) but as I said above, libxrpl won't support Focal in a future release, so it's probably pointless now.

  • To make it build fast we build separately for x86_64 and aarch64 and then merge it

    • => Can you please point me to an example where you do this?

https://github.com/XRPLF/clio/blob/develop/.github/workflows/update_docker_ci.yml#L103

Another example here https://github.com/libfn/functional/blob/main/.github/workflows/ci-build.yml

@legleux (Collaborator) commented Jul 1, 2025

@bthomee

* Enabled multi-platform builds.
  
  * ... but it looks like the runners can't handle it. I'll have to look at how Bronek used multiple runners of different architectures (in our case that would be `ubuntu-24.04` for AMD64 and `ubuntu-24.04-arm` for ARM64), and then using the merge Docker action afterwards.

This is more or less how I handled the multi-arch builds when it was in GitLab. I had some issues getting the tags correct with the GitHub action a while back, maybe it works better now.

https://gist.github.com/legleux/d7e023232415eef6f76c219623dd5aa6

@Bronek Bronek force-pushed the bthomee/rhel branch 8 times, most recently from f4bfed4 to 056bef5 Compare July 1, 2025 14:20
@Bronek (Collaborator) commented Jul 1, 2025

Enabled multi-platform builds.

  • ... but it looks like the runners can't handle it. I'll have to look at how Bronek used multiple runners of different architectures (in our case that would be ubuntu-24.04 for AMD64 and ubuntu-24.04-arm for ARM64), and then using the merge Docker action afterwards.

thanks, this now works.

@bthomee bthomee dismissed legleux’s stale review July 1, 2025 16:52

Addressed main comments. Follow-up PR can address any remaining issues.

@Bronek Bronek force-pushed the bthomee/rhel branch 2 times, most recently from f168340 to 6ba2d3c Compare July 1, 2025 17:01
@bthomee bthomee merged commit 6ccb462 into main Jul 1, 2025
38 checks passed
@bthomee bthomee deleted the bthomee/rhel branch July 1, 2025 17:26