NO-JIRA: Switch layered build to treefile-apply, drain get-ocp-repo.sh#1780
NO-JIRA: Switch layered build to treefile-apply, drain get-ocp-repo.sh#1780jlebon merged 11 commits intoopenshift:masterfrom
treefile-apply, drain get-ocp-repo.sh#1780Conversation
|
@jlebon: This pull request explicitly references no jira issue. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Man, this |
okd-c9s varianttreefile-apply, drain get-ocp-repo.sh
|
This requires coreos/rpm-ostree#5351 and coreos/coreos-assembler#4054. |
|
cc @Prashanth684 since this also touches OKD |
jbtrystram
left a comment
There was a problem hiding this comment.
Awesome !
Just a couple of questions :)
| - `rhel-9.6`: RHEL 9.6-based CoreOS; without OpenShift components. | ||
| - `ocp-rhel-9.6`: RHEL 9.6-based CoreOS; including OpenShift components. | ||
| - `c9s`: CentOS Stream-based CoreOS, without OKD components. | ||
| - `okd-c9s`: CentOS Stream-based CoreOS, including OpenShift components. This |
There was a problem hiding this comment.
I still see the okd-c9s variant used in the okd/scos build pipeline [1] run in MOC and more specifically in the latest commit okd-project/okd-coreos-pipeline@d4be53e for 4.19.
But according to openshift/release#62296 the scos imagestream (to be used as node image) is now populated by the OpenShift CI itself instead of the MOC pipeline.
So maybe we should decommission the MOC pipeline [2] before merging this patch ? What do you think @Prashanth684 ? It's not a blocker though, the MOC builds would just fail and can be deal as a follow-up.
[1] https://github.com/search?q=repo%3Aokd-project%2Fokd-coreos-pipeline%20okd-c9s&type=code
[2] https://github.com/okd-project/okd-coreos-pipeline
There was a problem hiding this comment.
The okd-c9s "variant" used in that pipeline is for the extensions build, not the base OS AFAIK.
That said, there is indeed a small cleanup possible there which is that it no longer needs to provide a VARIANT argument to the extensions build since that's auto-detected now.
There was a problem hiding this comment.
So maybe we should decommission the MOC pipeline [2] before merging this patch ? What do you think @Prashanth684 ? It's not a blocker though, the MOC builds would just fail and can be deal as a follow-up.
Correct. MOC is only used for 4.18. Once we release 4.19 as stable, we will stop those also. We are working to migrate off MOC (we still do OKD release promotions from there) to an internal cluster.
|
It looks like the CI base image does not contain the |
Yeah, it needed coreos/coreos-assembler#4054. That's merged now but there's no point retriggering the tests until coreos/rpm-ostree#5351 percolates down too. We're working on that. |
|
/retest |
|
OK, we have new enough rpm-ostree and cosa now. Let's try this out! |
|
@Prashanth684 Can you take of this bit once this lands:
|
dustymabe
left a comment
There was a problem hiding this comment.
mostly LGTM - some comments
| # buildah doesn't seem to support heredoc output | ||
| # redirection like buildkit so do it manually here |
There was a problem hiding this comment.
is there a bug for this we can link to?
There was a problem hiding this comment.
I didn't find one, but I could file one I guess.
Ahhh OK, just saw your other comment below related to this. The feature I'm talking about here isn't just bash redirection, but a RUN feature so you can write e.g.
RUN <<EOF > /out
echo foobar
and it'll go to /out. Which would've been perfect for our use case here.
| MANIFEST="manifest-rhel-9.6.yaml" | ||
| EXTENSIONS="extensions-ocp-rhel-9.6.yaml" |
There was a problem hiding this comment.
Does this imply these paths will need to be updated each time we say bump to the next version of RHEL? (i.e. RHEL 9.7 or 9.8 are here?)
I would kind of prefer that we didn't have to remember to update these variables when that happens.
There was a problem hiding this comment.
Not 9.7, but 9.8 yeah. It would be part of all the tree updates we do after branching for 4.22 I guess? When bumping the base variant from rhel-9.6 to rhel-9.8. Though by then we should be on top of rhel-bootc which likely means we can rework this too since it's then driven by the base RHEL version of the rhel-bootc image.
|
Hmm, the builder is choking on the heredocs. I think possibly the Dockerfile parser there (which happens before it's handed off to buildah) is getting tripped up on something. :( I'll dig into this a bit, but if it can't be easily worked around, I think I'll just add a commit for now that moves the heredocs to shell scripts for now. c9s builds are failing on which I have no idea why. It's using new enough cosa with coreos/coreos-assembler#4054. And using that new cosa locally I can't reproduce this. |
joelcapitao
left a comment
There was a problem hiding this comment.
See my inline comment, sorry I missed it in my previous review.
Otherwise LGTM 👍
It's because the latest run CI job pull cosa commit b45a4066b16a2332517f659111d4f474372f77d5 [1] and not the latest one with the changes we want coreos/coreos-assembler@ae0e86a Maybe some cache issue or race condition issue ? I'm not yet familiar with the CI workflow to be sure, maybe a simple /retest should work |
Hmm forget about it, wrong assumption, the codebase up to b45a4 contains the changes we want from https://github.com/coreos/coreos-assembler/pull/4054/commits |
|
OK, coreos/coreos-assembler#4059 should fix the GPG issue! |
|
/retest |
100d9f8 to
ebd3ff7
Compare
Now that (1) we've reworked the layered node image build to only enable the repos it needs, and (2) we've simplified the CentOS Stream GPG keys, we can delete all of the complex logic in this repo. It basically just boils down to curl'ing down all the repo files we may need to build the various artifacts that use this script.
We only want certain packages to come from the 4.19 plashet. And we can't just rely on NVRs because the plashet may sometimes win. Long-term we should sever that dependence on ART packages, but for now, let's add a hack to essentially generate a repo on the fly from the 4.19 repo with the filters we need. The advantage of doing it this way instead of e.g. in the `get-ocp-repo.sh` script is that this applies both in CI and locally.
The OCP builder API path isn't parsing the heredoc correctly for some
reason:
error: build error: EOF: unterminated heredoc
This will be fixed by openshift/builder#469.
Anyway, just work around this for now by moving all the logic to
scripts. It does make the Containerfiles cleaner at least now that it
has gotten so larger and we get syntax highlighting, ShellCheck, etc...
so probably for the best.
Before we inherited this from the ocp-rhel-9.6 manifest. But now that we're inheriting from the rhel-9.6 manifest, that repo isn't enabled by default there since it's not strictly needed (because we don't ship openvswitch in the base). So we need to enable it here ourselves.
|
OK, the OKD node image build is failing on
which I know why. Got a fix for that, but let's see if CI for the other tests pass. If they do, then let's get this in, and I'll add my fix to #1498 instead when I rebase it. |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dustymabe, jbtrystram, jlebon The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
This will known fail. It'll be fixed in #1498. /override ci/prow/okd-scos-images OK, as for the RHCOS failure, it seems to have just... timed out? Nothing in particular in the journal logs. Last operation was So possibly a genuine timeout because of e.g. slow I/O. Anyway, don't think it needs to block this. It'll rerun in #1498. |
|
@jlebon: Overrode contexts on behalf of jlebon: ci/prow/okd-scos-images DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
@jlebon: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
We now need to support both EL9 and EL10. Using the conditionnal includes for treefiles added in [1], update `osversion` to contain the variant (centos/rhel) and the major version. This allows the layered build to source `/etc/release` and include the correct repos. Update denylist entries to matcht that. [1] openshift#1780
We now need to support both EL9 and EL10. Using the conditionnal includes for treefiles added in [1], update `osversion` to contain the variant (centos/rhel) and the major version. This allows the layered build to source `/etc/release` and include the correct repos. Update denylist entries to matcht that. [1] openshift#1780
We now need to support both EL9 and EL10. Using the conditionnal includes for treefiles added in [1], update `osversion` to contain the variant (centos/rhel) and the major version. This allows the layered build to source `/etc/release` and include the correct repos. Update denylist entries to matcht that. [1] openshift#1780
We now need to support both EL9 and EL10. Using the conditionnal includes for treefiles added in [1], update `osversion` to contain the variant (centos/rhel) and the major version. This allows the layered build to source `/etc/release` and include the correct repos. Update denylist entries to matcht that. [1] openshift#1780
We now need to support both EL9 and EL10. Using the conditionnal includes for treefiles added in [1], update `osversion` to contain the variant (centos/rhel) and the major version. This allows the layered build to source `/etc/release` and include the correct repos. Update denylist entries to matcht that. [1] openshift#1780
We now need to support both EL9 and EL10. Using the conditionnal includes for treefiles added in [1], update `osversion` to contain the variant (centos/rhel) and the major version. This allows the layered build to source `/etc/release` and include the correct repos. Update denylist entries to matcht that. [1] openshift#1780
We now need to support both EL9 and EL10. Using the conditionnal includes for treefiles added in [1], update `osversion` to contain the variant (centos/rhel) and the major version. This allows the layered build to source `/etc/release` and include the correct repos. Update denylist entries to matcht that. [1] openshift#1780
We now need to support both EL9 and EL10. Using the conditionnal includes for treefiles added in [1], update `osversion` to contain the variant (centos/rhel) and the major version. This allows the layered build to source `/etc/release` and include the correct repos. Update denylist entries to matcht that. [1] openshift#1780
Since [1], the extension build detects automatically the release based on the builder image [2]. So this argument can be removed. [1] openshift/os#1780 [2] https://github.com/openshift/os/blob/master/extensions/build.sh#L9-L23
Since [1], the extension build detects automatically the release based on the builder image [2]. So this argument can be removed. [1] openshift/os#1780 [2] https://github.com/openshift/os/blob/master/extensions/build.sh#L9-L23
Since [1], the extension build detects automatically the release based on the builder image [2]. So this argument can be removed. [1] openshift/os#1780 [2] https://github.com/openshift/os/blob/master/extensions/build.sh#L9-L23
Since [1], the extension build detects automatically the release based on the builder image [2]. So this argument can be removed. [1] openshift/os#1780 [2] https://github.com/openshift/os/blob/master/extensions/build.sh#L9-L23
See individual commit messages.