@openshift-cherrypick-robot

This is an automated cherry-pick of #12492

/assign theobarberbany

Install the Azure and GCP image registry credential providers, which are
required from 4.16.
@openshift-ci-robot

openshift-ci-robot commented Apr 9, 2024

@openshift-cherrypick-robot: Ignoring requests to cherry-pick non-bug issues: OCPCLOUD-2484, OCPCLOUD-2481

Details

In response to this:

This is an automated cherry-pick of #12492

/assign theobarberbany

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@theobarberbany
Contributor

theobarberbany commented Apr 9, 2024

/retitle [release-4.15] OCPCLOUD-2484,OCPCLOUD-2481,OCPBUGS-30970: Adds azure and gcp image credential providers

@openshift-ci openshift-ci bot changed the title [release-4.15] OCPCLOUD-2484,OCPCLOUD-2481: Adds azure and gcp image credential providers [release-4.15] OCPCLOUD-2484,OCPCLOUD-2481,OCPBUGS-30970: Adds azure and gcp image credential providers Apr 9, 2024
@openshift-ci-robot openshift-ci-robot added jira/severity-important Referenced Jira bug's severity is important for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Apr 9, 2024
@openshift-ci-robot

openshift-ci-robot commented Apr 9, 2024

@openshift-cherrypick-robot: This pull request references OCPCLOUD-2484 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target either version "4.15." or "openshift-4.15.", but it targets "4.16" instead.

This pull request references OCPCLOUD-2481 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target either version "4.15." or "openshift-4.15.", but it targets "4.16" instead.

This pull request references Jira Issue OCPBUGS-30970, which is invalid:

  • expected the bug to target the "4.15.z" version, but no target version was set
  • expected Jira Issue OCPBUGS-30970 to depend on a bug targeting a version in 4.16.0 and in one of the following states: VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA), but no dependents were found

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

@openshift-ci openshift-ci bot requested review from barbacbd and patrickdillon April 9, 2024 09:17
@theobarberbany
Contributor

/retitle [release-4.15] OCPBUGS-30970: Adds azure and gcp image credential providers

@openshift-ci openshift-ci bot changed the title [release-4.15] OCPCLOUD-2484,OCPCLOUD-2481,OCPBUGS-30970: Adds azure and gcp image credential providers [release-4.15] OCPBUGS-30970: Adds azure and gcp image credential providers Apr 9, 2024
@theobarberbany
Contributor

This adds ose-azure-acr-image-credential-provider and ose-gcp-gcr-image-credential-provider to release-4.15. This means the packages are already available when the upgrade from 4.15 -> 4.16 takes place.

Currently, kubelet can fail to start:

7755 kuberuntime_manager.go:273] "Failed to register CRI auth plugins" err="plugin binary executable /usr/libexec/kubelet-image-credential-provider-plugins/acr-credential-provider did not exist"

This is because the RHEL worker upgrade can happen after the cluster upgrade.
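
A minimal sketch of a check for this failure mode, assuming the plugin directory from the kubelet log above. The helper name `missing_plugins` is hypothetical, and only `acr-credential-provider` is named in the log; any other binary name passed in is an assumption.

```python
import os

# Plugin directory taken from the kubelet error message quoted above.
PLUGIN_DIR = "/usr/libexec/kubelet-image-credential-provider-plugins"


def missing_plugins(names, plugin_dir=PLUGIN_DIR):
    """Return the plugin names that are not executable files in plugin_dir.

    Hypothetical helper for illustration; it is not part of openshift-ansible.
    """
    missing = []
    for name in names:
        path = os.path.join(plugin_dir, name)
        # kubelet reports an error when the binary does not exist, so treat
        # anything that is not an executable regular file as missing.
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            missing.append(name)
    return missing


# On a RHEL worker, an empty list means the ACR plugin binary is in place.
print(missing_plugins(["acr-credential-provider"]))
```

Running this on a RHEL worker before the cluster upgrade would show whether the binary kubelet looks for is already present.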

@gpei
Contributor

gpei commented Apr 9, 2024

/label cherry-pick-approved

@openshift-ci openshift-ci bot added the cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. label Apr 9, 2024
@gpei
Contributor

gpei commented Apr 9, 2024

@theobarberbany the e2e-aws-workers-rhel8 job failed because the two packages are not available in the 4.15 OCP repo.
I checked in Brew, and the ART team only built the two packages for 4.16:
https://brewweb.engineering.redhat.com/brew/packageinfo?packageID=85785
https://brewweb.engineering.redhat.com/brew/packageinfo?packageID=85784
So we don't have them available in the 4.15 OCP repo.
Do you want me to create a ticket asking the ART team for a 4.15 build?

@gpei
Contributor

gpei commented Apr 9, 2024

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 9, 2024
@theobarberbany
Contributor

theobarberbany commented Apr 9, 2024

@gpei That would be great. The choice to not have them in 4.15 was deliberate, but seemingly the wrong one now.

When I did this work, I wasn't too familiar with how the upgrade process works and assumed that the workers would always be upgraded first, avoiding this problem!

If you slack it to me I will message the member of ART team who dealt with this previously, as they will have context :)

@gpei
Contributor

gpei commented Apr 10, 2024

assumed that the workers would always be upgraded first

Yeah, that's true for RHCOS workers, but RHEL workers are an exception: they're not completely managed by the MCO, which makes this troublesome.

@barbacbd
Contributor

/jira refresh

@openshift-bot

@barbacbd: This pull request references Jira Issue OCPBUGS-30970, which is invalid:

  • release note text must be set and not match the template OR release note type must be set to "Release Note Not Required"
  • expected dependent Jira Issue OCPBUGS-32057 to be in one of the following states: VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA), but it is MODIFIED instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

@barbacbd
Contributor

/jira refresh

@openshift-bot openshift-bot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Apr 10, 2024
@openshift-bot

@barbacbd: This pull request references Jira Issue OCPBUGS-30970, which is valid. The bug has been moved to the POST state.

9 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.15.z) matches configured target version for branch (4.15.z)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)
  • release note text is set and does not match the template
  • dependent bug Jira Issue OCPBUGS-32057 is in the state Closed (Done), which is one of the valid states (VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA))
  • dependent bug Jira Issue OCPBUGS-32057 is in the state Closed (Done), which is one of the valid states (VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA))
  • dependent Jira Issue OCPBUGS-32057 targets the "4.16.0" version, which is one of the valid target versions: 4.16.0
  • dependent Jira Issue OCPBUGS-32057 targets the "4.16.0" version, which is one of the valid target versions: 4.16.0
  • bug has dependents

Requesting review from QA contact:
/cc @sunzhaohua2

@openshift-ci
Contributor

openshift-ci bot commented Apr 15, 2024

@barbacbd: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

  • /test e2e-aws-workers-rhel8
  • /test images
  • /test unit

Use /test all to run all jobs.

Details

In response to this:

/test ci/prow/e2e-aws-workers-rhel8

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@gpei
Contributor

gpei commented Apr 16, 2024

@barbacbd just FYI, it's still pending on ticket https://issues.redhat.com/browse/ART-9396

@theobarberbany
Contributor

/test e2e-aws-workers-rhel8

@theobarberbany
Contributor

/test e2e-aws-workers-rhel8

@openshift-ci
Contributor

openshift-ci bot commented Apr 19, 2024

@openshift-cherrypick-robot: all tests passed!

Full PR test history. Your PR dashboard.

@barbacbd
Contributor

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 19, 2024
Contributor

@barbacbd barbacbd left a comment

/lgtm
/approve

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Apr 19, 2024
@openshift-ci
Contributor

openshift-ci bot commented Apr 19, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: barbacbd, openshift-cherrypick-robot

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 19, 2024
@theobarberbany
Contributor

/label backport-risk-assessed

@openshift-ci
Contributor

openshift-ci bot commented Apr 19, 2024

@theobarberbany: Can not set label backport-risk-assessed: Must be member in one of these teams: []

@sdodson
Member

sdodson commented Apr 19, 2024

/label backport-risk-assessed

@openshift-ci
Contributor

openshift-ci bot commented Apr 19, 2024

@sdodson: Can not set label backport-risk-assessed: Must be member in one of these teams: []

@sdodson sdodson added the backport-risk-assessed Indicates a PR to a release branch has been evaluated and considered safe to accept. label Apr 19, 2024
@openshift-merge-bot openshift-merge-bot bot merged commit 54692e9 into openshift:release-4.15 Apr 19, 2024
@openshift-ci-robot

@openshift-cherrypick-robot: Jira Issue OCPBUGS-30970: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-30970 has been moved to the MODIFIED state.

@gpei
Contributor

gpei commented Jun 5, 2025

/cherry-pick release-4.14

@openshift-cherrypick-robot
Author

@gpei: new pull request created: #12531

Details

In response to this:

/cherry-pick release-4.14

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@gpei
Contributor

gpei commented Jun 5, 2025

While debugging the 4.14 to 4.16 CPOU upgrade test with RHEL workers in openshift/release#65478, I realized that we might need to install these two packages in 4.14 too. If we upgrade the RHEL node directly from 4.14 to 4.16 in the CPOU upgrade workflow, the kubelet service will fail to start on Azure/GCP because the two packages are absent.
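
The version skew described here can be sketched as a small model. It is a simplification that assumes, per this thread, that the RHEL worker upgrade installs the two packages from 4.15 onward (this PR) and that kubelet requires the plugin binaries from 4.16. The function name and constants are hypothetical and do not correspond to any OpenShift API.

```python
# Assumptions taken from this thread, not from any OpenShift API:
# - the RHEL worker upgrade installs the credential provider packages
#   starting with 4.15 (this PR), and
# - kubelet requires the plugin binaries starting with 4.16.
FIRST_MINOR_INSTALLING = 15
FIRST_MINOR_REQUIRING = 16


def kubelet_at_risk(worker_minor, cluster_minor,
                    first_installing=FIRST_MINOR_INSTALLING,
                    first_requiring=FIRST_MINOR_REQUIRING):
    """Return True if a RHEL worker last upgraded at 4.<worker_minor> lacks
    packages that a cluster at 4.<cluster_minor> expects kubelet to find."""
    packages_present = worker_minor >= first_installing
    packages_required = cluster_minor >= first_requiring
    return packages_required and not packages_present


# CPOU: workers stay on 4.14 while the cluster moves to 4.16.
print(kubelet_at_risk(14, 16))
# Step-by-step: workers were upgraded to 4.15 before the cluster reached 4.16.
print(kubelet_at_risk(15, 16))
```

Passing `first_installing=14` models the proposed 4.14 backport, under which the 4.14 to 4.16 path would no longer be flagged.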

@theobarberbany
Contributor

theobarberbany commented Jun 5, 2025

I think that makes sense.... do we allow customers to jump 4.14 -> 4.16 without going through 4.15?

Now I have to remember all the context around this. I'm not sure we're building azure & gcp credential providers for 4.14...

edit: i've just seen the PR + ticket requesting builds :) those will need .spec file backports

@gpei can you check if it works if upgrading 4.14 -> 4.15 -> 4.16?

@gpei
Contributor

gpei commented Jun 5, 2025

@theobarberbany hi, thanks for looking into this

I think that makes sense.... do we allow customers to jump 4.14 -> 4.16 without going through 4.15?

This is possible with the Control Plane Only update (previously known as an EUS-to-EUS update):
https://docs.redhat.com/en/documentation/openshift_container_platform/4.16/html/updating_clusters/performing-a-cluster-update#control-plane-only-update
In this workflow, we can update the workers from 4.y to 4.y+2 directly, and this is the scenario where we encountered the error.

To update your worker nodes to <4.y+2>, unpause all previously paused machine config pools by running the following command:
oc patch mcp/worker --type merge --patch '{"spec":{"paused":false}}'

Now I have to remember all the context around this. I'm not sure we're building azure & gcp credential providers for 4.14...

edit: i've just seen the PR + ticket requesting builds :) those will need .spec file backports

@gpei can you check if it works if upgrading 4.14 -> 4.15 -> 4.16?

Yeah, I think it should work for such a step-by-step upgrade (I haven't tested it yet and we don't have a CI job for it, but I can give it a try later). In 4.15 we don't require those packages yet, and while upgrading to 4.15 the packages would be installed on the RHEL node, so it would then be safe to move to 4.16.

For example, in the RHEL worker upgrade step from 4.14 to 4.15
https://gcsweb-qe-private-deck-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/qe-private-deck/logs/periodic-ci-openshift-openshift-tests-private-release-4.15-amd64-nightly-4.15-upgrade-from-stable-4.14-azure-ipi-proxy-workers-rhcos-rhel8-f28/1921315176863240192/artifacts/azure-ipi-proxy-workers-rhcos-rhel8-f28/cucushift-upgrade-rhel-worker/build-log.txt
the packages were installed:

"Installed: ose-aws-ecr-image-credential-provider-4.15.0-202504121004.p0.gfd77d92.assembly.stream.el8.x86_64",
"Installed: ose-azure-acr-image-credential-provider-4.15.0-202504121004.p0.g0d799a2.assembly.stream.el8.x86_64",
"Installed: ose-gcp-gcr-image-credential-provider-4.15.0-202504121004.p0.gfc50272.assembly.stream.el8.x86_64",

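A rough sketch of how the same check could be done against a build log, assuming the log contains `Installed: <package>-<version>` entries like the three quoted above. The regular expression and the function name are assumptions made for illustration; the sample text is copied from the log excerpt in this comment.

```python
import re

# Matches names such as "ose-azure-acr-image-credential-provider" in
# "Installed: <name>-<version>..." entries. The pattern is an assumption
# based only on the three log lines quoted in this comment.
INSTALLED_RE = re.compile(r"Installed: (ose-[a-z-]+-image-credential-provider)-\d")


def installed_providers(log_text):
    """Return the sorted, de-duplicated credential provider packages in log_text."""
    return sorted(set(INSTALLED_RE.findall(log_text)))


SAMPLE_LOG = (
    '"Installed: ose-aws-ecr-image-credential-provider-4.15.0-202504121004.p0.gfd77d92.assembly.stream.el8.x86_64", '
    '"Installed: ose-azure-acr-image-credential-provider-4.15.0-202504121004.p0.g0d799a2.assembly.stream.el8.x86_64", '
    '"Installed: ose-gcp-gcr-image-credential-provider-4.15.0-202504121004.p0.gfc50272.assembly.stream.el8.x86_64",'
)

for package in installed_providers(SAMPLE_LOG):
    print(package)
```
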
@theobarberbany
Contributor

ah ok - this makes sense.

Currently, we don't have the required .spec files in cloud-provider-azure and cloud-provider-gcp for builds to be made by art. I'll need to backport those. Have you got a bug to gather these under?

@gpei
Contributor

gpei commented Jun 5, 2025

Currently, we don't have the required .spec files in cloud-provider-azure and cloud-provider-gcp for builds to be made by art

I wasn't aware of this. For the 4.14.z bug, the bot just cloned https://issues.redhat.com//browse/OCPBUGS-57111 for the openshift-ansible PR backport.

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. backport-risk-assessed Indicates a PR to a release branch has been evaluated and considered safe to accept. cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. jira/severity-important Referenced Jira bug's severity is important for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.

7 participants