-
Notifications
You must be signed in to change notification settings - Fork 2k
CORS-3991: Add private cluster coverage for aws pre-submit jobs #65357
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@jianlinliu: This pull request references CORS-3991 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.20.0" version, but no target version was set. DetailsIn response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/test ci-operator-config |
|
/pj-rehearse pull-ci-openshift-installer-release-4.20-e2e-aws-private |
|
@jianlinliu: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/pj-rehearse pull-ci-openshift-installer-release-4.20-e2e-aws-private |
|
@jianlinliu: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
cc @tthvo and @yunjiang29 to review. For the failure in the e2e job, sounds like some e2e test cases are not applicable for private clusters, or those e2e cases need to update to work with private clusters. I am not sure if that is blocker to merge this PR. |
tthvo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the failure in the e2e job, sounds like some e2e test cases are not applicable for private clusters, or those e2e cases need to update to work with private clusters
Right, if I understand it correctly, some LoadBalancer Services created by tests are external so no subnets (i.e. all private) cannot support such Services. And the CCM throws error below.
And those services should have service.beta.kubernetes.io/aws-load-balancer-internal: true if private cluster.
2025-05-27T07:52:48.779302762Z I0527 07:52:48.779254 1 controller.go:401] Ensuring load balancer for service e2e-service-lb-test-tqvbf/service-test
2025-05-27T07:52:48.779352103Z I0527 07:52:48.779314 1 aws.go:2153] EnsureLoadBalancer(kubernetes, e2e-service-lb-test-tqvbf, service-test, us-west-2, , [{ TCP 80 {0 80 } 32066}], map[service.beta.kubernetes.io/aws-load-balancer-healthcheck-healthy-threshold:2 service.beta.kubernetes.io/aws-load-balancer-healthcheck-interval:8 service.beta.kubernetes.io/aws-load-balancer-healthcheck-unhealthy-threshold:3])
2025-05-27T07:52:48.779440555Z I0527 07:52:48.779406 1 event.go:389] "Event occurred" object="e2e-service-lb-test-tqvbf/service-test" fieldPath="" kind="Service" apiVersion="v1" type="Normal" reason="EnsuringLoadBalancer" message="Ensuring load balancer"
2025-05-27T07:52:49.297880635Z I0527 07:52:49.297825 1 aws.go:1666] Ignoring private subnet for public ELB "subnet-0d4e48fbd321b5601"
2025-05-27T07:52:49.297880635Z I0527 07:52:49.297847 1 aws.go:1666] Ignoring private subnet for public ELB "subnet-0b210adbabc637755"
2025-05-27T07:52:49.297880635Z I0527 07:52:49.297853 1 aws.go:1666] Ignoring private subnet for public ELB "subnet-0bec46812ac6747b5"
2025-05-27T07:52:49.297909266Z E0527 07:52:49.297896 1 controller.go:301] "Unhandled Error" err="error processing service e2e-service-lb-test-tqvbf/service-test (retrying with exponential backoff): failed to ensure load balancer: could not find any suitable subnets for creating the ELB" logger="UnhandledError"
2025-05-27T07:52:49.297976907Z I0527 07:52:49.297936 1 event.go:389] "Event occurred" object="e2e-service-lb-test-tqvbf/service-test" fieldPath="" kind="Service" apiVersion="v1" type="Warning" reason="SyncLoadBalancerFailed" message="Error syncing load balancer: failed to ensure load balancer: could not find any suitable subnets for creating the ELB"
2025-05-27T07:57:20.722143960Z I0527 07:57:20.721314 1 controller.go:958] Removing finalizer from service e2e-deployment-8004/test-rolling-update-with-lb
Though, this looks good to me! Do we plan to have this for 4.19 too?
Thanks for your analysis, this reminds me we have a similar step
Yeah, that was the original plan. But during #64371, @jinyunma found some e2e bugs, and fixed them in openshift/origin#29759, while the PR was landed into 4.20, did not backport to 4.19 yet, so the plan get changed, I am going to drop 4.19, do you have any advice ? |
|
@jianlinliu, Interacting with pj-rehearseComment: Once you are satisfied with the results of the rehearsals, comment: |
|
/pj-rehearse pull-ci-openshift-installer-release-4.20-e2e-aws-private |
|
@jianlinliu: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
Oh right, it makes sense to hold off these jobs for 4.19 if openshift/origin#29759 is not yet backported. I guess we can revisit this when any bugs related to private clusters are found for 4.19 and we need testing. |
|
LGTM |
|
@jianlinliu: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
@tthvo After added And from your analysis, ccm is expecting public subnets, so the added steps did not help that. I think we can move forwards to get this merge, because the new job is optional, the e2e failure does not block any PR merge. Of course, if in the future, if we need the whole e2e test suite get passed, we can file jira issues to the individual component team to future enhancement. |
|
/pj-rehearse ack |
|
@jianlinliu: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
tthvo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve
Looks great!
|
@vrutkovs do you mind to review and approve this PR ? |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jianlinliu, tthvo, vrutkovs The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
18defac
into
openshift:master
No description provided.