-
Notifications
You must be signed in to change notification settings - Fork 18
test: refactor operatorpolicy tests to run faster #419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
test: refactor operatorpolicy tests to run faster #419
Conversation
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jan-law The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
e3ead21 to
4fb0eb9
Compare
| @@ -672,22 +832,28 @@ var _ = Describe("Testing OperatorPolicy", Ordered, Label("supports-hosted"), fu | |||
| ) | |||
| }) | |||
| }) | |||
|
|
|||
| Describe("Testing Subscription behavior for musthave mode while enforcing", Ordered, func() { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test flakes occasionally. Both times have the same events, passed the Eventually() in checkFunc, but failed at Consistently(). The last It() step in the test intentionally breaks the spec/subscription/sourceNamespace, but then an install plan starts installing during Consistently(). The subscription status changes to Compliant instead of remaining at the expected ConstraintsNotSatisfiable.
Update: The previous step always installs the install plan and changes the subscription condition if the patch in the last step does not run fast enough. Thinking of ways to handle this
{
"lastTransitionTime": "2025-12-04T19:14:57Z",
"message": "the Subscription matches what is required by the policy",
"reason": "SubscriptionMatches",
"status": "True",
"type": "SubscriptionCompliant"
},
{
"lastTransitionTime": "2025-12-04T19:14:23Z",
"message": "the policy spec is valid",
"reason": "PolicyValidated",
"status": "True",
"type": "ValidPolicySpec"
}
],
"history": [
{
"lastTimestamp": "2025-12-04T19:14:58.015269Z", # unexpected
"message": "NonCompliant; the policy spec is valid, the OperatorGroup matches what is required by the policy, the Subscription matches what is required by the policy, a relevant InstallPlan is actively installing, the ClusterServiceVersion required by the policy was not found, no CRDs were found for the operator, there are no relevant deployments because the ClusterServiceVersion is missing, CatalogSource 'operatorhubio-catalog' was not found"
},
{
"lastTimestamp": "2025-12-04T19:14:55.701762Z", # this should be the last message
"message": "NonCompliant; the policy spec is valid, the OperatorGroup matches what is required by the policy, constraints not satisfiable: refer to the Subscription for more details, there are no relevant InstallPlans in the namespace, the ClusterServiceVersion required by the policy was not found, no CRDs were found for the operator, there are no relevant deployments because the ClusterServiceVersion is missing, CatalogSource 'operatorhubio-catalog' was not found"
},
...
wanted related objects: [{Properties:<nil> Object:{Metadata:{Name:project-quay Namespace:operator-policy-testns-124d3267} APIVersion:operators.coreos.com/v1alpha1 Kind:Subscription} Compliant:NonCompliant Reason:ConstraintsNotSatisfiable}]
wanted condition: {Type:SubscriptionCompliant Status:False ObservedGeneration:0 LastTransitionTime:0001-01-01 00:00:00 +0000 UTC Reason:ConstraintsNotSatisfiable Message:constraints not satisfiable: refer to the Subscription for more details}
...
[FAILED] Failed after 2.327s.
The function passed to Consistently failed at /home/runner/work/config-policy-controller/config-policy-controller/test/e2e/case38_install_operator_test.go:169 with:
Expected
<string>: Compliant
to equal
<string>: NonCompliant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going to skip the Consistently() check here
| utils.Kubectl("patch", "operatorpolicy", opPolName, "-n", testNamespace, "--type=json", "-p", | ||
| `[{"op": "replace", "path": "/spec/subscription/startingCSV", "value": "`+goodVersion+`"},`+ | ||
| `{"op": "replace", "path": "/spec/remediationAction", "value": "inform"}]`) | ||
| KubectlTarget("patch", "subscription.operator", subName, "-n", opPolTestNS, "--type=json", "-p", | ||
| `[{"op": "replace", "path": "/spec/startingCSV", "value": "`+goodVersion+`"}]`) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On one test flake, the logs indicated the subscription wasn't patched, then timed out. Hopefully retrying helps. https://github.com/open-cluster-management-io/config-policy-controller/actions/runs/19972532656/job/57291714147#logs
2025-12-05T21:04:04.310Z debug controllers/operatorpolicy_controller.go:1711 ..."controllerKind": "OperatorPolicy", "OperatorPolicy": {"name":"oppol-manual-upgrades-e8e934d1","namespace":"managed"}, ... "subscriptionConditionMessage": "constraints not satisfiable: no operators found with name strimzi-cluster-operator.v0.0.0.1337 in channel strimzi-0.36.x of package strimzi-kafka-operator in the catalog referenced by subscription strimzi-kafka-operator, subscription strimzi-kafka-operator exists"}
9d7f873 to
0a659eb
Compare
f1b2e55 to
7c7d9ef
Compare
af29cc2 to
032ae67
Compare
| deploymentName = "apicast-operator-controller-manager-v2" | ||
| crdName = "apicasts.apps.3scale.net" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: Anyone who add tests to this file needs to be very careful if they decide to create an APIcast or Authorino operator, as I've used those for cluster-scoped tests. As far as I know, the other operators in the tests can be reused in parallel
7b3a27f to
5adf9c1
Compare
These tests intentionally break the CatalogSource and expect the PackageManifests to disappear. Run as Serial to prevent interference from tests that frequently query the PackageManifest API Reduce polling frequency to prevent interference from cached results ref: https://issues.redhat.com/browse/ACM-23597 Signed-off-by: Janelle Law <[email protected]>
ref: https://issues.redhat.com/browse/ACM-26529 Signed-off-by: Janelle Law <[email protected]>
Prepare to refactor the tests that enforce an operatorpolicy for the same operator while relying on the default operator group behavior ref: https://issues.redhat.com/browse/ACM-26529 Signed-off-by: Janelle Law <[email protected]>
These tests deploy the same operator with no OperatorGroup, which defaults to AllNamespaces mode. We need to run these tests one at a time relative to each other. Otherwise, the status displays unexpected messages about the same operator managing resources in another test's namespace. ref: https://issues.redhat.com/browse/ACM-26529 Signed-off-by: Janelle Law <[email protected]>
Prevent test flakes when cluster-scoped operator installs at the same time as a duplicate namespaced-scoped operator in another test. ref: https://issues.redhat.com/browse/ACM-26529 Signed-off-by: Janelle Law <[email protected]>
Move tests that delete and recreate an operator crd to the same location ref: https://issues.redhat.com/browse/ACM-26529 Signed-off-by: Janelle Law <[email protected]>
These tests delete and recreate the same CRD. They will need to run one at a time relative to each other. ref: https://issues.redhat.com/browse/ACM-26529 Signed-off-by: Janelle Law <[email protected]>
5adf9c1 to
692f37e
Compare
Replaces the operator under test with another operator exclusively used for the crd tests. Prevents conflicts in other tests when the operator's crd is deleted as part of the crd test setup. ref: https://issues.redhat.com/browse/ACM-26529 Signed-off-by: Janelle Law <[email protected]>
Match the same catalog source cleanup logic as "Testing an all default operator policy" ref: https://issues.redhat.com/browse/ACM-26529 Signed-off-by: Janelle Law <[email protected]>
ref: https://issues.redhat.com/browse/ACM-26529 Signed-off-by: Janelle Law <[email protected]>
Retry the patch commands to prevent a test flake timeout when the patch commands hung without retrying ref: https://issues.redhat.com/browse/ACM-26529 Signed-off-by: Janelle Law <[email protected]>
Skip Consistently() checks when previous spec triggers async OLM subscription updates that interfere with current spec ref: https://issues.redhat.com/browse/ACM-26529 Signed-off-by: Janelle Law <[email protected]>
This test sometimes flakes by counting 2 extra reconciles from the previous spec. Wait to resolve these reconciles before continuing ref: https://issues.redhat.com/browse/ACM-26529 Signed-off-by: Janelle Law <[email protected]>
Changed one operator's scope to a single namespace and increased the timeouts for installation to complete. ref: https://issues.redhat.com/browse/ACM-26529 Signed-off-by: Janelle Law <[email protected]>
Notes
Aiming to reduce total e2e test time by 10min without adding more flakes
refactor tests ref: https://issues.redhat.com/browse/ACM-26529
test flake ref: https://issues.redhat.com/browse/ACM-23597
Warning
The commits "test: fix flaky reconcile counter" and "test: run crd tests with a different operator" reduce but do not eliminate these two test flakes. Any suggestions would be appreciated
Summary of changes
Ordereddecorator. Tests will run in parallel test processes unless marked withSerialDescribecontainer with its own parent policy and namespace suffixed with a hash of the test nameutils.KubectlJSONPatchToFileto update the namespaces and policy name suffixes before creating the resourcesexample-operatorwithAllNamespacesmode into anOrderedDescribecontainerOrderedDescribecontainer. Replaced the operator in these tests with a different operator and CRD than the other testsSerial: 2 tests that break the catalog source (including ACM-23597), and 1 test that counts controller reconcilesAssisted by Claude 4.5