-
Notifications
You must be signed in to change notification settings - Fork 113
CP/DP Update non-functional tests #3305
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: change/control-data-plane-split
Are you sure you want to change the base?
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## change/control-data-plane-split #3305 +/- ##
===================================================================
- Coverage 89.74% 86.49% -3.26%
===================================================================
Files 109 125 +16
Lines 11150 14453 +3303
Branches 50 62 +12
===================================================================
+ Hits 10007 12501 +2494
- Misses 1083 1813 +730
- Partials 60 139 +79 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
tests/suite/scale_test.go
Outdated
ctx, cancel := context.WithTimeout(context.Background(), timeoutConfig.UpdateTimeout) | ||
|
||
time.Sleep(terminationTime) | ||
Expect(resourceManager.WaitForGatewayObservedGeneration(ctx, ns.Name, "gateway", currentGen)).To(Succeed()) | ||
|
||
cancel() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed this check because I was running into errors where this wouldn't pass. When I tested things out, it would fail when the current NGF leader gets scaled down. My hypothesis is that this has to do with leader election and the agent connecting to the non-leader.
When the current leader (who the agent is connected to also) scales down and another NGF pod becomes leader, agent will connect to a random NGF pod. And thus the Gateway object won't have its statuses updated. @sjberman Am I remembering this correctly? Or should the Gateway object still be able to have its statuses updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The expectation of this test needs to change slightly. The idea was that we send traffic to nginx, then scale it down and there should be minimal disruption to traffic. So we really should be scaling down nginx, not the control plane.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
Will post a picture of the updated longevity dashboard soon. |
tests/Makefile
Outdated
@@ -44,7 +44,7 @@ build-test-runner-image: ## Build conformance test runner image | |||
|
|||
.PHONY: build-crossplane-image | |||
build-crossplane-image: ## Build the crossplane image | |||
docker build --build-arg NGINX_CONF_DIR=$(NGINX_CONF_DIR) -t nginx-crossplane:latest -f framework/crossplane/Dockerfile .. | |||
docker build --platform linux/$(GOARCH) --build-arg NGINX_CONF_DIR=$(NGINX_CONF_DIR) -t nginx-crossplane:latest -f framework/crossplane/Dockerfile .. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If running functional tests locally on a mac, won't this fail? Try replacing linux
with GOOS
and see if that works for both local and GKE VM tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed it and verified it works for local and gke tests
tests/suite/scale_test.go
Outdated
ctx, cancel := context.WithTimeout(context.Background(), timeoutConfig.UpdateTimeout) | ||
|
||
time.Sleep(terminationTime) | ||
Expect(resourceManager.WaitForGatewayObservedGeneration(ctx, ns.Name, "gateway", currentGen)).To(Succeed()) | ||
|
||
cancel() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The expectation of this test needs to change slightly. The idea was that we send traffic to nginx, then scale it down and there should be minimal disruption to traffic. So we really should be scaling down nginx, not the control plane.
Created #3311 to track enabling upgrade nfr test for 2.1 release. |
Here is what the longevity dashboard looks like now, notably we are missing all of the nginx specific metrics. We now only have access the container cpu and memory. Added an additional possible acceptance criteria to #1744 which describes why and would we could do to recover more graphs. |
Proposed changes
Update non-functional tests for the control plane data plane split.
Problem: The non-functional tests do not work for the control plane data plane split changes.
Solution: Update non-functional tests.
Testing: Scale, Reconfiguration, Performance, and Longevity tests work. Upgrade test doesn't work, however that is sort of planned since the CP/DP split is a breaking change of NGF and thus you can't easily upgrade with zero downtime.
Closes #3010
#2374
Checklist
Before creating a PR, run through this checklist and mark each as complete.
Release notes
If this PR introduces a change that affects users and needs to be mentioned in the release notes,
please add a brief note that summarizes the change.