KUBE-417: Add a way to do load testing of CC #169

Merged
Tsonov merged 22 commits into main from KUBE-417-loadtesting-cc
Mar 20, 2025

@Tsonov Tsonov commented Mar 14, 2025

Adds a "test server" mode to the CC (cluster controller) binary.
In this mode the binary does not act as the real CC; it acts as a mock cluster-hub (CH) instead. The goal for now is to be able to write a test scenario, execute it against a real cluster, and check the performance.

Under loadtest, the following is implemented:

  • An HTTP server that acts as a CH replacement.
  • A "test server" that scenarios push actions to; it handles the actual logic of sending actions and tracking acks.
  • Under scenario, the machinery for writing scenarios, plus 2 scenarios.
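
As a rough illustration of the "push actions and track acks" idea, here is a minimal in-memory sketch. It is not the code from this PR: the names Action, AckResult, TestServer, Push, Poll and Ack are all hypothetical, and the HTTP layer is left out (an HTTP handler serving CC would presumably call Poll when CC asks for actions and Ack when CC reports a result).

```go
package main

import (
	"fmt"
	"sync"
)

// Action is a hypothetical stand-in for an action handed to CC.
type Action struct {
	ID   string
	Type string // e.g. "CreateEvent" or "DrainNode"
}

// AckResult records how an action was acknowledged; Err is empty on success.
type AckResult struct {
	ActionID string
	Err      string
}

// TestServer (hypothetical) queues actions pushed by a scenario and tracks acks.
type TestServer struct {
	mu      sync.Mutex
	queue   []Action                  // pushed, not yet handed out
	pending map[string]chan AckResult // waiting for an ack, keyed by action ID
}

func NewTestServer() *TestServer {
	return &TestServer{pending: make(map[string]chan AckResult)}
}

// Push enqueues an action and returns a channel that receives its ack.
func (s *TestServer) Push(a Action) <-chan AckResult {
	ch := make(chan AckResult, 1) // buffered so Ack never blocks
	s.mu.Lock()
	defer s.mu.Unlock()
	s.queue = append(s.queue, a)
	s.pending[a.ID] = ch
	return ch
}

// Poll removes and returns up to limit queued actions.
func (s *TestServer) Poll(limit int) []Action {
	s.mu.Lock()
	defer s.mu.Unlock()
	if limit < 0 {
		limit = 0
	}
	if limit > len(s.queue) {
		limit = len(s.queue)
	}
	out := make([]Action, limit)
	copy(out, s.queue[:limit])
	s.queue = s.queue[limit:]
	return out
}

// Ack resolves a pending action; it reports false for unknown or repeated IDs.
func (s *TestServer) Ack(id, errMsg string) bool {
	s.mu.Lock()
	ch, ok := s.pending[id]
	delete(s.pending, id)
	s.mu.Unlock()
	if !ok {
		return false
	}
	ch <- AckResult{ActionID: id, Err: errMsg}
	return true
}

func main() {
	srv := NewTestServer()
	done := srv.Push(Action{ID: "a1", Type: "CreateEvent"})
	for _, a := range srv.Poll(10) {
		srv.Ack(a.ID, "") // in a real run, CC would send the ack over HTTP
	}
	fmt.Printf("%+v\n", <-done)
}
```

Returning a channel from Push lets a scenario block until a specific action is acknowledged, which is one simple way to measure end-to-end latency per action.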

Suggested reviewing order

  1. Start from loadtest -> http and castai
  2. Check scenarios/scenario.go and the sample scenarios
  3. Check how this is used under cmd/testserver

How to run this

  • Set your kubeconfig to point to the cluster you want to use.
  • Install kwok and run it against the cluster in a separate terminal.
    Sample command:
kwok \
 --kubeconfig=~/.kube/config \
 --manage-all-nodes=false \
 --manage-nodes-with-annotation-selector=kwok.x-k8s.io/node=fake-node \
 --node-lease-duration-seconds=40 \
 --cidr=10.0.0.1/24 \
 --node-ip=10.0.0.1
  • Start the test server.
    Sample command:
API_KEY=dummy API_URL=http://example.com CLUSTER_ID=D30A163C-C5DF-4CC8-985C-D1449398295E KUBECONFIG=~/.kube/config LOG_LEVEL=4 go run . test-server 
  • Start the cluster controller.
    Sample command:
API_KEY=dummy API_URL=http://localhost:8080 CLUSTER_ID=D30A163C-C5DF-4CC8-985C-D1449398295E KUBECONFIG=~/.kube/config LOG_LEVEL=4 go run .

This runs the cluster controller locally against the test server and simulates 2000 CreateEvent actions and 100 DrainNode actions in which a deployment prevents the drain.

Missing items, to be added in this PR:
[ ] Docs on how to run it
[ ] Handle cleanup when SIGTERM is sent to the server (right now cleanup does not run reliably)

Missing items, to be added in separate PR:

  • Expose some metrics from CC on completed/failed/success actions and maybe Go runtime metrics
  • A script and make command to start a local run or an in-cluster run. General polish for using the server.
  • Scenarios for other actions

Some decisions explained:

What is the expected usage of this?

  • Identify the scenario that you want to test and monitor CC under
  • Write scenario code
  • Run and validate assumptions (how much CPU and memory and how many goroutines it uses; whether it crashes; whether we reach QPS limits; etc.)

Why a new mode for the binary and not a separate binary?
It is easier to start with for now; it also avoids building a brand new binary/image, since the existing one can be reused.

Why expose an HTTP server for the load testing?
It makes it easy to test a deployed CC: deploy the "testserver" in the cluster and change the URL that CC points to.

Can this be used to run continuous load testing?
Yes: write a test that keeps pushing actions indefinitely. It might need some tweaks, though.

Can this be used to run scheduled tests?
Yes, but that is out of scope for the current task.

@Tsonov Tsonov requested a review from a team as a code owner March 14, 2025 10:00
@Tsonov Tsonov force-pushed the KUBE-417-loadtesting-cc branch from c0f3fc5 to 80433c6 Compare March 20, 2025 08:15
@Tsonov Tsonov changed the title from "Draft: KUBE-417: Add a way to do load testing of CC" to "KUBE-417: Add a way to do load testing of CC" Mar 20, 2025
@Tsonov Tsonov merged commit 1e2798f into main Mar 20, 2025
4 checks passed
@Tsonov Tsonov deleted the KUBE-417-loadtesting-cc branch March 20, 2025 14:21

2 participants