Disable CC network resource goals when resource capacities are not set. #11465

Merged
scholzj merged 6 commits into strimzi:main from kyguy:disable-goals on Jul 23, 2025
Conversation

Member

@kyguy kyguy commented May 22, 2025

Type of change

  • Enhancement / new feature

Description

This PR adds logic to the Strimzi Cluster Operator that, by default, removes network-resource-related Cruise Control goals from the default.goals and hard.goals lists when the capacity configurations for those resources are not explicitly set in the .spec.kafka.resources or .spec.cruiseControl.brokerCapacity sections of the Kafka custom resource. The following Cruise Control goals are affected:

com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkInboundCapacityGoal
com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkOutboundCapacityGoal
com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkInboundUsageDistributionGoal
com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkOutboundUsageDistributionGoal
com.linkedin.kafka.cruisecontrol.analyzer.goals.PotentialNwOutGoal

This prevents users from balancing partitions based on network resources when no capacities for those resources are defined in their Kafka custom resource.
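
The filtering described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the class name `NetworkGoalFilter`, the method `filterGoals`, and the `networkCapacityConfigured` flag are all hypothetical, and the sketch assumes the goal list is handled as a list of fully qualified class names that is joined into a comma-separated string.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Illustrative sketch only: the class and method names are hypothetical,
// not the Cluster Operator's actual code.
class NetworkGoalFilter {
    private static final String GOAL_PACKAGE = "com.linkedin.kafka.cruisecontrol.analyzer.goals.";

    // The five network-related goals listed in the PR description.
    static final Set<String> NETWORK_GOALS = Set.of(
            GOAL_PACKAGE + "NetworkInboundCapacityGoal",
            GOAL_PACKAGE + "NetworkOutboundCapacityGoal",
            GOAL_PACKAGE + "NetworkInboundUsageDistributionGoal",
            GOAL_PACKAGE + "NetworkOutboundUsageDistributionGoal",
            GOAL_PACKAGE + "PotentialNwOutGoal");

    // Returns the goals as a comma-separated string, dropping the network
    // goals when no network capacity has been configured (assumed flag).
    static String filterGoals(List<String> goals, boolean networkCapacityConfigured) {
        return goals.stream()
                .filter(goal -> networkCapacityConfigured || !NETWORK_GOALS.contains(goal))
                .collect(Collectors.joining(","));
    }
}
```

The same filter would presumably be applied to both the default.goals and hard.goals lists; how the operator actually determines whether network capacity is configured is not shown here.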

Addresses the issue raised here: #11409

(EDITED): Updated to reduce scope to focus on network-related goals.

Checklist

Please go through this checklist and make sure all applicable tasks have been done

  • Write tests
  • Make sure all tests pass
  • Update documentation
  • Check RBAC rights for Kubernetes / OpenShift roles
  • Try your changes from Pod inside your Kubernetes and OpenShift cluster, not just locally
  • Reference relevant issue(s) and close them after merging
  • Update CHANGELOG.md
  • Supply screenshots for visual changes, such as Grafana dashboards

@kyguy kyguy added this to the 0.47.0 milestone May 22, 2025
@kyguy kyguy force-pushed the disable-goals branch 4 times, most recently from 68ccb06 to 00f270d Compare June 2, 2025 22:16
@kyguy kyguy force-pushed the disable-goals branch 7 times, most recently from 967c662 to 222e049 Compare June 11, 2025 12:38
@kyguy kyguy marked this pull request as ready for review June 11, 2025 13:17
@kyguy kyguy requested review from a team, ShubhamRwt, fvaleri and tomncooper June 11, 2025 14:09
Member

@scholzj scholzj left a comment

As I suggested in one of the comments, resources are not configured in the Kafka CR anymore. It would also be good to understand what the plan is when some nodes have resources defined and some do not. What will you do then?

Member Author

kyguy commented Jun 12, 2025

> It would be also good to understand what is the plan when some nodes have resources defined and some not. What will you do then?

Thanks for the catch (and the review). We only want to treat a cluster as properly configured for CPU-dependent Cruise Control goals if a CPU capacity value is explicitly configured for every broker, whether through the general capacity, the broker-specific overrides, or the resource requirements. So if none of these is defined for a broker, including the resource requirements, we want to disable the CPU-dependent Cruise Control goals.

I had wrongly assumed that all the node pools had their resources defined in the kafkaBrokerResources map. I have updated the logic to disable the CPU-dependent Cruise Control goals if any broker in the kafkaBrokerNodes set does not have its node pool (and resource requirements) defined in the kafkaBrokerResources map.
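
A rough sketch of that per-broker check, for illustration only. The names kafkaBrokerNodes and kafkaBrokerResources come from the comment above, but their types here are assumptions: `BrokerNode` is a hypothetical record, and the resources map is assumed to map a node pool name to a simple resource-name-to-quantity map. The sketch covers only the resource-requirements path, not the general capacity or broker-specific overrides.

```java
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: types and names are assumptions, not the
// Cluster Operator's actual code.
class CpuCapacityCheck {
    // Hypothetical stand-in for a broker node and the node pool it belongs to.
    record BrokerNode(int nodeId, String poolName) { }

    // True only if every broker's node pool has an entry in the resources
    // map and that entry defines a "cpu" value.
    static boolean cpuDefinedForAllBrokers(Set<BrokerNode> kafkaBrokerNodes,
                                           Map<String, Map<String, String>> kafkaBrokerResources) {
        return kafkaBrokerNodes.stream().allMatch(node -> {
            Map<String, String> requirements = kafkaBrokerResources.get(node.poolName());
            return requirements != null && requirements.containsKey("cpu");
        });
    }
}
```

If this check returned false for any broker, the CPU-dependent goals would be disabled for the whole cluster, which matches the "every broker" requirement described above.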

Comment thread CHANGELOG.md Outdated
@kyguy kyguy changed the title from "Disable CC resource goals when resource capacities are not set." to "Disable CC network resource goals when resource capacities are not set." Jul 1, 2025
Member

@scholzj scholzj left a comment

This seems to have quite a lot of changes given that it should now be a pretty basic if-else thing. Should we split this into a refactoring PR and network goals PR? TBH, it is pretty hard to review what changes are really related to what.

You should probably also update the documentation to clarify the default goals?

Member Author

kyguy commented Jul 3, 2025

> Should we split this into a refactoring PR and network goals PR? TBH, it is pretty hard to review what changes are really related to what.

Yeah, that makes sense. I just isolated the refactoring work into a different PR [1]. Once that is sorted, this PR will be a lot easier to review.

[1] #11615

@scholzj scholzj modified the milestones: 0.47.0, 0.48.0 Jul 10, 2025
Signed-off-by: Kyle Liberti <kliberti.us@gmail.com>
Member

@scholzj scholzj left a comment

Just some quick comments.

Comment thread CHANGELOG.md Outdated
@kyguy kyguy force-pushed the disable-goals branch 3 times, most recently from 17bc8e9 to b67fe66 Compare July 18, 2025 18:24
Signed-off-by: Kyle Liberti <kliberti.us@gmail.com>
Member

@scholzj scholzj left a comment

Few more nits, but mostly looking good.

Signed-off-by: Kyle Liberti <kliberti.us@gmail.com>
Member

@scholzj scholzj left a comment

One more nit. LGTM otherwise. Thanks.

Signed-off-by: Kyle Liberti <kliberti.us@gmail.com>
Member

scholzj commented Jul 21, 2025

/azp run regression

@scholzj scholzj requested a review from ppatierno July 21, 2025 18:33
@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Member

@ppatierno ppatierno left a comment

LGTM. Just a couple of nits.

Comment thread CHANGELOG.md Outdated
kyguy added 2 commits July 22, 2025 09:21
Signed-off-by: Kyle Liberti <kliberti.us@gmail.com>
Signed-off-by: Kyle Liberti <kliberti.us@gmail.com>
Member

scholzj commented Jul 22, 2025

@tinaselenge Do you have anything more on this? You had some comments on this PR in the past. Thanks.

Contributor

@tinaselenge tinaselenge left a comment

Thanks for the changes. LGTM!

@scholzj scholzj merged commit 016fd71 into strimzi:main Jul 23, 2025
13 checks passed
@scholzj scholzj added this to Roadmap Aug 23, 2025
@scholzj scholzj moved this to 0.48.0 (Work in Progress) in Roadmap Aug 23, 2025
see-quick pushed a commit to see-quick/strimzi-kafka-operator that referenced this pull request Sep 4, 2025
…t. (strimzi#11465)

Signed-off-by: Kyle Liberti <kliberti.us@gmail.com>