[Bug]: Rebalance does not happen at all using AutoRebalance feature #11296

Open
@pnagy-cldr

Bug Description

We are testing the auto-rebalance feature, and in rare cases it does nothing: no rebalance is triggered.
The issue is very similar to #11195.
The only difference is that the older issue concerns manual rebalancing, while this one concerns automatic rebalancing.

Steps to reproduce

  1. Create a Kafka cluster with auto-rebalancing configured for adding and removing brokers:

spec:
  cruiseControl:
    autoRebalance:
      - mode: add-brokers
      - mode: remove-brokers

  2. Scale up the Kafka cluster.
  3. Strimzi detects that add_broker should be called and sends the request to Cruise Control (CC).
  4. CC is still in its rolling restart, so with unlucky timing the old CC instance receives the request. It has no capacity information for the new broker, so it responds with HTTP 500:
    'com.linkedin.kafka.cruisecontrol.exception.KafkaCruiseControlException: java.lang.NullPointerException: Cannot invoke "com.linkedin.kafka.cruisecontrol.config.BrokerCapacityInfo.capacity()"
  5. After the error, Strimzi moves the KafkaRebalance resource to the NotReady state:
    ERROR KafkaRebalanceAssemblyOperator:403 - Reconciliation #1287(watch) KafkaRebalance(...):Status updated to [NotReady] due to error:...
  6. If the KafkaRebalance is in the NotReady state and scalingNodes does not change in the meantime (that is, you do not scale further), the KafkaRebalance is simply deleted by KafkaAutoRebalancingReconciler and the rebalance never happens.

Expected behavior

The failure is transient; as noted in the steps, it occurs rarely.
Should Strimzi retry after this kind of error? With automatic rebalancing, it is surprising that a rebalance never happens because of a timing issue. I can of course create the rebalance manually, but avoiding that is the reason I use the automatic one.
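
To make the question concrete, here is a minimal sketch of what retrying the transient 500 from the steps to reproduce could look like. It is written in Python for brevity and is not Strimzi code: `call_with_retry`, `TransientCruiseControlError`, and `FakeCruiseControl` are hypothetical names, and the fake only imitates an old Cruise Control instance answering the first few requests during a rolling restart.

```python
import time
from typing import Callable


class TransientCruiseControlError(Exception):
    """Hypothetical error type standing in for an HTTP 500 from Cruise Control."""


def call_with_retry(request: Callable[[], str],
                    max_attempts: int = 5,
                    initial_delay_s: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> str:
    """Call `request`, retrying transient failures with exponential backoff."""
    delay = initial_delay_s
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except TransientCruiseControlError:
            if attempt == max_attempts:
                # Out of attempts: propagate, so the caller can report the failure.
                raise
            sleep(delay)
            delay *= 2  # double the wait before the next attempt
    raise ValueError("max_attempts must be at least 1")


class FakeCruiseControl:
    """Imitates a rolling restart: the first `stale_responses` calls fail."""

    def __init__(self, stale_responses: int) -> None:
        self.stale_responses = stale_responses
        self.calls = 0

    def add_broker(self) -> str:
        self.calls += 1
        if self.calls <= self.stale_responses:
            # The old instance has no capacity info for the new broker.
            raise TransientCruiseControlError("500: no capacity info for new broker")
        return "ProposalReady"  # placeholder for a successful response
```

With a fake that fails twice, `call_with_retry(cc.add_broker)` succeeds on the third call instead of giving up after the first error. Whether the retry belongs in the request path or in the reconciler (for example, by not deleting a NotReady KafkaRebalance immediately) is a design decision this sketch does not settle.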

Strimzi version

0.45

Kubernetes version

1.28

Installation method

Helm chart

Infrastructure

k3s

Configuration files and logs

No response

Additional context

No response
