[fix][client] Fix ArrayIndexOutOfBoundsException when using SameAuthParamsLookupAutoClusterFailover #23336

Open
wants to merge 1 commit into
base: master
@@ -71,29 +71,36 @@ public void initialize(PulsarClient client) {
         this.executor = EventLoopUtil.newEventLoopGroup(1, false,
                 new ExecutorProvider.ExtendedThreadFactory("broker-service-url-check"));
         scheduledCheckTask = executor.scheduleAtFixedRate(() -> {
-            if (closed) {
-                return;
-            }
-            checkPulsarServices();
-            int firstHealthyPulsarService = firstHealthyPulsarService();
-            if (firstHealthyPulsarService == currentPulsarServiceIndex) {
-                return;
-            }
-            if (firstHealthyPulsarService < 0) {
-                int failoverTo = findFailoverTo();
-                if (failoverTo < 0) {
-                    // No healthy pulsar service to connect.
-                    log.error("Failed to choose a pulsar service to connect, no one pulsar service is healthy. Current"
-                            + " pulsar service: [{}] {}. States: {}, Counters: {}", currentPulsarServiceIndex,
-                            pulsarServiceUrlArray[currentPulsarServiceIndex], Arrays.toString(pulsarServiceStateArray),
-                            Arrays.toString(checkCounterArray));
+            try {
+                if (closed) {
+                    return;
+                }
+                checkPulsarServices();
+                int firstHealthyPulsarService = firstHealthyPulsarService();
+                if (firstHealthyPulsarService == currentPulsarServiceIndex) {
+                    return;
+                }
+                if (firstHealthyPulsarService < 0) {
+                    int failoverTo = findFailoverTo();
+                    if (failoverTo < 0) {
+                        // No healthy pulsar service to connect.
+                        log.error(
+                                "Failed to choose a pulsar service to connect, no one pulsar service is healthy."
+                                        + " Current pulsar service: [{}] {}. States: {}, Counters: {}",
+                                currentPulsarServiceIndex,
+                                pulsarServiceUrlArray[currentPulsarServiceIndex],
+                                Arrays.toString(pulsarServiceStateArray),
+                                Arrays.toString(checkCounterArray));
+                    } else {
+                        // Failover to low priority pulsar service.
+                        updateServiceUrl(failoverTo);
+                    }
                 } else {
-                    // Failover to low priority pulsar service.
-                    updateServiceUrl(failoverTo);
+                    // Back to high priority pulsar service.
+                    updateServiceUrl(firstHealthyPulsarService);
                 }
-            } else {
-                // Back to high priority pulsar service.
-                updateServiceUrl(firstHealthyPulsarService);
+            } catch (Exception ex) {
+                log.error("Failed to re-check cluster status", ex);
             }
         }, checkHealthyIntervalMs, checkHealthyIntervalMs, TimeUnit.MILLISECONDS);
     }
@@ -123,7 +130,7 @@ private int firstHealthyPulsarService() {
     }

     private int findFailoverTo() {
-        for (int i = currentPulsarServiceIndex + 1; i <= pulsarServiceUrlArray.length; i++) {
+        for (int i = currentPulsarServiceIndex + 1; i < pulsarServiceUrlArray.length; i++) {

Member

Just wondering if this logic is correct in the first place. Shouldn't this wrap around the array?

The for loop in checkPulsarServices looks strange too.

Contributor Author

The failover logic is as follows:

  • If the current cluster is the highest-priority cluster and it is healthy, do nothing.
  • Otherwise, if there is a healthy cluster with a higher priority, recover to that higher-priority cluster.
  • Otherwise, if both the current cluster and the higher-priority clusters are unhealthy, find a healthy cluster to fail over to.

So the two methods, checkPulsarServices and findFailoverTo, handle different cases in order to save CPU resources.

  • checkPulsarServices checks whether there is a healthy, higher-priority cluster.
  • findFailoverTo checks whether there is any healthy cluster to fail over to.
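
The three cases listed above can be sketched as one selection function. This is only an illustration, not the code from the PR: `FailoverSketch`, `chooseTarget`, and the `healthy` array are hypothetical names, and the sketch assumes that index 0 is the highest priority and that the "higher priority" scan covers indexes 0 through the current one.

```java
final class FailoverSketch {
    /**
     * Returns the index of the cluster to use next, or -1 if no cluster is healthy.
     * Assumption: a lower index means a higher priority.
     */
    static int chooseTarget(int current, boolean[] healthy) {
        // Cases 1 and 2: the first healthy cluster at or above the current priority wins.
        // If that is the current cluster itself, the caller has nothing to do.
        for (int i = 0; i <= current; i++) {
            if (healthy[i]) {
                return i;
            }
        }
        // Case 3: current and all higher-priority clusters are unhealthy,
        // so look for a healthy lower-priority cluster.
        for (int i = current + 1; i < healthy.length; i++) {
            if (healthy[i]) {
                return i;
            }
        }
        return -1; // nothing is healthy
    }

    public static void main(String[] args) {
        // Stay on cluster 0: it is healthy and has the highest priority.
        System.out.println(chooseTarget(0, new boolean[] {true, false, true}));  // 0
        // Recover from cluster 2 to the healthy, higher-priority cluster 1.
        System.out.println(chooseTarget(2, new boolean[] {false, true, true}));  // 1
        // Fail over from cluster 0 to the lower-priority cluster 2.
        System.out.println(chooseTarget(0, new boolean[] {false, false, true})); // 2
    }
}
```

Splitting the scan into two loops mirrors the split described above: the second loop only runs when the first one found nothing.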

Member

@poorbarcode what happens in the case where currentPulsarServiceIndex is already the last one? How can it find a failover cluster in that case?

Contributor Author

what happens in the case where currentPulsarServiceIndex is already the last one? How can it find a failover cluster in that case?

It will switch to the first healthy cluster. If the ArrayIndexOutOfBoundsException is thrown, it means no cluster is healthy.

The issue that the current PR fixes does not affect anything.
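
To make the boundary case from this thread concrete, here is a minimal sketch comparing the two loop bounds of findFailoverTo. The class `LoopBoundSketch` and the `healthy` array are hypothetical stand-ins; in particular, reading `healthy[i]` only approximates probeAvailable(i), whose real implementation is not shown in this diff.

```java
final class LoopBoundSketch {
    // Old bound (i <= length): runs one index past the end of the array
    // whenever no healthy lower-priority cluster is found first.
    static int findFailoverToOld(int current, boolean[] healthy) {
        for (int i = current + 1; i <= healthy.length; i++) {
            if (healthy[i]) { // throws ArrayIndexOutOfBoundsException at i == length
                return i;
            }
        }
        return -1;
    }

    // New bound (i < length): stops at the last valid index.
    static int findFailoverToNew(int current, boolean[] healthy) {
        for (int i = current + 1; i < healthy.length; i++) {
            if (healthy[i]) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        boolean[] noneHealthy = {false, false, false};
        // The current cluster is the last one: the new loop body never runs.
        System.out.println(findFailoverToNew(2, noneHealthy)); // -1
        try {
            findFailoverToOld(2, noneHealthy);
        } catch (ArrayIndexOutOfBoundsException ex) {
            System.out.println("old bound threw: " + ex.getMessage());
        }
    }
}
```

When a healthy lower-priority cluster exists, both versions return the same index; they differ only when the scan reaches the end without finding one.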

             if (probeAvailable(i)) {
                 return i;
             }
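
A note on the try/catch added around the scheduled task in initialize: under the documented contract of `ScheduledExecutorService.scheduleAtFixedRate`, an exception thrown by one execution suppresses all subsequent executions. The sketch below demonstrates this with a plain JDK executor; it assumes, without verifying here, that the event loop group used by the client follows the same contract. `ScheduledTaskSketch` and `countRuns` are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class ScheduledTaskSketch {
    // Simulates a health check that always fails with the exception from this PR.
    static void failingCheck() {
        throw new ArrayIndexOutOfBoundsException("simulated");
    }

    /** Schedules a failing task and returns how many times it ran within the wait window. */
    static int countRuns(boolean catchInsideTask) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger runs = new AtomicInteger();
        CountDownLatch threeRuns = new CountDownLatch(3);
        executor.scheduleAtFixedRate(() -> {
            runs.incrementAndGet();
            threeRuns.countDown();
            if (catchInsideTask) {
                try {
                    failingCheck();
                } catch (Exception ex) {
                    // Swallowed (a real task would log it), so the next run still happens.
                }
            } else {
                failingCheck(); // escapes the task, so later runs are suppressed
            }
        }, 0, 10, TimeUnit.MILLISECONDS);
        try {
            // Returns early once three runs happened, otherwise waits out the window.
            threeRuns.await(500, TimeUnit.MILLISECONDS);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        } finally {
            executor.shutdownNow();
        }
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println("without catch: " + countRuns(false));
        System.out.println("with catch:    " + countRuns(true));
    }
}
```

Without the catch, the task runs exactly once; with it, the periodic check keeps running after a failure.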