Skip to content

KAFKA-15556: Remove NetworkClientDelegate methods isUnavailable, maybeThrowAuthFailure, and tryConnect #15020

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: trunk
Choose a base branch
from

Conversation

Phuc-Hong-Tran
Copy link
Contributor

Change:

*Refactored AbstractFetch so only Fetcher will check for node status when creating fetch requests.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@Phuc-Hong-Tran
Copy link
Contributor Author

Phuc-Hong-Tran commented Dec 15, 2023

@philipnee, I closed the other PR since closing and opening it again doesn't trigger the CI to start. Here is the context for this PR: #14406 (comment)

@Phuc-Hong-Tran
Copy link
Contributor Author

@philipnee, all builds succeed, test failures looks unrelated. PTAL, thanks.

@Phuc-Hong-Tran
Copy link
Contributor Author

@philipnee @kirktrue, Please take a look if you guys have some free time. Thanks in advance.

Copy link
Contributor

@philipnee philipnee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @Phuc-Hong-Tran - thanks for taking time raising this PR. I apologize for not writing the the jira ticket well here, so I recommend to rethink how you would approach this refactor. In short - we don't need to handle isUnavailable and auth failure in the request manager, so it is ok to make the default implementation to just return false, and perform no ops for both methods. The reason is networkClientDelegate already handle these scenario, just in a slight different order to the LegacyKafkaConsumer.

@@ -372,7 +372,7 @@ Node selectReadReplica(final TopicPartition partition, final Node leaderReplica,
}
}

protected Map<Node, FetchSessionHandler.FetchRequestData> prepareCloseFetchSessionRequests() {
protected Map<Node, FetchSessionHandler.FetchRequestData> prepareCloseFetchSessionRequests(boolean checkNodeAvailability) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

final boolean checkNodeAvailability

@@ -385,7 +385,9 @@ protected Map<Node, FetchSessionHandler.FetchRequestData> prepareCloseFetchSessi
// skip sending the close request.
final Node fetchTarget = cluster.nodeById(fetchTargetNodeId);

if (fetchTarget == null || isUnavailable(fetchTarget)) {
boolean fetchTargetAvailability = checkNodeAvailability ? (fetchTarget == null || isUnavailable(fetchTarget)) : fetchTarget == null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

final boolean

  • i would also call this isFetchTargetAvailable
  • let's not use inline if. this makes the code harder to read.

log.trace("Skipping fetch for partition {} because node {} is awaiting reconnect backoff", partition, node);
} else if (nodesWithPendingFetchRequests.contains(node.id())) {
log.trace("Skipping fetch for partition {} because previous request to {} has not been processed", partition, node);
if (checkNodeAvailability) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to avoid nested if?

return pollInternal(
prepareFetchRequests(),
prepareFetchRequests(checkNodeAvailability),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just pass false. the var above is useless.

return pollInternal(
prepareCloseFetchSessionRequests(),
prepareCloseFetchSessionRequests(checkNodeAvailability),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@@ -102,7 +102,9 @@ public void clearBufferedDataForUnassignedPartitions(Collection<TopicPartition>
* @return number of fetches sent
*/
public synchronized int sendFetches() {
final Map<Node, FetchSessionHandler.FetchRequestData> fetchRequests = prepareFetchRequests();
boolean checkNodeAvailability = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

@@ -781,10 +781,10 @@ public void testFetchSkipsBlackedOutNodes() {
Node node = initialUpdateResponse.brokers().iterator().next();

client.backoff(node, 500);
assertEquals(0, sendFetches());
assertEquals(1, sendFetches());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are you changing the tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're not checking the node availablity using the networkClientDelegate in FetchRequestManager, the request that was made would pass through the check and made it way to unsentRequest. After the backoff the request should be sent already.

Should I not change the test and make new one instead?

@@ -403,7 +405,7 @@ protected Map<Node, FetchSessionHandler.FetchRequestData> prepareCloseFetchSessi
* Create fetch requests for all nodes for which we have assigned partitions
* that have no existing requests in flight.
*/
protected Map<Node, FetchSessionHandler.FetchRequestData> prepareFetchRequests() {
protected Map<Node, FetchSessionHandler.FetchRequestData> prepareFetchRequests(boolean checkNodeAvailability) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

final boolean

@Phuc-Hong-Tran
Copy link
Contributor Author

Phuc-Hong-Tran commented Jan 3, 2024

@philipnee, Thanks for the comments. When you said I should rethink about the approach, are you suggesting that I should start over from the idea phase?

@philipnee
Copy link
Contributor

hi @Phuc-Hong-Tran - yes, the abstractFetch implementation is based on the LegacyKafkaConsumer and therefore requires connection probing. We don't need that in the AsyncKafkaConsumer as it is being done right before sending out the requests.

@Phuc-Hong-Tran
Copy link
Contributor Author

I understand, will get another PR out ASAP.

@Phuc-Hong-Tran
Copy link
Contributor Author

@philipnee, PTAL, thanks in advance.

@Phuc-Hong-Tran
Copy link
Contributor Author

@philipnee, is there anything else that I should change?

@philipnee
Copy link
Contributor

hi @Phuc-Hong-Tran - thanks for the PR. Could you also clean up the isUnavailable and maybeThrowAuthFailure methods in the networkClientDelegate. I believe they aren't being used anywhere.

@Phuc-Hong-Tran
Copy link
Contributor Author

@philipnee Will do

@Phuc-Hong-Tran
Copy link
Contributor Author

@philipnee All done

@kirktrue
Copy link
Contributor

@Phuc-Hong-Tran—is this PR ready for review?

@Phuc-Hong-Tran
Copy link
Contributor Author

@kirktrue, it's ready for review

Copy link

This PR is being marked as stale since it has not had any activity in 90 days. If you would like to keep this PR alive, please ask a committer for review. If the PR has merge conflicts, please update it with the latest from trunk (or appropriate release branch)

If this PR is no longer valid or desired, please feel free to close it. If no activity occurs in the next 30 days, it will be automatically closed.

@github-actions github-actions bot added the stale Stale PRs label Jun 25, 2024
@mjsax mjsax added the consumer label Jul 22, 2024
@philipnee philipnee added the KIP-848 The Next Generation of the Consumer Rebalance Protocol label Jul 23, 2024
* Node's availability will be checked at a later stage, so we default to return false.
* @param node {@link Node} to check for availability
* @return
*/
@Override
protected boolean isUnavailable(Node node) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does seem a bit like half a refactor. Would it be possible to remove isUnavailable and maybeThrowAuthFailure entirely from AbstractFetch?

@github-actions github-actions bot removed the stale Stale PRs label Jan 5, 2025
Copy link

github-actions bot commented Apr 5, 2025

This PR is being marked as stale since it has not had any activity in 90 days. If you
would like to keep this PR alive, please leave a comment asking for a review. If the PR has
merge conflicts, update it with the latest from the base branch.

If you are having difficulty finding a reviewer, please reach out on the [mailing list](https://kafka.apache.org/contact).

If this PR is no longer valid or desired, please feel free to close it. If no activity occurs in the next 30 days, it will be automatically closed.

@github-actions github-actions bot added the stale Stale PRs label Apr 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clients consumer KIP-848 The Next Generation of the Consumer Rebalance Protocol stale Stale PRs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants