[KAFKA-15580] First attempt at UncleanRecoveryManager #19468

Draft
wants to merge 2 commits into
base: trunk

josefk31
Contributor

Basic architecture for a system that manages unclean recovery by fetching
log information from brokers.

The idea is to improve durability by electing the replica with the longest log as leader during unclean recovery, instead of just picking a random unfenced broker.
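As a rough illustration of the election rule described above, here is a minimal sketch. It assumes "longest log" means the highest log end offset among unfenced replicas; the class, record, and method names are hypothetical and are not taken from this PR, and a real implementation may also need to compare leader epochs.

```java
import java.util.List;
import java.util.OptionalInt;
import java.util.Set;

// Hypothetical sketch; none of these names come from the PR.
final class LongestLogElection {
    // Log position reported by one replica (fields assumed for illustration).
    record ReplicaLogInfo(int brokerId, long logEndOffset) {}

    // Returns the unfenced broker with the highest log end offset.
    // Ties go to the lowest broker id so the result is deterministic.
    static OptionalInt electLeader(List<ReplicaLogInfo> infos, Set<Integer> unfenced) {
        ReplicaLogInfo best = null;
        for (ReplicaLogInfo info : infos) {
            if (!unfenced.contains(info.brokerId())) {
                continue; // fenced brokers are never eligible
            }
            if (best == null
                    || info.logEndOffset() > best.logEndOffset()
                    || (info.logEndOffset() == best.logEndOffset()
                        && info.brokerId() < best.brokerId())) {
                best = info;
            }
        }
        return best == null ? OptionalInt.empty() : OptionalInt.of(best.brokerId());
    }
}
```

Note that the broker with the longest log overall is skipped if it is fenced, so the result can differ from a plain maximum over all replicas.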

There will be one additional thread, RecoverySendThread, which handles NIO between the controller and the brokers and sends the requests. LogInfoResponseReceived events will be written to the controller's event queue. These events don't write anything until the last response is received, at which point all the information needed to determine log lengths accurately has been collected.
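The "nothing is written until the last response arrives" behaviour can be sketched as a small collector. This is only an illustration under stated assumptions: events are handled one at a time on the controller's event queue (so no locking is shown), and `LogInfoCollector` and `onResponse` are hypothetical names, not classes from this PR.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch; assumes single-threaded event handling.
final class LogInfoCollector {
    private final Set<Integer> pending;                       // brokers that have not answered yet
    private final Map<Integer, Long> logEndOffsets = new HashMap<>();

    LogInfoCollector(Set<Integer> expectedBrokers) {
        this.pending = new HashSet<>(expectedBrokers);
    }

    // Handles one response event. Returns the collected offsets only when the
    // last expected broker has answered; otherwise returns empty.
    Optional<Map<Integer, Long>> onResponse(int brokerId, long logEndOffset) {
        if (!pending.remove(brokerId)) {
            return Optional.empty(); // duplicate or unexpected response: ignore
        }
        logEndOffsets.put(brokerId, logEndOffset);
        return pending.isEmpty()
            ? Optional.of(Map.copyOf(logEndOffsets))
            : Optional.empty();
    }
}
```

A caller would act on the returned map only when it is present, which keeps partial information from ever reaching an election.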

Responses to requests are processed in the controller's event queue by a RecoveryManager class.
It keeps track of the number of outstanding requests and decides whether and how to retry failed requests. RecoveryManager builds up a "LogInfoStore" object with information about the log length of each replica.
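One possible shape for the outstanding-request bookkeeping and retry decision is sketched below. The retry policy here (a fixed maximum number of attempts per broker) is an assumption made for illustration; the PR text does not say which policy RecoveryManager uses, and `RetryTracker` is a hypothetical name.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; the per-broker attempt limit is an assumed policy.
final class RetryTracker {
    enum Decision { RETRY, GIVE_UP }

    private final int maxAttempts;
    private final Map<Integer, Integer> attempts = new HashMap<>(); // brokerId -> sends so far
    private int outstanding = 0;                                    // requests awaiting a reply

    RetryTracker(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    // Records that a request was sent to the broker.
    void onSend(int brokerId) {
        attempts.merge(brokerId, 1, Integer::sum);
        outstanding++;
    }

    // Records a successful response.
    void onSuccess(int brokerId) {
        outstanding--;
    }

    // Records a failed request and decides whether another attempt is allowed.
    Decision onFailure(int brokerId) {
        outstanding--;
        return attempts.getOrDefault(brokerId, 0) < maxAttempts
            ? Decision.RETRY
            : Decision.GIVE_UP;
    }

    int outstanding() {
        return outstanding;
    }
}
```

With `maxAttempts = 2`, the first failure for a broker yields `RETRY` and the second yields `GIVE_UP`.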

Once all expected responses have been received, or a timeout expires, the RecoveryManager will begin to run leader elections for batches of TopicPartitions (respecting the max batch size configuration). These elections will have access to the store, which contains the enriched log information, to assist them in making leadership decisions.
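The trigger condition and the batching step could look roughly like this. It is a sketch only: `ElectionBatcher`, `readyToElect`, and `batches` are hypothetical names, the deadline is assumed to be an absolute millisecond timestamp, and partitions are represented by a generic type rather than Kafka's own classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names and signatures are not from the PR.
final class ElectionBatcher {
    // Elections may start when every response has arrived or the deadline has passed.
    static boolean readyToElect(int outstandingRequests, long nowMs, long deadlineMs) {
        return outstandingRequests == 0 || nowMs >= deadlineMs;
    }

    // Splits the partitions into consecutive batches of at most maxBatchSize elements.
    static <T> List<List<T>> batches(List<T> partitions, int maxBatchSize) {
        if (maxBatchSize <= 0) {
            throw new IllegalArgumentException("maxBatchSize must be positive");
        }
        List<List<T>> result = new ArrayList<>();
        for (int start = 0; start < partitions.size(); start += maxBatchSize) {
            int end = Math.min(start + maxBatchSize, partitions.size());
            result.add(List.copyOf(partitions.subList(start, end)));
        }
        return result;
    }
}
```

Five partitions with a maximum batch size of 2 produce three batches of sizes 2, 2, and 1.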

Most of the tracking is performed in a class called "StateMachine" within RecoveryManager. This is deliberate so that eventually the RecoveryManager can supervise multiple unclean election requests at the same time.
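To show why keeping the tracking in a separate state machine helps with supervising several recoveries, here is a minimal sketch. The state names (`COLLECTING`, `ELECTING`, `DONE`) and the `RecoverySupervisor` wrapper are assumptions for illustration; the PR does not list the actual states of its StateMachine class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; state names and structure are assumed, not from the PR.
final class RecoverySupervisor {
    enum State { COLLECTING, ELECTING, DONE }

    // Tracks the progress of a single unclean recovery.
    static final class StateMachine {
        private State state = State.COLLECTING;

        State state() {
            return state;
        }

        // All responses arrived or the timeout fired.
        void onCollectionFinished() {
            if (state == State.COLLECTING) {
                state = State.ELECTING;
            }
        }

        // Every election batch has been processed.
        void onElectionsFinished() {
            if (state == State.ELECTING) {
                state = State.DONE;
            }
        }
    }

    // One independent state machine per in-flight recovery.
    private final Map<Long, StateMachine> active = new HashMap<>();
    private long nextId = 0;

    long start() {
        long id = nextId++;
        active.put(id, new StateMachine());
        return id;
    }

    StateMachine get(long id) {
        return active.get(id);
    }
}
```

Because each recovery owns its own state, advancing one does not affect the others.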

  • Created GetReplicaLogInfoRequest/GetReplicaLogInfoResponse
  • Created RecoveryManager class to handle sending and receiving
    requests
  • Added configs to switch on unclean recovery
  • Added ability to elect with longest logs in PartitionChangeBuilder
  • Added RecoverySenderThread for sending

mannoopj and others added 2 commits April 14, 2025 12:57
add GetReplicaLogInfo Request and Response

add GetReplicaLogInfo to apikeys.java

add GetReplicaLogInfo Request and Response

fix GetReplicaLogInfoResponse

add test for GetReplicaLogInfo in RequestResponseTest.java

add GetReplicaLogInfoResponse

add latestVersionUnstable

Conflicts:
	clients/src/main/java/org/apache/kafka/common/protocol/ApiKeys.java
	clients/src/main/java/org/apache/kafka/common/requests/AbstractRequest.java
	clients/src/main/java/org/apache/kafka/common/requests/AbstractResponse.java
	clients/src/test/java/org/apache/kafka/common/requests/RequestResponseTest.java
	core/src/main/scala/kafka/network/RequestConvertToJson.scala

cherry-pick of PR on top of origin/trunk.
github-actions bot added the following labels on Apr 14, 2025: triage (PRs from the community), core (Kafka Broker), kraft, storage (Pull requests that target the storage module), clients
A label of 'needs-attention' was automatically added to this PR in order to raise the
attention of the committers. Once this issue has been triaged, the triage label
should be removed to prevent this automation from happening again.
