
KAFKA-1826 [1/N]: Introducing GroupStore #17981

Open
wants to merge 5 commits into
base: trunk

Conversation

aliehsaeedii
Contributor

@aliehsaeedii aliehsaeedii commented Nov 28, 2024

This PR and its follow-ups aim to refactor the large GroupMetadataManager class by splitting it into multiple classes. This PR specifically introduces the GroupStore class, which contains the metadata for all groups.

High-level design: the refactoring mainly introduces the following classes, each containing specific properties and methods:

  1. GroupStore: metadata for all groups.
  2. ShareGroupMetadataManager: has a reference to GroupStore + properties and methods related to ShareGroup metadata management.
  3. ConsumerGroupMetadataManager: has a reference to GroupStore + properties and methods related to ConsumerGroup metadata management.
  4. ClassicGroupMetadataManager: has a reference to GroupStore + properties and methods related to ClassicGroup metadata management.
  5. StreamsGroupMetadataManager: has a reference to GroupStore + properties and methods related to StreamsGroup metadata management.

Since ConsumerGroupMetadataManager and ClassicGroupMetadataManager share many methods, a helper class may be defined to avoid method duplication.
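
As a rough illustration of the layering described above, the following Java sketch shows one shared store and two type-specific managers that each hold a reference to it. It is not the PR's code: the `Group` interface is reduced to two methods, `SimpleGroup` and `createGroup` are hypothetical names, and a plain `HashMap` stands in for the coordinator's timeline data structures.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the coordinator's Group interface (assumption).
interface Group {
    String groupId();
    String type();
}

// Hypothetical minimal group implementation used only by this sketch.
final class SimpleGroup implements Group {
    private final String groupId;
    private final String type;

    SimpleGroup(String groupId, String type) {
        this.groupId = groupId;
        this.type = type;
    }

    @Override public String groupId() { return groupId; }
    @Override public String type() { return type; }
}

// Holds the metadata of all groups, regardless of their type.
final class GroupStore {
    private final Map<String, Group> groups = new HashMap<>();

    void put(Group group) { groups.put(group.groupId(), group); }

    // Returns null when the group is unknown.
    Group get(String groupId) { return groups.get(groupId); }

    int size() { return groups.size(); }
}

// Each manager keeps its type-specific logic and shares the same store instance.
final class ConsumerGroupMetadataManager {
    private final GroupStore store;

    ConsumerGroupMetadataManager(GroupStore store) { this.store = store; }

    void createGroup(String groupId) { store.put(new SimpleGroup(groupId, "consumer")); }
}

final class ShareGroupMetadataManager {
    private final GroupStore store;

    ShareGroupMetadataManager(GroupStore store) { this.store = store; }

    void createGroup(String groupId) { store.put(new SimpleGroup(groupId, "share")); }
}
```

Because both managers write to the same `GroupStore` instance, a group created by one manager is visible to the other, which is the property the proposed split relies on.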

@lucasbru
Member

Hi @aliehsaeedii. Thanks for the PR.

A few high-level comments for now:

  • Could we already make use of GroupStore, i.e. remove the corresponding functionality from GroupMetadataManager?
  • The PR title shouldn't include KIP-1071, since this is not necessarily related to the KIP.
  • Could you add a PR description?

cc @dajac

@aliehsaeedii aliehsaeedii changed the title from "KIP-1071: GroupStore" to "KAFKA-1826 [1/N]: Introducing GroupStore" on Nov 29, 2024
import static org.apache.kafka.coordinator.group.Group.GroupType.CONSUMER;
import static org.apache.kafka.coordinator.group.Group.GroupType.SHARE;

public class GroupStore {
Contributor Author

GroupStore could be renamed to GroupMetadataStore.

Member

Yes, I like the proposal.

@dajac
Member

dajac commented Nov 29, 2024

Thanks @aliehsaeedii! I will definitely review it next week. Would it be possible to describe the high level design that you're aiming for? I would like to ensure that we are on the same page. It will also ease the reviews.

@aliehsaeedii
Contributor Author

Thanks, @dajac. I updated the PR description with a brief description of the high-level design. This closed PR contains all the changes (without unit tests), but we decided to create a separate PR for each phase.

@aliehsaeedii
Contributor Author

  • Could we already make use of GroupStore, i.e. remove the corresponding functionality from GroupMetadataManager

Thanks, @lucasbru. To remove the current GroupStore functionalities from the GroupMetadataManager, significant changes are required in several other classes, including OffsetMetadataManager and GroupCoordinatorShard. Additionally, their corresponding unit tests (and possibly integration tests?) will need to be updated. Given that these changes are not final and may be altered by subsequent pull requests, it does not seem practical to modify these classes at this stage. WDYT?

@lucasbru
Member

lucasbru commented Dec 2, 2024

Okay, I see. The point of replacing the functionality immediately would be that it is easier to track that all unit tests are being ported from the old to the new structure, since the unit tests are the only thing that stops us from breaking things here. But if you think it's too complicated, from my side it's okay to do it this way. Please take extra care to retain the test coverage. I will give this a more detailed look tomorrow, but on a high level the approach looks good to me.

@aliehsaeedii
Contributor Author

Thanks, @lucasbru. I see your point. I can remove from GroupMetadataManagerTest the unit tests that are covered by the newly introduced classes. By our last PR, this test class should be empty (and removed). Is that OK?

@@ -0,0 +1,307 @@
/*
Member

@aliehsaeedii Thanks for working on this! I have a few high level comments regarding the GroupStore. I want us to agree on what we put and don't put in the store.

  • In my mind, the store should hold the state of all the groups. I think that we all agree on this.
  • As it holds the state of all groups, I am also tempted by moving all the replay methods to the store as they are the ones updating the state. Is it something that you have considered?
  • I think that the store should only have methods to query the state (e.g. get a group, list all groups, etc.). All the other methods (e.g. validateDeleteGroup, maybeDeleteGroup, createGroupTombstoneRecords, etc.) should not be here in my opinion.
  • The API specific methods should reside outside of the store. E.g. listGroups is directly linked to the implementation of the admin API, hence it may be better to have it in one of the other managers or in a manager responsible for the admin APIs.
  • The store does not need a reference to a MetadataImage.
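
To make the boundary suggested in these bullets concrete, here is a minimal Java sketch of a store whose state is mutated only through a replay method and is otherwise read-only. It is an illustration under assumptions, not the PR's implementation: `ReplayOnlyGroupStore`, the `(groupId, epoch)` record shape, and the convention that a null value acts as a tombstone are all simplifications chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical store: state changes only via replay, everything else is a query.
final class ReplayOnlyGroupStore {
    // groupId -> group epoch; a deliberately tiny model of "group state".
    private final Map<String, Integer> groupEpochs = new TreeMap<>();

    // Replays one record. A null epoch models a tombstone and removes the group.
    void replay(String groupId, Integer epoch) {
        if (epoch == null) {
            groupEpochs.remove(groupId);
        } else {
            groupEpochs.put(groupId, epoch);
        }
    }

    // Query: returns the epoch of a group, or null if the group is unknown.
    Integer groupEpoch(String groupId) {
        return groupEpochs.get(groupId);
    }

    // Query: returns a snapshot copy of all group ids in sorted order.
    List<String> groupIds() {
        return new ArrayList<>(groupEpochs.keySet());
    }
}
```

In this shape, validation and record-generation logic (the validateDeleteGroup or createGroupTombstoneRecords kind of method) would live in the managers, which read from the store and emit records that are later replayed into it.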

Contributor Author

Thanks, @dajac, for the conceptual review.

I am also tempted by moving all the replay methods to the store

Agreed! That makes sense and makes the unit tests cleaner.

other methods (e.g. validateDeleteGroup, maybeDeleteGroup, createGroupTombstoneRecords, etc.) should not be here in my opinion.

Your opinion is valid, David. Since the other manager classes have a reference to GroupStore, I assumed such methods could live inside this class so that they are accessible to all manager classes. But I think we can later introduce a helper class and define these methods there.

The API specific methods should reside outside of the store. e.g. listGroups

listGroups: REMOVED

The store does not need a reference to a MetadataImage.

I removed the MetadataImage and the associated methods from this class. Do you mean each manager class should have its own MetadataImage instance?

Member

@lucasbru lucasbru left a comment

I left some comments on the production code.

For the unit tests, I'm not sure I like that we copy a lot of replay methods into the test code that are unrelated to the production code of the class. I think we have two options for solving this:

  • We essentially make the group store completely independent of the group types. That would mean, in the unit tests we create a mock group type to test the functionality of the class independently of the group types.
  • We make the replay methods part of the production code of GroupStore.
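
The first option could look roughly like the Java sketch below, in which the store depends only on a small group interface and the test supplies its own fake group type. All names here (`StoredGroup`, `TypeAgnosticGroupStore`, `FakeGroup`, `getOrThrow`) are hypothetical and chosen for illustration; the real store and its exceptions may differ.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal contract the store needs from a group (assumption for this sketch).
interface StoredGroup {
    String groupId();
}

// Store that knows nothing about consumer, share, classic or streams groups.
final class TypeAgnosticGroupStore {
    private final Map<String, StoredGroup> groups = new HashMap<>();

    void add(StoredGroup group) {
        groups.put(group.groupId(), group);
    }

    // Returns the group or throws if it is unknown.
    StoredGroup getOrThrow(String groupId) {
        StoredGroup group = groups.get(groupId);
        if (group == null) {
            throw new IllegalArgumentException("Group " + groupId + " not found");
        }
        return group;
    }

    // Returns true if a group was actually removed.
    boolean remove(String groupId) {
        return groups.remove(groupId) != null;
    }
}

// Test-only group type: lets the store be exercised without any real group logic.
final class FakeGroup implements StoredGroup {
    private final String groupId;

    FakeGroup(String groupId) {
        this.groupId = groupId;
    }

    @Override
    public String groupId() {
        return groupId;
    }
}
```

With a fake type like this, the store's tests would no longer need copies of type-specific replay methods. The trade-off is that the replay logic then has to be tested wherever it ends up living.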

/**
* The snapshot registry.
*/
private SnapshotRegistry snapshotRegistry;
Member

Could be final


github-actions bot commented Mar 4, 2025

This PR is being marked as stale since it has not had any activity in 90 days. If you
would like to keep this PR alive, please leave a comment asking for a review. If the PR has
merge conflicts, update it with the latest from the base branch.

If you are having difficulty finding a reviewer, please reach out on the [mailing list](https://kafka.apache.org/contact).

If this PR is no longer valid or desired, please feel free to close it. If no activity occurs in the next 30 days, it will be automatically closed.

@github-actions github-actions bot added the stale Stale PRs label Mar 4, 2025

github-actions bot commented Apr 3, 2025

This PR has been closed since it has not had any activity in 120 days. If you feel like this
was a mistake, or you would like to continue working on it, please feel free to re-open the
PR and ask for a review.

@github-actions github-actions bot added the closed-stale PRs that were closed due to inactivity label Apr 3, 2025
@github-actions github-actions bot closed this Apr 3, 2025
@lucasbru lucasbru reopened this Apr 3, 2025
@github-actions github-actions bot added the triage (PRs from the community) label and removed the triage (PRs from the community), closed-stale (PRs that were closed due to inactivity), and stale (Stale PRs) labels Apr 3, 2025
Contributor

@jeffkbkim jeffkbkim left a comment

thanks for the PR. left some comments

import static org.apache.kafka.coordinator.group.classic.ClassicGroupState.EMPTY;
import static org.apache.kafka.coordinator.group.classic.ClassicGroupState.STABLE;

public class GroupMetadataStore {
Contributor

it would be great to add the high level idea of the group metadata store here

*/
private final TimelineHashMap<String, TimelineHashSet<String>> groupsByTopics;

public GroupMetadataStore(SnapshotRegistry snapshotRegistry, GroupCoordinatorMetricsShard metrics, Time time) {
Contributor

GMM uses a builder pattern. should we use it here as well?
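
For reference, a builder for the store might look like the following Java sketch. It only illustrates the pattern being asked about: the three dependency classes are empty stand-ins for the constructor parameters shown in the diff, and the `with...` method names are assumptions rather than the project's actual API.

```java
import java.util.Objects;

// Empty stand-ins for the three constructor dependencies (assumptions).
final class SnapshotRegistryStandIn {}
final class MetricsShardStandIn {}
final class TimeStandIn {}

final class GroupMetadataStoreSketch {
    private final SnapshotRegistryStandIn snapshotRegistry;
    private final MetricsShardStandIn metrics;
    private final TimeStandIn time;

    // Private constructor: instances are created only through the builder.
    private GroupMetadataStoreSketch(Builder builder) {
        this.snapshotRegistry = builder.snapshotRegistry;
        this.metrics = builder.metrics;
        this.time = builder.time;
    }

    SnapshotRegistryStandIn snapshotRegistry() {
        return snapshotRegistry;
    }

    static final class Builder {
        private SnapshotRegistryStandIn snapshotRegistry;
        private MetricsShardStandIn metrics;
        private TimeStandIn time;

        Builder withSnapshotRegistry(SnapshotRegistryStandIn snapshotRegistry) {
            this.snapshotRegistry = snapshotRegistry;
            return this;
        }

        Builder withMetrics(MetricsShardStandIn metrics) {
            this.metrics = metrics;
            return this;
        }

        Builder withTime(TimeStandIn time) {
            this.time = time;
            return this;
        }

        // Fails fast with a NullPointerException if a dependency is missing.
        GroupMetadataStoreSketch build() {
            Objects.requireNonNull(snapshotRegistry, "snapshotRegistry");
            Objects.requireNonNull(metrics, "metrics");
            Objects.requireNonNull(time, "time");
            return new GroupMetadataStoreSketch(this);
        }
    }
}
```

With only three required dependencies a plain constructor is arguably just as readable, so the main argument for a builder here would be consistency with the existing manager.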

}

/**
* Returns the snapshot registry.
Contributor

nit: this is redundant

*
* @return The snapshot registry.
*/
public SnapshotRegistry snapshotRegistry() {
Contributor

what will be the use case for this method?

}

@ParameterizedTest
@ValueSource(booleans = {true, false})
Contributor

perhaps we pass in the rebalance times here instead?

throw new GroupIdNotFoundException(String.format("Group %s is not a consumer group", groupId));
}
}
}
Contributor

nit: newline

@mjsax mjsax added the streams label Apr 30, 2025
@mjsax mjsax added KIP-848 The Next Generation of the Consumer Rebalance Protocol KIP-1071 PRs related to KIP-1071 labels Apr 30, 2025
Labels
ci-approved KIP-848 The Next Generation of the Consumer Rebalance Protocol KIP-1071 PRs related to KIP-1071 streams
6 participants