Calculate routing num shards correctly during reshard #125601

Open
wants to merge 16 commits into
base: main
from

Conversation

ankikuma
Contributor

No description provided.

@elasticsearchmachine elasticsearchmachine added v9.1.0 serverless-linked Added by automation, don't add manually labels Mar 25, 2025
@ankikuma ankikuma marked this pull request as ready for review April 8, 2025 15:43
@elasticsearchmachine elasticsearchmachine added the needs:triage Requires assignment of a team area label label Apr 8, 2025
public Builder reshardAddShards(int shardCount) {
// Assert routingNumShards is null ?
// Assert numberOfShards > 0
public Builder reshardAddShards(int shardCount, IndexMetadata sourceMetadata) {
Contributor

Changing the API this way makes it possible to accidentally supply the wrong metadata to the method, and we know we've already provided the right metadata in the builder's constructor.

I think the reason you're passing it in is that getIndexNumberOfRoutingShards wants a metadata object but we've already decomposed it into parts in the constructor. I think it's probably fine to just hold on to a reference to the whole thing for the life of the builder (i.e., this.indexMetadata = indexMetadata in the constructor or something), or to change the interface to getIndexNumberOfRoutingShards, which only has a handful of users that mostly pass in null. Maybe the first option is the simplest?
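The first option above (keeping a reference for the life of the builder) could look roughly like the sketch below. `Meta` and `Builder` here are hypothetical, stripped-down stand-ins rather than the real `IndexMetadata` classes, and the body of `reshardAddShards` is a placeholder, not the logic in this PR. The only point is the API shape: the method takes no metadata argument, so a caller cannot pass mismatched metadata.

```java
public class BuilderSketch {
    // Hypothetical, minimal stand-in for IndexMetadata (assumption for illustration).
    static final class Meta {
        final int numberOfShards;
        final int routingNumShards;

        Meta(int numberOfShards, int routingNumShards) {
            this.numberOfShards = numberOfShards;
            this.routingNumShards = routingNumShards;
        }
    }

    // Hypothetical stand-in for the metadata builder.
    static final class Builder {
        // Retained for the life of the builder, as suggested in the comment.
        private final Meta source;
        private int numberOfShards;
        private int routingNumShards;

        Builder(Meta source) {
            this.source = source;
            this.numberOfShards = source.numberOfShards;
            this.routingNumShards = source.routingNumShards;
        }

        // No metadata parameter: the builder already knows its source.
        Builder reshardAddShards(int shardCount) {
            this.numberOfShards = shardCount;
            // Placeholder: reuse the source's routing shard count unchanged.
            this.routingNumShards = source.routingNumShards;
            return this;
        }

        Meta build() {
            return new Meta(numberOfShards, routingNumShards);
        }
    }
}
```

Under these assumptions, `new BuilderSketch.Builder(source).reshardAddShards(4).build()` yields metadata with four shards and the source's routing shard count.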

Contributor Author

I agree, I didn't like passing in the sourceMetadata either. But I don't know if I want to hold on to a reference to the whole thing, because it looks like we would end up with some kind of recursion in toXContent(), wouldn't we?

Contributor

You're right, I hadn't noticed that when IndexMetadata goes over the wire it passes through the Builder interface.

Looking at MetadataCreateIndexService::getIndexNumberOfRoutingShards, the only thing it actually uses sourceMetadata for is to call getNumberOfShards on it if it exists. So one option to avoid passing in this essentially redundant sourceMetadata parameter would be to refactor getIndexNumberOfRoutingShards a bit to have an inner method that takes just routingNumShards (or 0 if the metadata is null), which you could call directly from here. The existing getIndexNumberOfRoutingShards(Settings indexSettings, @Nullable IndexMetadata sourceMetadata) would then just be something like return getIndexNumberOfRoutingShards(settings, sourceMetadata == null ? 0 : sourceMetadata.getRoutingNumShards()).
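A minimal sketch of the refactor suggested above, assuming simplified stand-ins: a `Map<String, Integer>` plays the role of `Settings`, and `SourceMeta` is a hypothetical substitute for `IndexMetadata`. The fallback used when there is no source (read an explicit setting, else the shard count) is a placeholder and not the real Elasticsearch calculation. What it illustrates is only the delegation: the metadata-taking overload reduces its argument to an int and forwards to the inner method, which a builder could also call directly.

```java
import java.util.Map;

public class RoutingShardsSketch {
    // Hypothetical stand-in for IndexMetadata (assumption for illustration).
    static final class SourceMeta {
        private final int routingNumShards;

        SourceMeta(int routingNumShards) {
            this.routingNumShards = routingNumShards;
        }

        int getRoutingNumShards() {
            return routingNumShards;
        }
    }

    // Existing-shape entry point: accepts nullable source metadata and delegates.
    static int getIndexNumberOfRoutingShards(Map<String, Integer> settings, SourceMeta sourceMetadata) {
        return getIndexNumberOfRoutingShards(
            settings,
            sourceMetadata == null ? 0 : sourceMetadata.getRoutingNumShards()
        );
    }

    // Inner method: takes the source's routing shard count, with 0 meaning "no source".
    static int getIndexNumberOfRoutingShards(Map<String, Integer> settings, int sourceRoutingNumShards) {
        if (sourceRoutingNumShards == 0) {
            // Placeholder fallback, not the real calculation.
            Integer explicit = settings.get("index.number_of_routing_shards");
            return explicit != null ? explicit : settings.getOrDefault("index.number_of_shards", 1);
        }
        return sourceRoutingNumShards;
    }
}
```

With this shape, a caller that has already decomposed the metadata into fields can pass the int it holds, while existing callers that pass `null` keep working through the first overload.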

@ankikuma ankikuma added the Team:Distributed Indexing Meta label for Distributed Indexing team label Apr 16, 2025
@elasticsearchmachine elasticsearchmachine removed the Team:Distributed Indexing Meta label for Distributed Indexing team label Apr 16, 2025
@ankikuma ankikuma added the Team:Distributed Indexing Meta label for Distributed Indexing team label Apr 16, 2025
@elasticsearchmachine elasticsearchmachine removed the Team:Distributed Indexing Meta label for Distributed Indexing team label Apr 16, 2025
@ankikuma ankikuma added the :Distributed Indexing/Distributed A catch all label for anything in the Distributed Indexing Area. Please avoid if you can. label Apr 18, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed-indexing (Team:Distributed Indexing)

@elasticsearchmachine elasticsearchmachine added Team:Distributed Indexing Meta label for Distributed Indexing team and removed needs:triage Requires assignment of a team area label labels Apr 18, 2025
Labels
:Distributed Indexing/Distributed A catch all label for anything in the Distributed Indexing Area. Please avoid if you can. serverless-linked Added by automation, don't add manually Team:Distributed Indexing Meta label for Distributed Indexing team v9.1.0
3 participants