Refactored snapshot handling to make it easier to add new Lucene sources #1479
Conversation
…cument retrieval to make it easier to add new Lucene document sources.

I'm also in the process of adding new parameters -

1. There are parameters for the coordination cluster, of the same type as those we use for source and target, so that the target cluster doesn't need to be used for coordination. A wholly independent cluster can be set up just for coordination. There's a patch on the argo branch to support flexible index names too, so one could imagine running many deployments with the same coordination index.
2. snapshotLoader* is in place, though only partially implemented, to allow alternate ways to get shard info and create document streams. I intend to load the specified class via ServiceLoader, similar to how Transformations work today.
3. initialSubdivision hasn't been implemented yet, but I think it would be pretty easy and super useful, especially for the case where users want to reshard an index. Rather than using a single worker at a time for a source shard, this would pull the shard down to find the number of documents, then split the work up into that many shards so that the migration could be completed faster by that factor (since there wouldn't be idling target shards). This should be easy to implement by adding an end document index so that we can split a shard in parallel. It would also be easy to detect the first work item, since the initial one would have a -1 for its end index.

Everything else was just shuffling old code around into constructors and the newly created ElasticsearchSnapshotDocumentRepository, which wraps up the S3-centric objects and metadata that RFS had been using beforehand. The EncryptedShardBucketCollectionDocumentRepository also has stub code that won't do anything. Since we don't have a spec for reading encrypted files yet, this will be hived off, to be provided via a hot-loaded jar at some point in the future.

Signed-off-by: Greg Schohn <[email protected]>
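The initialSubdivision idea described above could be sketched roughly as follows. This is a hypothetical illustration, not code from the PR: the `ShardWorkItem` class and its fields are invented here to show how a -1 end index could mark the first, un-subdivided work item, and how counting documents lets the shard be split into parallel ranges.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a work item covering a document range within a shard.
// An end index of -1 marks the initial, not-yet-subdivided item for the shard.
class ShardWorkItem {
    final String indexName;
    final int shard;
    final int startDoc;
    final int endDoc; // exclusive; -1 means "whole shard, not yet subdivided"

    ShardWorkItem(String indexName, int shard, int startDoc, int endDoc) {
        this.indexName = indexName;
        this.shard = shard;
        this.startDoc = startDoc;
        this.endDoc = endDoc;
    }

    boolean isInitial() {
        return endDoc == -1;
    }

    // After downloading the shard and counting its documents, split the
    // initial item into roughly equal ranges so that many workers can
    // migrate a single source shard in parallel.
    List<ShardWorkItem> subdivide(int totalDocs, int subdivisions) {
        List<ShardWorkItem> items = new ArrayList<>();
        int perItem = (totalDocs + subdivisions - 1) / subdivisions; // ceiling division
        for (int start = 0; start < totalDocs; start += perItem) {
            int end = Math.min(start + perItem, totalDocs);
            items.add(new ShardWorkItem(indexName, shard, start, end));
        }
        return items;
    }
}
```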
High level, this looks good, thanks for cleaning up some of the snapshot reading interface.
@Parameter(required = true,
@Parameter(required = false,
    names = {"--snapshotLoader"},
    description = "Hot-loaded class that should be used to download the lucene directory of a shard")
'Hot loading' is a bit of a misnomer; it terrified me a little when first looking at the code, since it implies a class loader, so I'm glad to see that isn't the case. IMO this is a good candidate for future expansion as needed.
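The commit message above mentions loading the specified class via ServiceLoader, similar to how Transformations are discovered today. A minimal sketch of that pattern, assuming a hypothetical `SnapshotLoader` SPI interface (not the project's actual type), might look like this:

```java
import java.util.ServiceLoader;

// Hypothetical SPI interface; implementations would be registered in a
// META-INF/services/SnapshotLoader file inside their jar.
interface SnapshotLoader {
    String name();
}

class SnapshotLoaderResolver {
    // Scan the classpath for registered SnapshotLoader implementations and
    // pick the one whose name matches the CLI argument; null if none match
    // (or if no providers are registered at all).
    static SnapshotLoader resolve(String requestedName) {
        for (SnapshotLoader loader : ServiceLoader.load(SnapshotLoader.class)) {
            if (loader.name().equals(requestedName)) {
                return loader;
            }
        }
        return null;
    }
}
```

With this approach no custom class loader is needed; dropping a new jar on the classpath is enough for `ServiceLoader.load` to find its registered implementations.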
@Parameter(required = false,
    names = { "--initial-subdivision", "--initialSubdivision" },
    converter = VersionConverter.class,
That doesn't seem correct
import org.opensearch.migrations.cluster.ClusterProviderRegistry;
import org.opensearch.migrations.cluster.ClusterSnapshotReader;

public class ElasticsearchSnapshotDocumentRepository extends LuceneBasedDocumentRepository {
Nit: We seem to be using SearchCluster for ES or OS
public class ElasticsearchSnapshotDocumentRepository extends LuceneBasedDocumentRepository {
public class SearchClusterSnapshotDocumentRepository extends LuceneBasedDocumentRepository {
@Override
public LuceneIndexReader getReader(String index, int shard) {
    return new IndexReader9(downloadManager.downloadToLocalPath(index, shard), false, null);
Before merging, let's rationalize this with the version matrix used in LuceneIndexReader.
}
@Getter
public static class CoordinatorArgs implements ConnectionContext.IParams {
This looks identical to the existing target args, what is the use case for this over the target?
…o a process-oriented one. In this case, a command and some config parameter are passed to Process.start(). As the code stands, that command will be called with different conventions: different command arguments, as well as additional arguments to that command (i.e. indexName, or indexName and shard), to fulfill the 4 different operations that the LuceneBasedDocumentRepository needs. I've not built out a program that fulfills this contract yet, so an end-to-end test through this is still pending.

Signed-off-by: Greg Schohn <[email protected]>
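The process-oriented contract described above could be sketched as follows. All names here (`ExternalShardCommand`, the argument layout, the operation names) are illustrative assumptions, not the PR's actual API; the point is only that one configured command is invoked with an operation name plus positional arguments such as indexName, or indexName and shard.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a configured external command fulfills the
// repository's operations when invoked with an operation name and
// per-operation positional arguments.
class ExternalShardCommand {
    private final String command;
    private final String configArg;

    ExternalShardCommand(String command, String configArg) {
        this.command = command;
        this.configArg = configArg;
    }

    // Build the argv for one of the repository operations, e.g.
    // buildArgv("downloadShard", indexName, String.valueOf(shard)).
    List<String> buildArgv(String operation, String... operationArgs) {
        List<String> argv = new ArrayList<>();
        argv.add(command);
        argv.add(configArg);
        argv.add(operation);
        for (String arg : operationArgs) {
            argv.add(arg);
        }
        return argv;
    }

    // Launch the command via ProcessBuilder and return the first line of
    // its stdout (e.g. a local path where a shard was downloaded).
    String runAndReadFirstLine(String operation, String... operationArgs) throws Exception {
        Process process = new ProcessBuilder(buildArgv(operation, operationArgs)).start();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(process.getInputStream()))) {
            String line = reader.readLine();
            process.waitFor();
            return line;
        }
    }
}
```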
import lombok.extern.slf4j.Slf4j;

@Slf4j
public class ManuallyManagedLuceneShardRepository extends LuceneBasedDocumentRepository {
I'm not sure what to make of this class; it seems to belong in its own top-level directory, since it appears to be an independent tool. What do you think of adding an explanation / data-flow diagram of how this will be used, and in what case, in conjunction with the Metadata/RFS workflow?
@Override
public List<Index> getIndicesInSnapshot(String snapshotName) {
public List<Index> getIndicesInSnapshot() {
Really appreciate this refactor. For your consideration: break out the 'fix' for the snapshot name into its own PR, so we could fast-track merging that change and isolate it from the other work you are doing.
@gregschohn This has been open for more than a month; need any help with next steps, or should this be closed out?
I connected with @gregschohn; we aligned to close out this PR and focus on a different way to cut this in https://opensearch.atlassian.net/browse/MIGRATIONS-2564
Description
WIP - I've refactored snapshot handling throughout shard setup and document retrieval to make it easier to add new Lucene document sources.
I'm also in the process of adding new parameters -
Everything else was just shuffling old code around into constructors and the newly created ElasticsearchSnapshotDocumentRepository, which wraps up the S3-centric objects and metadata that RFS had since been using beforehand. The EncryptedShardBucketCollectionDocumentRepository also has stub code that won't do anything. Since we don't have a spec for reading encrypted files yet, this will be hived off, to be provided via a hot-loaded jar at some point in the future.
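Taken together, the refactor described above makes the repository the extension point for new Lucene sources. The following is a simplified guess at the shape of that abstraction, not the PR's actual API: class names, method signatures, and the toy S3-style path are all invented here to illustrate how a snapshot-backed implementation and future sources could plug into the same base type.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the repository abstraction: subclasses supply shard
// data from different sources (S3 snapshots, external processes, encrypted
// buckets) behind a common interface.
abstract class DocumentRepositorySketch {
    // Enumerate the indices available from this source.
    abstract List<String> getIndicesInSnapshot();

    // Describe where one shard's Lucene directory would come from.
    abstract String describeShardSource(String index, int shard);
}

// A toy snapshot-backed implementation, standing in for the
// ElasticsearchSnapshotDocumentRepository introduced by this PR.
class ToySnapshotRepository extends DocumentRepositorySketch {
    @Override
    List<String> getIndicesInSnapshot() {
        return Arrays.asList("logs-2023", "metrics");
    }

    @Override
    String describeShardSource(String index, int shard) {
        return "s3://bucket/snapshot/" + index + "/" + shard;
    }
}
```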
Issues Resolved
Testing
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.