[BUG] RFS backfill not considering shard routing from source cluster #1162

@kanatti

Description

What is the bug?

In Elasticsearch and OpenSearch, when you index a document, you can specify an optional routing value, which routes the document to a specific shard. A search request that supplies the same routing value can then be sent to just the matching shard instead of fanning out to every shard in the index.
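To illustrate why routing matters for shard placement, here is a minimal sketch of the documented idea that the shard is chosen from a hash of the routing value modulo the number of primary shards, with the document ID used as the routing value when none is given. The hash below (`zlib.crc32`) is only a stand-in and an assumption for illustration; the real engines use their own hash function and a routing factor, so the shard numbers this prints will not match a real cluster. The function names are hypothetical.

```python
import zlib
from typing import Optional


def shard_for(key: str, num_primary_shards: int) -> int:
    # Stand-in hash (assumption): real clusters use a different hash,
    # so only the "same key -> same shard" property carries over.
    return zlib.crc32(key.encode("utf-8")) % num_primary_shards


def placement(doc_id: str, routing: Optional[str], num_primary_shards: int) -> int:
    # With a routing value, the routing value decides the shard.
    # Without one, the document ID is used instead.
    key = routing if routing is not None else doc_id
    return shard_for(key, num_primary_shards)


if __name__ == "__main__":
    shards = 5
    for doc_id in ("1", "2", "3"):
        with_routing = placement(doc_id, "user-a", shards)
        without_routing = placement(doc_id, None, shards)
        print(doc_id, "routed:", with_routing, "unrouted:", without_routing)
```

Documents that share a routing value always land on the same shard, while the same documents indexed without routing are placed by ID and may end up elsewhere. That is the mismatch a routed search would hit after a migration that drops routing.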

RFS currently doesn't consider shard routing from the source cluster. As a result, the target indices are backward incompatible if the source index was built with _routing and search queries rely on routing.

How can one reproduce the bug?

  1. Index a few documents in the source cluster with routing.
  2. Do a search with explain=true to show the _routing metadata per doc.
  3. Migrate to the target with RFS.
  4. Do a search with explain=true on the target OpenSearch cluster. _routing won't be set. You will also notice that docs that were within a single shard have now been shuffled on the target.
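The steps above could be driven with requests along these lines. This is a sketch that only builds the method, path, and body for each call without sending anything; the index name `routing-demo`, the document contents, and the helper names are assumptions made up for illustration. The `routing` and `explain` query parameters are the standard ones on the index and search APIs.

```python
import json
from urllib.parse import urlencode

INDEX = "routing-demo"  # hypothetical index name


def index_request(doc_id, routing, source):
    # PUT /<index>/_doc/<id>?routing=<value> indexes a doc with custom routing.
    path = f"/{INDEX}/_doc/{doc_id}?" + urlencode({"routing": routing})
    return ("PUT", path, json.dumps(source))


def search_request(routing=None):
    # explain=true adds per-hit details; routing (optional) limits the
    # search to the shard that the routing value maps to.
    params = {"explain": "true"}
    if routing is not None:
        params["routing"] = routing
    body = json.dumps({"query": {"match_all": {}}})
    return ("GET", f"/{INDEX}/_search?" + urlencode(params), body)


if __name__ == "__main__":
    steps = [
        index_request("1", "user-a", {"msg": "first"}),
        index_request("2", "user-a", {"msg": "second"}),
        index_request("3", "user-b", {"msg": "third"}),
        search_request(),          # run on source, then again on target
        search_request("user-a"),  # routed search that relies on _routing
    ]
    for method, path, body in steps:
        print(method, path, body)
```

Running the two search requests against the source before migration and against the target afterwards should make the missing _routing values visible.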

What is the expected behavior?

RFS should use the routing from the source when bulk indexing into the target.

What is your host/environment?

N/A

Do you have any screenshots?

No

Do you have any additional context?

Suggested solution:
The _routing field should be available within the Lucene document and can be set per document when calling the bulk API.
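As a rough sketch of what setting routing per document in a bulk request could look like, the snippet below builds a newline-delimited bulk body and adds a `routing` key to the action metadata only when the source document had one. The input dict shape (`id`, `routing`, `source`) and the function name are assumptions for illustration and do not reflect RFS's actual classes or the eventual fix.

```python
import json


def bulk_body(index, docs):
    # Each doc contributes two lines: the action metadata, then the source.
    lines = []
    for doc in docs:
        action = {"_index": index, "_id": doc["id"]}
        # Carry over routing only if the source document had it.
        if doc.get("routing") is not None:
            action["routing"] = doc["routing"]
        lines.append(json.dumps({"index": action}))
        lines.append(json.dumps(doc["source"]))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    docs = [
        {"id": "1", "routing": "user-a", "source": {"msg": "first"}},
        {"id": "2", "routing": None, "source": {"msg": "second"}},
    ]
    print(bulk_body("routing-demo", docs), end="")
```

Leaving the key out for unrouted documents keeps their default placement by ID unchanged on the target.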

I have a working fix and will raise a pull request soon.
