Optimize the no. of refreshes when refresh=true during bulk, _update, _update_by_query, _delete_by_query, delete operations #15263

Open
@shwetathareja

Description

Is your feature request related to a problem? Please describe

OpenSearch has multiple APIs, such as bulk, _update, _update_by_query, _delete_by_query, and delete, which allow refreshing the affected shards to make the operation visible to search. Refresh is, in general, a costly operation, and too many refreshes can create many small segments, which then adds the overhead of merging those segments. The user can provide refresh=true/false/wait_for. false and wait_for do not cause any extra refreshes; refreshes are triggered based on the configured refresh_interval. But when refresh=true is set, every request results in enqueuing a refresh, which can lead to too many refreshes.
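For reference, the refresh policy is passed as a query parameter on the request. The sketch below only builds the URL for a bulk request; the helper name `bulk_url`, the host `http://localhost:9200`, and the index name `my-index` are hypothetical and chosen for illustration.

```python
from urllib.parse import urlencode

# Values accepted by the `refresh` query parameter on the bulk API.
VALID_REFRESH = {"true", "false", "wait_for"}


def bulk_url(base_url: str, index: str, refresh: str = "false") -> str:
    """Build a _bulk URL with the given refresh policy (hypothetical helper)."""
    if refresh not in VALID_REFRESH:
        raise ValueError(f"unsupported refresh value: {refresh!r}")
    return f"{base_url}/{index}/_bulk?{urlencode({'refresh': refresh})}"


# Host and index name are assumptions for this example.
url = bulk_url("http://localhost:9200", "my-index", "true")
print(url)  # http://localhost:9200/my-index/_bulk?refresh=true
```

Every request sent to a URL like this one asks for its own refresh, which is the behaviour this issue proposes to optimize.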

Describe the solution you'd like

Explore the option to either batch these refreshes or make a refresh request a no-op if the local checkpoint that was last refreshed is at or above the sequence numbers of the documents for which the refresh was requested. This would reduce the overall number of refreshes triggered when the user explicitly provides refresh=true.
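The checkpoint comparison could look roughly like the toy model below. It is a sketch under simplifying assumptions, not OpenSearch code: the class `ShardRefreshTracker` and its fields are hypothetical, sequence numbers are assigned in order with no gaps, and a refresh is assumed to make everything indexed so far searchable.

```python
from dataclasses import dataclass


@dataclass
class ShardRefreshTracker:
    """Toy model of a single shard (hypothetical, not an OpenSearch class)."""
    processed_checkpoint: int = -1  # highest sequence number indexed so far
    refreshed_checkpoint: int = -1  # checkpoint covered by the last refresh
    refresh_count: int = 0          # refreshes that actually ran

    def index(self) -> int:
        """Index one document and return its sequence number."""
        self.processed_checkpoint += 1
        return self.processed_checkpoint

    def refresh_if_needed(self, max_seq_no: int) -> bool:
        """Refresh only if max_seq_no is not yet covered by a past refresh."""
        if max_seq_no <= self.refreshed_checkpoint:
            return False  # already searchable: treat the request as a no-op
        # Assumption: a refresh exposes everything indexed up to this point.
        self.refreshed_checkpoint = self.processed_checkpoint
        self.refresh_count += 1
        return True


shard = ShardRefreshTracker()
# Three refresh=true requests are indexed before any refresh has executed.
pending = [shard.index() for _ in range(3)]
results = [shard.refresh_if_needed(seq_no) for seq_no in pending]
print(results, shard.refresh_count)  # [True, False, False] 1
```

In this model the first queued refresh covers all three requests, so the other two are skipped. A real implementation would also need to handle concurrent writers and out-of-order sequence numbers, which this sketch ignores.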

Related component

Indexing:Performance

Describe alternatives you've considered

No response

Additional context

No response
