[WIP] Failure store - Lifecycle Management #125658
Closed
…ure-store/lifecycle
elasticsearchmachine pushed a commit that referenced this pull request Apr 23, 2025
The class `DataStreamLifecycle` is currently capturing the lifecycle configuration that currently manages all data stream indices, but soon enough it will be split into two variants, the data and the failures lifecycle. Some pre-work has been done already but as we are progressing in our POC, we see that it will be really useful if the `DataStreamLifecycle` is "aware" of the target index component. This will allow us to correctly apply global retention or to throw an error if a downsampling configuration is provided to a failure lifecycle.
In this PR, we perform a small refactoring to reduce the noise in #125658. Here we introduce the following:
- A factory method that creates a data lifecycle, for now it's trivial but it will be more useful soon.
- We rename the "empty" builder to explicitly mention the index component it refers to.
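A minimal sketch of what a component-aware lifecycle with factory methods could look like. This is not the actual `DataStreamLifecycle` code: the names `LifecycleSketch`, `Target`, `dataLifecycle`, and `failuresLifecycle`, the day-based retention, and the string-based downsampling rounds are all assumptions made for illustration.

```java
import java.util.List;

// Hypothetical, simplified model; the real DataStreamLifecycle has more fields and different types.
final class LifecycleSketch {

    /** The index component a lifecycle manages (assumed name). */
    enum Target { DATA, FAILURES }

    /** A lifecycle that knows its target component; a null retention means "not set". */
    record Lifecycle(Target target, Integer retentionDays, List<String> downsamplingRounds) {
        Lifecycle {
            downsamplingRounds = downsamplingRounds == null ? List.of() : List.copyOf(downsamplingRounds);
            // Because the lifecycle knows its target, it can reject downsampling for failures.
            if (target == Target.FAILURES && !downsamplingRounds.isEmpty()) {
                throw new IllegalArgumentException("a failures lifecycle does not support downsampling");
            }
        }
    }

    /** Factory for a lifecycle that manages the data (backing) indices. */
    static Lifecycle dataLifecycle(Integer retentionDays, List<String> downsamplingRounds) {
        return new Lifecycle(Target.DATA, retentionDays, downsamplingRounds);
    }

    /** Factory for a lifecycle that manages the failure indices; it never carries downsampling. */
    static Lifecycle failuresLifecycle(Integer retentionDays) {
        return new Lifecycle(Target.FAILURES, retentionDays, List.of());
    }

    public static void main(String[] args) {
        System.out.println(dataLifecycle(30, List.of("1h")));
        System.out.println(failuresLifecycle(7));
    }
}
```

Routing construction through factories keeps the target component explicit at every call site, so validation that depends on the component lives in one place.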
gmarouli added a commit to gmarouli/elasticsearch that referenced this pull request Apr 23, 2025
The class `DataStreamLifecycle` is currently capturing the lifecycle configuration that currently manages all data stream indices, but soon enough it will be split into two variants, the data and the failures lifecycle. Some pre-work has been done already but as we are progressing in our POC, we see that it will be really useful if the `DataStreamLifecycle` is "aware" of the target index component. This will allow us to correctly apply global retention or to throw an error if a downsampling configuration is provided to a failure lifecycle.
In this PR, we perform a small refactoring to reduce the noise in elastic#125658. Here we introduce the following:
- A factory method that creates a data lifecycle, for now it's trivial but it will be more useful soon.
- We rename the "empty" builder to explicitly mention the index component it refers to.
(cherry picked from commit b991708)
# Conflicts:
# modules/data-streams/src/test/java/org/elasticsearch/datastreams/lifecycle/DataStreamLifecycleServiceTests.java
# server/src/test/java/org/elasticsearch/cluster/metadata/MetadataDataStreamsServiceTests.java
# server/src/test/java/org/elasticsearch/cluster/metadata/MetadataIndexTemplateServiceTests.java
gmarouli added a commit that referenced this pull request Apr 23, 2025
The class `DataStreamLifecycle` is currently capturing the lifecycle configuration that currently manages all data stream indices, but soon enough it will be split into two variants, the data and the failures lifecycle. Some pre-work has been done already but as we are progressing in our POC, we see that it will be really useful if the `DataStreamLifecycle` is "aware" of the target index component. This will allow us to correctly apply global retention or to throw an error if a downsampling configuration is provided to a failure lifecycle.
In this PR, we perform a small refactoring to reduce the noise in #125658. Here we introduce the following:
- A factory method that creates a data lifecycle, for now it's trivial but it will be more useful soon.
- We rename the "empty" builder to explicitly mention the index component it refers to.
(cherry picked from commit b991708)
# Conflicts:
# modules/data-streams/src/test/java/org/elasticsearch/datastreams/lifecycle/DataStreamLifecycleServiceTests.java
# server/src/test/java/org/elasticsearch/cluster/metadata/MetadataDataStreamsServiceTests.java
# server/src/test/java/org/elasticsearch/cluster/metadata/MetadataIndexTemplateServiceTests.java
Labels
:Data Management/Data streams
Data streams and their lifecycles
>non-issue
serverless-linked
Added by automation, don't add manually
Team:Data Management
Meta label for data/management team
v9.1.0
The failure store is a set of data stream indices that store certain types of ingestion failures. Until now, these indices shared the configuration of the backing indices, but the two data sets have different lifecycle needs.
We believe that failures will typically need to be retained for much less time than the data. Given this, the lifecycle needs of the failures are also more limited, and they are a better fit for the simplicity of the data stream lifecycle feature.
This allows the user to set only the desired retention, and we will perform the rollover and other maintenance tasks without the user having to think about them. Furthermore, having only one lifecycle management feature allows us to ensure that this data is managed by default.
This PR introduces the following:
Configuration
We extend the failure store configuration to allow lifecycle configuration too. This configuration reflects only what the user has configured.
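To illustrate the difference between the stored user configuration and the effective one, here is a minimal sketch. The types `FailureStoreOptions` and `FailuresLifecycle`, the day-based retention, and the fallback to a cluster-wide default are assumptions made for illustration; they are not the classes or the resolution logic in this PR.

```java
import java.util.Optional;

// Hypothetical, simplified model; names and fields are assumptions, not the Elasticsearch classes.
final class FailureStoreConfigSketch {

    /** The lifecycle as the user provided it: retention may be absent. */
    record FailuresLifecycle(boolean enabled, Optional<Integer> retentionDays) {}

    /** Failure store options: an enabled flag plus an optional lifecycle section. */
    record FailureStoreOptions(boolean enabled, Optional<FailuresLifecycle> lifecycle) {}

    /**
     * Resolves the retention that would actually apply. Falling back to a cluster-wide
     * default when the user set nothing is an assumption made for this sketch.
     */
    static Optional<Integer> effectiveRetentionDays(FailureStoreOptions options, Optional<Integer> defaultRetentionDays) {
        Optional<Integer> userRetention = options.lifecycle().flatMap(FailuresLifecycle::retentionDays);
        return userRetention.or(() -> defaultRetentionDays);
    }

    public static void main(String[] args) {
        FailureStoreOptions explicit = new FailureStoreOptions(
            true, Optional.of(new FailuresLifecycle(true, Optional.of(7))));
        FailureStoreOptions unset = new FailureStoreOptions(true, Optional.empty());
        System.out.println(effectiveRetentionDays(explicit, Optional.of(30)));
        System.out.println(effectiveRetentionDays(unset, Optional.of(30)));
    }
}
```

Keeping the stored options free of derived values means a later change to the defaults does not require rewriting any user configuration.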
To retrieve the effective configuration you need to use the `GET` data streams API, see #126668.
Functionality
The `GET` data stream API should be used to check the current state of the effective failure store configuration.
Telemetry
We extend the data stream failure store telemetry to also include the lifecycle telemetry.