Add documentation for failure stores. #1368


Draft · wants to merge 17 commits into main
Conversation

@jbaiera (Member) commented May 5, 2025

Adds a new section to the documentation to explain new failure store functionality.

Preview:
https://docs-v3-preview.elastic.dev/elastic/docs-content/pull/1368/manage-data/data-store/data-streams/failure-store

Work in progress:

  • Recipes section is currently TBD and open for suggestions.
  • Most links are not complete and need updating from "???".


If you have a large number of existing data streams, you may want an easier way to control whether failures should be redirected. Instead of enabling the failure store on each data stream with the [put data stream options](./failure-store.md) API, you can configure a set of patterns in the [cluster settings](./failure-store.md) that enable the failure store feature by default.

Configure a list of patterns using the `data_streams.failure_store.enabled` dynamic cluster setting. If a data stream matches a pattern in this setting and does not have the failure store explicitly disabled in its options, the failure store defaults to enabled for that data stream.
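As a sketch, enabling the failure store by default for all data streams matching a pattern might look like the following; the `logs-*` pattern is illustrative, and the exact value format should be verified against the cluster settings reference:

```console
PUT _cluster/settings
{
  "persistent": {
    "data_streams.failure_store.enabled": "logs-*"
  }
}
```

Because the setting is dynamic, removing the pattern later (for example by setting the value to `null`) reverts matching data streams to their default behavior.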
Member:

The documentation here should mention whether this setting applies retroactively to pre-existing data streams, or only on data stream creation.

Member:

I was wondering that as well when reading this.

> then the failure store will default to being enabled for that matching data stream

Makes it sound like it's not applying to existing data streams, just acting as a default.

> If you have a large number of existing data streams [...] you can instead configure a set of patterns in the cluster settings

Makes it sound like the setting is an alternative to enabling the failure store one by one via the DS options API if you have a large number of existing data streams.

Member Author:

Yeah, I tried to phrase this in a way that made it not seem like setting the property was the same as toggling the feature on permanently. Matching data streams only enable their failure stores so long as they are not explicitly disabled in the options, and only for as long as they match the setting.

I've put up an edit that should simplify the explanation a bit. I also added an example of the explicit disabling of the failure store overriding the cluster setting.
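A sketch of that explicit override, assuming the put data stream options API accepts a `failure_store.enabled` flag and using a hypothetical data stream name:

```console
PUT _data_stream/my-data-stream/_options
{
  "failure_store": {
    "enabled": false
  }
}
```

With this in place, `my-data-stream` keeps its failure store disabled even if it matches a pattern in the `data_streams.failure_store.enabled` cluster setting.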




### Add and remove from failure store [manage-failure-store-indices]

You can add indices to and remove indices from a failure store using the [modify data stream](./failure-store.md) API.
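For illustration, adding an existing index to a data stream's failure store might look like the following; the `failure_store` flag on the action is an assumption based on the discussion here, so verify the exact action and field names against the modify data stream API reference:

```console
POST _data_stream/_modify
{
  "actions": [
    {
      "add_backing_index": {
        "data_stream": "my-data-stream",
        "index": "my-failures-index",
        "failure_store": true
      }
    }
  ]
}
```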
Member:

What happens if you add a failure store backing index that has incompatible mappings? Are we doing validations when adding a backing index, or would it fail at runtime? I'm not suggesting one way is better than the other, but maybe we should describe what happens here to set expectations.

Member Author:

There's no special handling for mappings when adding an index to the failure store. You could add a completely unrelated index to the failure store and we allow it. Indices that are added to a data stream can never be treated as a write index, so we're less worried about their mappings than when doing a rollover operation. Even if the failure store is empty and we add a random index, the failure store is still marked for lazy rollover and will create a write index on redirection.

Labels: documentation (Improvements or additions to documentation), wip
3 participants