Fix per-message MongoDB query caused by TimeStampConfig null cache issue #25344

Merged
danotorrey merged 5 commits into master from fix/timestamp-config-cache-null-bug on Mar 17, 2026

@danotorrey danotorrey commented Mar 17, 2026

Summary

Saving the Message Processors configuration causes MongoDB (cluster_config collection) to be queried on every ingested message, indefinitely. The grace period cache in ProcessBufferProcessor uses null to mean both "not yet loaded" and "loaded value was null." When the grace period is null (normalization disabled, the default), the cache can never store the result, so every message triggers a fresh MongoDB read.

This appears to have been introduced when PR #24686 changed the TimeStampConfig default from a non-null sentinel value (a very large duration) to null. Once the default was null, the cache could no longer distinguish a loaded null from "not loaded yet."
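
To make the failure mode concrete, here is a minimal sketch of a cache that uses null as its "not loaded" marker. It is not the actual ProcessBufferProcessor code: the class name, the `loader` supplier (standing in for the cluster_config read), and the `loads` counter are all assumptions made for illustration.

```java
import java.time.Duration;
import java.util.function.Supplier;

// Hypothetical stand-in for the pre-fix cache; not Graylog's actual code.
final class NullSentinelGracePeriodCache {
    private final Supplier<Duration> loader; // stands in for the cluster_config read
    private Duration cached;                 // null means "not loaded" AND "loaded as null"
    int loads;                               // counts simulated MongoDB reads

    NullSentinelGracePeriodCache(Supplier<Duration> loader) {
        this.loader = loader;
    }

    Duration get() {
        if (cached == null) {   // stays true forever when the loader returns null
            loads++;
            cached = loader.get();
        }
        return cached;
    }
}
```

With a loader that returns a non-null duration, `get()` reads once and then serves the cached value. With a loader that returns null, every call to `get()` goes back to the loader, which mirrors the per-message query described above.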

This fix will need to be backported to 7.0.

Closes https://github.com/Graylog2/graylog-plugin-enterprise/issues/13535

Changes

  • ProcessBufferProcessor.java: Added a gracePeriodLoaded boolean flag so the cache can distinguish "not loaded" from "loaded as null" (backwards-compatible handling)
  • MessageTimestampTest.java: Added 8 new tests covering null caching, transitions between enabled/disabled, and high-volume scenarios
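
The flag-based approach from the first bullet can be sketched as follows. This is an illustrative reduction under stated assumptions, not the merged code: only the `gracePeriodLoaded` name comes from this PR, while the class name, the `loader` supplier, the `loads` counter, and the `invalidate()` method are hypothetical. Thread safety is deliberately left out.

```java
import java.time.Duration;
import java.util.function.Supplier;

// Hypothetical sketch of a cache that can hold a loaded null; not Graylog's actual code.
final class FlaggedGracePeriodCache {
    private final Supplier<Duration> loader; // stands in for the cluster_config read
    private Duration gracePeriod;            // may legitimately be null (normalization disabled)
    private boolean gracePeriodLoaded;       // true once the loader has run since the last invalidation
    int loads;                               // counts simulated MongoDB reads

    FlaggedGracePeriodCache(Supplier<Duration> loader) {
        this.loader = loader;
    }

    Duration get() {
        if (!gracePeriodLoaded) {            // decided by the flag, never by the value
            loads++;
            gracePeriod = loader.get();
            gracePeriodLoaded = true;
        }
        return gracePeriod;
    }

    void invalidate() {
        gracePeriodLoaded = false;
        gracePeriod = null;
    }
}
```

Because the loaded state is tracked separately from the value, a null result is cached like any other value, and only an explicit invalidation causes another read.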

Fixes

🤖 Generated with Claude Code

danotorrey and others added 3 commits March 16, 2026 21:26
The grace period cache in ProcessBufferProcessor uses null to mean both
"not yet loaded" and "loaded value was null." When the grace period is
null (normalization disabled, the default), the cache can never store
the result, so every message triggers a fresh MongoDB read.

- Add gracePeriodLoaded boolean flag to distinguish not-loaded from null
- Delete TimeStampConfig document when normalization is disabled instead
  of writing one with a null payload
- Add remove action to ConfigurationsStore for cluster config deletion

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@danotorrey danotorrey changed the title from "Fix per-message MongoDB query caused by TimeStampConfig null cache bug" to "Fix per-message MongoDB query caused by TimeStampConfig null cache issue" Mar 17, 2026

Copilot AI left a comment

Pull request overview

Fixes an issue where TimeStampConfig being null caused ProcessBufferProcessor to re-query MongoDB on every ingested message by making the grace-period cache able to represent “loaded null”.

Changes:

  • Server: cache TimeStampConfig.gracePeriod() correctly even when it is null, and invalidate cache on cluster-config change events.
  • Web UI: when timestamp normalization is disabled, delete the TimeStampConfig cluster config entry instead of writing a null payload; add a remove action to support this.
  • Tests/changelog: add regression tests covering null caching and update UI tests + changelog entry.
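
The invalidation step in the first bullet could look roughly like the sketch below. Graylog's real ClusterConfigChangedEvent and its event-bus wiring are not reproduced here: `ConfigChangedEvent`, its `type` field, the `WATCHED_TYPE` value, and the `onConfigChanged` handler are hypothetical stand-ins, and the sketch is single-threaded.

```java
import java.time.Duration;
import java.util.function.Supplier;

// Hypothetical event type; the real ClusterConfigChangedEvent is not reproduced here.
final class ConfigChangedEvent {
    final String type;

    ConfigChangedEvent(String type) {
        this.type = type;
    }
}

// Hypothetical cache that reloads only after a matching change event.
final class InvalidatingGracePeriodCache {
    static final String WATCHED_TYPE = "TimeStampConfig"; // assumed identifier for illustration

    private final Supplier<Duration> loader; // stands in for the cluster_config read
    private Duration gracePeriod;            // may be null when normalization is disabled
    private boolean gracePeriodLoaded;
    int loads;                               // counts simulated MongoDB reads

    InvalidatingGracePeriodCache(Supplier<Duration> loader) {
        this.loader = loader;
    }

    Duration get() {
        if (!gracePeriodLoaded) {
            loads++;
            gracePeriod = loader.get();
            gracePeriodLoaded = true;
        }
        return gracePeriod;
    }

    // In a real system this would be registered as an event-bus subscriber.
    void onConfigChanged(ConfigChangedEvent event) {
        if (WATCHED_TYPE.equals(event.type)) {
            gracePeriodLoaded = false;       // next get() reloads, even if the new value is null
        }
    }
}
```

Events for unrelated config types leave the cached value in place, so the number of reads tracks the number of relevant configuration changes rather than the number of messages.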

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file
| File | Description |
| --- | --- |
| graylog2-server/src/main/java/org/graylog2/shared/buffers/processors/ProcessBufferProcessor.java | Fixes grace-period caching so null can be cached without repeated cluster-config reads. |
| graylog2-server/src/test/java/org/graylog2/shared/buffers/processors/MessageTimestampTest.java | Adds tests verifying caching behavior for null/non-null and invalidation transitions. |
| graylog2-web-interface/src/stores/configurations/ConfigurationsStore.ts | Adds remove action to delete cluster config entries from the UI. |
| graylog2-web-interface/src/components/configurations/message-processors/ProcessingConfigModalForm.tsx | Switches disabled normalization behavior to delete the cluster config entry. |
| graylog2-web-interface/src/components/configurations/message-processors/ProcessingConfigModalForm.test.tsx | Updates tests to expect deletion instead of writing grace_period: undefined. |
| changelog/unreleased/pr-25344.toml | Documents the fix in the changelog. |

@kodjo-anipah kodjo-anipah left a comment

Backend looks good, but on the frontend side I think we should just update it with the unfortunate default, which is null, so that we don't need to worry about delete permissions on an update.

@danotorrey danotorrey commented

> backend looks good, but I think on the frontend side, we should just update it with the unfortunate default that is null, so that we don't need to worry about delete permissions on an update.

@kodjo-anipah Thanks for the feedback. Good point. Since the backend already handles that case, we can just keep that existing default of null. Makes sense. I'll work that in.

@danotorrey danotorrey commented

@kodjo-anipah I went ahead and reverted all frontend changes, since the backend changes should now handle the null gracePeriod condition for backwards compatibility. Let me know if you see any other issues.

Copilot AI left a comment

Pull request overview

Fixes an ingestion-time performance regression where TimeStampConfig being null caused ProcessBufferProcessor to query MongoDB (cluster_config) for every ingested message indefinitely, by making the grace-period cache able to distinguish “not loaded yet” from “loaded value is null”. The web UI is also adjusted to delete the TimeStampConfig document when future timestamp normalization is disabled.

Changes:

  • Server: cache null grace period values correctly in ProcessBufferProcessor and invalidate the cache on ClusterConfigChangedEvent.
  • Server: expand MessageTimestampTest to verify caching behavior for null, transitions, and repeated calls.
  • Web UI: add a cluster-config remove action and use it to delete TimeStampConfig when normalization is disabled.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

Show a summary per file
| File | Description |
| --- | --- |
| graylog2-web-interface/src/stores/configurations/ConfigurationsStore.ts | Adds remove(configType) action + store handler to DELETE cluster-config entries. |
| graylog2-web-interface/src/components/configurations/message-processors/ProcessingConfigModalForm.tsx | Switches from updating TimeStampConfig with undefined to deleting it when disabled. |
| graylog2-web-interface/src/components/configurations/message-processors/ProcessingConfigModalForm.test.tsx | Updates tests to expect deletion (remove) when normalization is disabled. |
| graylog2-server/src/main/java/org/graylog2/shared/buffers/processors/ProcessBufferProcessor.java | Fixes grace-period caching so null is cached and doesn’t trigger per-message DB reads. |
| graylog2-server/src/test/java/org/graylog2/shared/buffers/processors/MessageTimestampTest.java | Adds tests proving null/non-null caching and invalidation behavior. |
| changelog/unreleased/pr-25344.toml | Adds changelog entry documenting the fix. |

@kodjo-anipah kodjo-anipah left a comment

LGTM, thanks for fixing this @danotorrey

The backend fix (caching null properly in ProcessBufferProcessor) is
sufficient on its own. The frontend delete approach introduced
unnecessary permission concerns per reviewer feedback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
danotorrey added a commit that referenced this pull request Mar 17, 2026
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@danotorrey danotorrey commented

I had forgotten to push the commit that reverted the frontend changes. Done. Merging now, and backport is up: #25350

@danotorrey danotorrey merged commit a11f9f5 into master Mar 17, 2026
23 checks passed
@danotorrey danotorrey deleted the fix/timestamp-config-cache-null-bug branch March 17, 2026 19:41
kodjo-anipah pushed a commit that referenced this pull request Mar 18, 2026
…l cache bug (7.0) (#25350)

* Fix per-message MongoDB query caused by TimeStampConfig null cache bug

The grace period cache in ProcessBufferProcessor uses null to mean both
"not yet loaded" and "loaded value was null." When the grace period is
null (normalization disabled, the default), the cache can never store
the result, so every message triggers a fresh MongoDB read.

- Add gracePeriodLoaded boolean flag to distinguish not-loaded from null
- Delete TimeStampConfig document when normalization is disabled instead
  of writing one with a null payload
- Add remove action to ConfigurationsStore for cluster config deletion

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Add changelog entry for PR #25344

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Update changelog wording

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Remove frontend changes from cherry-pick (backend-only backport)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>