[sigevents][kis] Unify features and queries in the same data stream by klacabane · Pull Request #270979 · elastic/kibana

klacabane · 2026-05-25T12:15:53Z

Summary

Ports features and queries from two separate mutable storage backends to a single unified, append-only Knowledge Indicators (KI) data stream, maintaining feature parity. The old FeatureClient and QueryClient are removed and replaced by a single KnowledgeIndicatorClient.

Model

One hidden data stream (.significant_events-knowledge_indicators) holds both feature and query documents, discriminated by type.
Identity is (stream.name, type, id). State is reconstructed by selecting the latest revision per group (two-stage INLINE STATS — MAX(@timestamp) then MAX(_id) tiebreak).
Writes are append-only: updates add a revision; deletes append a tombstone (deleted: true).
Reads filter on the latest revision (drop tombstoned / excluded / expired) after the per-group reduction.
Feature shape: uuid / status / last_seen / excluded_at (timestamp) are gone; identity is id, with updated_at (revision time) and excluded (boolean).
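
As an illustration of the read model described above, here is a minimal in-memory sketch of "latest revision per group, then filter". It is not the PR's implementation (which uses ES|QL `INLINE STATS`); the `Revision` interface and the `groupKey`, `isNewer`, and `currentState` helpers are hypothetical names, and only the field names come from the description.

```typescript
// Hypothetical document shape. Field names follow the PR description,
// but the real stored type may carry more fields.
interface Revision {
  _id: string;
  '@timestamp': string; // ISO 8601
  'stream.name': string;
  type: 'feature' | 'query';
  id: string;
  deleted?: boolean;
  excluded?: boolean;
  expires_at?: string;
}

// Identity is (stream.name, type, id).
function groupKey(doc: Revision): string {
  return JSON.stringify([doc['stream.name'], doc.type, doc.id]);
}

// True when `a` is a later revision than `b`: the newest @timestamp wins,
// and the larger _id breaks ties between equal timestamps.
function isNewer(a: Revision, b: Revision): boolean {
  const ta = Date.parse(a['@timestamp']);
  const tb = Date.parse(b['@timestamp']);
  return ta !== tb ? ta > tb : a._id > b._id;
}

// Reduce to the latest revision per group first, then drop
// tombstoned, excluded, and expired documents.
function currentState(revisions: Revision[], now: Date = new Date()): Revision[] {
  const latest = new Map<string, Revision>();
  for (const doc of revisions) {
    const key = groupKey(doc);
    const seen = latest.get(key);
    if (!seen || isNewer(doc, seen)) latest.set(key, doc);
  }
  return Array.from(latest.values()).filter(
    (doc) =>
      !doc.deleted &&
      !doc.excluded &&
      (doc.expires_at === undefined || Date.parse(doc.expires_at) > now.getTime())
  );
}
```

The order matters: filtering before the reduction would let an older live revision resurface after its tombstone was dropped.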

Testing

Enable Streams: significant events (Advanced Settings) in a space with an Enterprise license; confirm the Significant events tab loads.
Identify features on a stream → features appear; exclude one (row action) → moves to the Excluded tab; restore it → it disappears (re-derived on the next extraction); delete one → gone.
Generate/persist queries → they appear; high-severity non-STATS queries create backing alerting rules; promote/demote/delete behave accordingly.
Multi-stream discovery view: with features sharing an id across streams, confirm they render as distinct rows, applying/removing filters produces no duplicate/ghost rows, and bulk exclude/restore/delete/promote target the correct stream.
Search KIs (keyword + semantic) returns only latest revisions, no duplicates, and respects the active/excluded filter.

github-actions · 2026-05-27T14:30:41Z

@klacabane, this PR increases one or more page-load bundle sizes by 15% or more:

Plugin	Before (bytes)	After (bytes)	Change
`agentBuilderPlatform`	8,737	15,544	+77.9%
`globalSearchBar`	26,122	31,212	+19.5%

Large bundle size increases can affect page load performance. Consider whether dependencies can be lazy-loaded or code split to reduce the bundle.

See the bundle optimization guide for tips.


-    excluded_at: z.string().optional(),
    run_id: z.string().optional(),
+    excluded: z.boolean().optional(),
+    updated_at: z.string().optional(),


    run_id: z.string().optional(),
+    excluded: z.boolean().optional(),
+    updated_at: z.string().optional(),
+    expires_at: z.string().optional(),


+
+const featureBulkOperationSchema = z.union([
+  z.object({ index: z.object({ feature: featureSchema }) }),
+  z.object({ delete: z.object({ id: z.string() }) }),
+  z.object({ exclude: z.object({ id: z.string() }) }),
+  z.object({ restore: z.object({ id: z.string() }) }),


+const featureBulkAcrossStreamsOperationSchema = z.union([
+  z.object({ delete: z.object({ id: z.string(), stream_name: z.string() }) }),
+  z.object({ exclude: z.object({ id: z.string(), stream_name: z.string() }) }),
+  z.object({ restore: z.object({ id: z.string(), stream_name: z.string() }) }),
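
The across-streams union differs from the per-stream one only by carrying `stream_name` on each operation. The sketch below shows one way such operations could be narrowed with `in` and regrouped per stream, so that two features sharing an `id` in different streams are never confused. The type names and `groupByStream` are hypothetical, plain TypeScript types stand in for the zod schemas, and whether the PR routes operations this way is an assumption.

```typescript
// Plain TypeScript stand-ins for the zod unions in the diff (assumed shapes).
type PerStreamOperation =
  | { delete: { id: string } }
  | { exclude: { id: string } }
  | { restore: { id: string } };

type AcrossStreamsOperation =
  | { delete: { id: string; stream_name: string } }
  | { exclude: { id: string; stream_name: string } }
  | { restore: { id: string; stream_name: string } };

// Regroup across-streams operations into per-stream batches.
function groupByStream(ops: AcrossStreamsOperation[]): Map<string, PerStreamOperation[]> {
  const grouped = new Map<string, PerStreamOperation[]>();
  const add = (stream: string, op: PerStreamOperation): void => {
    const list = grouped.get(stream) ?? [];
    list.push(op);
    grouped.set(stream, list);
  };
  for (const op of ops) {
    // The `in` checks narrow the union to the matching member.
    if ('delete' in op) add(op.delete.stream_name, { delete: { id: op.delete.id } });
    else if ('exclude' in op) add(op.exclude.stream_name, { exclude: { id: op.exclude.id } });
    else add(op.restore.stream_name, { restore: { id: op.restore.id } });
  }
  return grouped;
}
```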



  },
  params: z.object({
-    path: z.object({ name: z.string(), uuid: z.string() }),
+    path: z.object({ name: z.string(), id: z.string() }),



kibanamachine · 2026-05-29T15:56:15Z

💔 Build Failed

Buildkite Build
Commit: 2eb75f8
Build duration: 32 mins

Failed CI Steps

Check Types

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id	before	after	diff
`streamsApp`	1973	1974	+1

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id	before	after	diff
`datasetQuality`	541.1KB	541.1KB	-60.0B
`streamsApp`	2.1MB	2.1MB	+415.0B
total			+355.0B

crespocarlos · 2026-06-01T09:24:14Z

+    const deletableOps: Array<Extract<KIBulkOperation, { delete: unknown }>> = [];
+    let deleteSkipped = 0;
+    for (const op of deleteOps) {
+      if (deleteLatest.find((doc) => doc.id === op.delete.id)) {


Suggested change

-      if (deleteLatest.find((doc) => doc.id === op.delete.id)) {
+      if (deleteLatest.some((doc) => doc.id === op.delete.id)) {
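
For context on the suggestion: `Array.prototype.find` returns the matching element (or `undefined`), while `some` returns a boolean, which states the intent of an existence check directly. A minimal sketch of the loop with `some`, where `Doc` and `splitDeletable` are hypothetical names and the real loop operates on bulk operations rather than bare ids:

```typescript
interface Doc {
  id: string;
}

// Split requested ids into those that have a latest revision and those
// that do not; the latter are counted as skipped.
function splitDeletable(ids: string[], latest: Doc[]): { deletable: string[]; skipped: number } {
  const deletable: string[] = [];
  let skipped = 0;
  for (const id of ids) {
    if (latest.some((doc) => doc.id === id)) {
      deletable.push(id);
    } else {
      skipped++;
    }
  }
  return { deletable, skipped };
}
```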

crespocarlos · 2026-06-01T09:26:16Z

+      const latest = docById.get(key);
+      if (
+        !latest ||
+        new Date(latest['@timestamp']).getTime() !== new Date(source['@timestamp']).getTime()


q: We don't need a tiebreaker here by _id, right?

crespocarlos · 2026-06-01T09:35:07Z

+    query = withSort(query, sort);
+    // Cap at REVISION_SIZE_LIMIT regardless of the requested limit so a large
+    // caller-supplied value can't fetch an unbounded result set.
+    query = query.keep('_source').limit(Math.min(limit, REVISION_SIZE_LIMIT));


q: should we log a warning in case limit happens to be greater than REVISION_SIZE_LIMIT?
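
One possible shape for the warning raised in this question, as a sketch only: the `WarnLogger` interface and `clampLimit` helper are hypothetical, and the value of `REVISION_SIZE_LIMIT` below is a placeholder because the PR excerpt does not state it.

```typescript
// Placeholder value; the real constant is not shown in the excerpt.
const REVISION_SIZE_LIMIT = 10000;

// Minimal logger contract assumed for this sketch.
interface WarnLogger {
  warn(message: string): void;
}

// Cap the caller-supplied limit and report when capping happened.
function clampLimit(limit: number, logger: WarnLogger): number {
  if (limit > REVISION_SIZE_LIMIT) {
    logger.warn(
      `Requested limit ${limit} exceeds REVISION_SIZE_LIMIT (${REVISION_SIZE_LIMIT}); capping.`
    );
    return REVISION_SIZE_LIMIT;
  }
  return limit;
}
```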

crespocarlos · 2026-06-01T09:59:58Z

+      const docs: StoredKnowledgeIndicator[] = [];
+      for (const op of operations) {
+        if ('index' in op) {
+          if ('feature' in op.index) {


nit: we're doing this again a few lines above

crespocarlos · 2026-06-01T10:01:23Z

+    private readonly ttlDays: number
+  ) {}
+
+  async bulk(


This function seems to be doing a lot of things. I wonder if its content should be broken into smaller functions.

crespocarlos · 2026-06-01T10:07:41Z

+  query = withTimeRange(query, options);
+  if (where) query = withWhere(query, where);
+  query = pickLatestPerGroup(query, groupBy);
+  const sortArgs: ComposerSortShorthand[] = sort ?? [['@timestamp', 'DESC']];


is there a constant for @timestamp?

crespocarlos · 2026-06-01T10:08:16Z

-  query = query.keep('_source');
+  const query = latestSourceFrom(index, space).where`${esql.col(idField)} == ${esql.str(idValue)}`
+    .sort(['@timestamp', 'ASC'])
+    .keep('_source');


Is there a constant for _source?

crespocarlos · 2026-06-01T10:12:57Z

+    const wildcard = (field: string, boost?: number) => ({
+      wildcard: {
+        [field]: {
+          value: `*${escaped}*`,


I guess a leading * will perform a full term scan, which could have performance implications.
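
The excerpt references an `escaped` value without showing how it is produced. As a hedged sketch, the helper below escapes the characters that the Elasticsearch wildcard query treats specially (`*`, `?`, and the escape character `\`) so user input cannot introduce extra wildcards. `escapeWildcard` and `containsClause` are hypothetical names, and this does not remove the cost of the leading `*` noted in the comment.

```typescript
// Escape wildcard metacharacters so the term is matched literally.
function escapeWildcard(input: string): string {
  return input.replace(/[\\*?]/g, '\\$&');
}

// Build a "contains" wildcard clause for one field, with an optional boost.
function containsClause(field: string, term: string, boost?: number) {
  return {
    wildcard: {
      [field]: {
        value: `*${escapeWildcard(term)}*`,
        ...(boost !== undefined ? { boost } : {}),
      },
    },
  };
}
```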

crespocarlos · 2026-06-01T10:15:28Z

+}
+
+function computeExpiresAt(timestamp: string, ttlDays: number): string {
+  return new Date(new Date(timestamp).getTime() + ttlDays * 24 * 60 * 60 * 1000).toISOString();


nit: Could we extract this 24 * 60 * 60 * 1000 into a constant?
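
A minimal sketch of the extraction suggested here. The constant name `MS_PER_DAY` is an assumption; the function body otherwise matches the excerpt.

```typescript
// Number of milliseconds in one day.
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Add ttlDays to an ISO timestamp and return the result as an ISO string.
function computeExpiresAt(timestamp: string, ttlDays: number): string {
  return new Date(new Date(timestamp).getTime() + ttlDays * MS_PER_DAY).toISOString();
}
```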

crespocarlos · 2026-06-01T10:16:19Z

+  const parts: string[] = [`Stream: ${streamName}`];
+  if (feature.title) parts.push(`Title: ${feature.title}`);
+  if (feature.description) parts.push(`Description: ${feature.description}`);
+  if (feature.type) parts.push(`Type: ${feature.type}`);
+  if (feature.subtype) parts.push(`Subtype: ${feature.subtype}`);


nit: we could have constants for these texts ("Stream", "Title", etc.).
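
One way the labels could be pulled into constants, sketched under assumptions: `LABELS`, `FeatureText`, and `buildParts` are hypothetical names, and only the five fields visible in the excerpt are covered.

```typescript
// Label text for each part of the generated feature description.
const LABELS = {
  stream: 'Stream',
  title: 'Title',
  description: 'Description',
  type: 'Type',
  subtype: 'Subtype',
} as const;

// Subset of the feature shape used by the excerpt (assumed).
interface FeatureText {
  title?: string;
  description?: string;
  type?: string;
  subtype?: string;
}

// Build "Label: value" parts, skipping empty or missing fields.
function buildParts(streamName: string, feature: FeatureText): string[] {
  const parts: string[] = [`${LABELS.stream}: ${streamName}`];
  const fields = ['title', 'description', 'type', 'subtype'] as const;
  for (const field of fields) {
    const value = feature[field];
    if (value) parts.push(`${LABELS[field]}: ${value}`);
  }
  return parts;
}
```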

klacabane · 2026-06-01T11:24:56Z

Closing, as this will be split into two smaller changes.

## Summary

Additive foundation for #270979: introduces the **unified Knowledge Indicators (KI) data stream** as a new storage backend for features and queries, without touching any existing code. Existing `FeatureClient` and `QueryClient` paths remain fully active.

**Model**

- One hidden data stream (`.significant_events-knowledge_indicators`) will hold both `feature` and `query` documents, discriminated by `type`.
- Identity is `(stream.name, type, id)`. State is reconstructed by selecting the **latest revision per group** (two-stage `INLINE STATS`: `MAX(@timestamp)`, then `MAX(_id)` as a tiebreak).
- Writes are append-only: updates add a revision; deletes append a tombstone (`deleted: true`).
- Reads filter on the latest revision (drop tombstoned / excluded / expired) after the per-group reduction.
- Supports **keyword + semantic hybrid search** across indicators.

The index template is installed at Kibana startup so the data stream is ready when callers are migrated over.

---

## Testing

This PR has **no behavior change**. All existing routes continue to use `FeatureClient` and `QueryClient`.

- Verify the index template was installed at startup: `GET /_index_template/.significant_events-knowledge_indicators`

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>


klacabane and others added 8 commits May 25, 2026 12:11

initial commit

6fa49a0

feature exclusion

0523ac3

refactor and bugfixes

b3b422a

Changes from node scripts/eslint_all_files --no-cache --fix

e3bf1d4

remove space scoping

45d460e

fixes

5ac605f

split the client

a3429cd

remove cols

c512e7b

klacabane mentioned this pull request May 25, 2026

[sigevents][kis] Unify features and queries in the same data stream #270450

Closed

klacabane added 6 commits May 25, 2026 12:31

fix registerFeatureFlags

4f8957a

add expires_at property

5cb45b4

fixes

2ed8437

add space to the revision reader

9d1bf6e

bug fixes

e01f392

Merge branch 'main' into sigevents_unified-ki-datastream-v2

188f879

klacabane and others added 9 commits May 27, 2026 14:32

Revert "Merge branch 'main' into sigevents_unified-ki-datastream-v2"

61bec1b

This reverts commit 188f879.

Merge branch 'main' into sigevents_unified-ki-datastream-v2

266bdf2

Merge branch 'main' into sigevents_unified-ki-datastream-v2

b64bdcc

Changes from node scripts/check

15c626b

lint

3c0ebb3

fix comments and space awareness

c42b9ac

fix tests

8c544e3

fixes

4053eb2

ui

9b9d1d3

klacabane added 7 commits May 28, 2026 15:57

remove unknown

6fcdb3d

tighter types

173ad77

fix

ad1dbb9

fixes

678e1ca

align with previous behavior

0cd8fc5

more fixes

349f293

tool regression

f1beaf3

klacabane marked this pull request as ready for review May 29, 2026 12:01

klacabane requested review from a team as code owners May 29, 2026 12:01

github-advanced-security AI found potential problems May 29, 2026


klacabane and others added 7 commits May 29, 2026 12:28

comment

48b1eb1

lint

1977c77

Changes from node scripts/eslint_all_files --no-cache --fix

4cc5c33

Merge branch 'main' into sigevents_unified-ki-datastream-v2

4158391

Merge branch 'main' into sigevents_unified-ki-datastream-v2

d30f95e

Changes from node scripts/eslint_all_files --no-cache --fix

4257875

fix tests and revert restore behavior

2eb75f8

crespocarlos reviewed Jun 1, 2026


klacabane mentioned this pull request Jun 1, 2026

[sigevents] KI unified infrastructure #272100

Merged

klacabane closed this Jun 1, 2026
