Skip to content

[Security Solution][Detection Engine] Fix enrichment tests for Entity Store V2 on MKI#271093

Open
abhishekbhatia1710 wants to merge 24 commits into
elastic:mainfrom
abhishekbhatia1710:ea-fix-detection-enrichment-v2-mki
Open

[Security Solution][Detection Engine] Fix enrichment tests for Entity Store V2 on MKI#271093
abhishekbhatia1710 wants to merge 24 commits into
elastic:mainfrom
abhishekbhatia1710:ea-fix-detection-enrichment-v2-mki

Conversation

@abhishekbhatia1710
Copy link
Copy Markdown
Contributor

@abhishekbhatia1710 abhishekbhatia1710 commented May 25, 2026

Summary

Closes https://github.com/elastic/security-team/issues/17408 (internal)

Detection engine enrichment tests were failing on MKI (`@serverlessQA`) because they relied on legacy V1 risk-score and asset-criticality indices absent when Entity Store V2 is enabled.

This PR fixes all affected enrichment tests by using Entity Store V2 to seed the required entities.

What changed

  • Removes `disable:entityAnalyticsEntityStoreV2` from the ESS and serverless base configs so Entity Store V2 is enabled in all test environments
  • Introduces `EntityStoreV2EnrichmentSetup` helper (`entity_store_v2_enrichment_setup.ts`) that installs Entity Store V2 engines and seeds host/user entities with the required risk score and asset criticality data
  • Updates enrichment `before`/`after` hooks across all detection rule type tests: EQL, ES|QL, Query, Indicator Match, New Terms, Threshold (both standard and alert suppression variants)
  • Entity creation uses `refresh: wait_for` internally so seeded entities are immediately searchable when `setup` returns

Approach

On all environments (ESS, serverless, MKI), entities are seeded directly into Entity Store V2 using the correct EUID:

  • Auditbeat host records carry `host.id` -> id-based EUID (`host:<host.id>`) -> `entity.id` must be supplied explicitly
  • Dynamically created docs and user records lack an id field -> name-based EUID -> `entity.id` is auto-generated

Supersedes the skip-based approach in #270974 with a proper fix.
Builds on the approach started in #270939 by @denar50.

@abhishekbhatia1710 abhishekbhatia1710 marked this pull request as ready for review May 25, 2026 14:13
@abhishekbhatia1710 abhishekbhatia1710 requested a review from a team as a code owner May 25, 2026 14:13
@abhishekbhatia1710 abhishekbhatia1710 self-assigned this May 25, 2026
@abhishekbhatia1710 abhishekbhatia1710 added release_note:skip Skip the PR/issue when compiling release notes backport:skip This PR does not require backporting Team:Entity Analytics Security Entity Analytics Team labels May 25, 2026
@infra-vault-gh-plugin-prod
Copy link
Copy Markdown

Pinging @elastic/security-entity-analytics (Team:Entity Analytics)

@abhishekbhatia1710 abhishekbhatia1710 force-pushed the ea-fix-detection-enrichment-v2-mki branch from 71ba1f9 to 18dfcbc Compare May 26, 2026 08:04
@ymao1
Copy link
Copy Markdown
Contributor

ymao1 commented May 26, 2026

I think we should remove the check for V1 vs V2, remove the feature flag override and just update the tests for V2 entity store, WDTY?

@abhishekbhatia1710
Copy link
Copy Markdown
Contributor Author

abhishekbhatia1710 commented May 27, 2026

Done @ymao1 ! Removed the entityStoreV2Installed boolean and all V1 fallback branches across all the test files, and changed setup() to return void instead of boolean. Tests now always go through Entity Store V2.

@ymao1
Copy link
Copy Markdown
Contributor

ymao1 commented May 27, 2026

I think we can try to remove the 'disable:entityAnalyticsEntityStoreV2 in these files

x-pack/solutions/security/test/security_solution_api_integration/config/ess/config.base.ts
x-pack/solutions/security/test/security_solution_api_integration/config/serverless/config.base.ts

and then in the test files, where a test is tagged @skipInServerlessMKI, we should be able to remove that as well.

@abhishekbhatia1710
Copy link
Copy Markdown
Contributor Author

The failures in the current build are in the `with host risk index` / `alerts enrichment` describe blocks across EQL, ES|QL, indicator match, new terms, query, and threshold — exactly the blocks where we removed the V1 fallback. On `@ess` and `@serverless` (non-MKI) environments, `entityStoreV2.setup()` is throwing because Entity Store V2 isn't available there.

Worth noting: builds 447589 and 447911 (original code, with the V1 fallback still in place) also failed, so these tests were already broken on ESS even before the refactor.

A few options for how to proceed — happy to go whichever direction you prefer:

  1. Restrict to @serverlessQA only — remove the @ess @serverless tags from the enrichment describe blocks (or add @skipInStateful) since they require Entity Store V2 which isn't available in those environments
  2. Soft skip instead of throw — restore a boolean return from setup() and call this.skip() inside before when V2 isn't available, so the tests are skipped rather than failing on ESS
  3. Get V2 enabled on ESS CI — if V2 should be available on ESS, fix the environment config upstream

WDYT?

@ymao1
Copy link
Copy Markdown
Contributor

ymao1 commented May 28, 2026

On @ess and @serverless (non-MKI) environments, entityStoreV2.setup() is throwing because Entity Store V2 isn't available there.

This should be resolved when we remove the global feature flag as suggested here #271093 (comment) right? We were deliberately disabling the V2 entity store in these configs before but now that we're relying on entity store V2, we should remove that

…rate indicator_match_alert_suppression enrichment tests to V2
@abhishekbhatia1710 abhishekbhatia1710 requested a review from a team as a code owner May 28, 2026 13:10
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 28, 2026

Actionable comments posted: 0

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 28, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 6a372222-9600-4462-b4cb-0322d4f97fa2

📥 Commits

Reviewing files that changed from the base of the PR and between 14a69c3 and e8b1ad7.

📒 Files selected for processing (14)
  • x-pack/solutions/security/test/security_solution_api_integration/config/ess/config.base.ts
  • x-pack/solutions/security/test/security_solution_api_integration/config/serverless/config.base.ts
  • x-pack/solutions/security/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/entity_store_v2_enrichment_setup.ts
  • x-pack/solutions/security/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/eql/trial_license_complete_tier/eql.ts
  • x-pack/solutions/security/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/eql/trial_license_complete_tier/eql_alert_suppression.ts
  • x-pack/solutions/security/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/esql/trial_license_complete_tier/esql.ts
  • x-pack/solutions/security/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/esql/trial_license_complete_tier/esql_suppression.ts
  • x-pack/solutions/security/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/indicator_match/trial_license_complete_tier/indicator_match.ts
  • x-pack/solutions/security/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/indicator_match/trial_license_complete_tier/indicator_match_alert_suppression.ts
  • x-pack/solutions/security/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/new_terms/trial_license_complete_tier/new_terms.ts
  • x-pack/solutions/security/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/new_terms/trial_license_complete_tier/new_terms_alert_suppression.ts
  • x-pack/solutions/security/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/query/trial_license_complete_tier/custom_query.ts
  • x-pack/solutions/security/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/threshold/trial_license_complete_tier/threshold.ts
  • x-pack/solutions/security/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/threshold/trial_license_complete_tier/threshold_alert_suppression.ts
💤 Files with no reviewable changes (2)
  • x-pack/solutions/security/test/security_solution_api_integration/config/ess/config.base.ts
  • x-pack/solutions/security/test/security_solution_api_integration/config/serverless/config.base.ts

📝 Walkthrough

Walkthrough

This PR removes the experimental feature gate disable:entityAnalyticsEntityStoreV2 from ESS and serverless test configurations, introduces a new EntityStoreV2EnrichmentSetup test helper factory for deterministic entity store provisioning, and migrates 11 detection rule test suites from Elasticsearch archive-based enrichment fixtures to dynamic host/user entity seeding. The helper provides typed configuration interfaces for host and user entity payloads, implements install/seed/poll/uninstall lifecycle management with error handling, and enables test suites to assert enriched alert fields (host risk levels, calculated scores, asset criticality) from consistently provisioned entity store data instead of loaded fixtures.

Suggested labels

Team:Threat Hunting, reviewer:coderabbit

🚥 Pre-merge checks | ✅ 2
✅ Passed checks (2 passed)
Check name Status Explanation
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • 🛠️ Update Documentation

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@rylnd rylnd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a few comments about the new v2 test service.

Also: I don't see any code related to the "fallback" behavior outlined in the PR description. Am I missing something?

@abhishekbhatia1710 abhishekbhatia1710 requested a review from a team as a code owner May 29, 2026 08:03
@abhishekbhatia1710 abhishekbhatia1710 requested a review from ymao1 May 29, 2026 08:03
@abhishekbhatia1710 abhishekbhatia1710 requested a review from rylnd May 29, 2026 08:33
@abhishekbhatia1710
Copy link
Copy Markdown
Contributor Author

abhishekbhatia1710 commented May 29, 2026

Also: I don't see any code related to the "fallback" behavior outlined in the PR description. Am I missing something?

Hi @rylnd, thanks for your review! The original PR description mentioned a fallback to legacy esArchiver archives when V2 was disabled, but that code no longer exists. I updated the PR description to remove that reference. The current implementation always uses Entity Store V2 directly (no fallback). The CI failures we were seeing turned out to be caused by missing entity.id on user entities (the create API requires it explicitly, it cannot auto-derive the EUID from user.name alone), which is now fixed.

@kibanamachine
Copy link
Copy Markdown
Contributor

kibanamachine commented May 29, 2026

💔 Build Failed

Failed CI Steps

Test Failures

  • [job] [logs] FTR Configs #205 / EQL execution logic API @ess @serverless Alert Suppression for EQL rules non-sequence queries alert enrichment suppressed alerts are enriched with criticality_level
  • [job] [logs] FTR Configs #96 / EQL execution logic API @ess @serverless Alert Suppression for EQL rules non-sequence queries alert enrichment suppressed alerts are enriched with criticality_level
  • [job] [logs] FTR Configs #205 / EQL execution logic API @ess @serverless Alert Suppression for EQL rules non-sequence queries alert enrichment suppressed alerts are enriched with criticality_level
  • [job] [logs] FTR Configs #96 / EQL execution logic API @ess @serverless Alert Suppression for EQL rules non-sequence queries alert enrichment suppressed alerts are enriched with criticality_level
  • [job] [logs] FTR Configs #58 / Indicator match execution logic API @ess @serverless @serverlessQA Threat match type rules with asset criticality "before all" hook for "should be enriched alert with criticality_level"
  • [job] [logs] FTR Configs #204 / Indicator match execution logic API @ess @serverless @serverlessQA Threat match type rules with asset criticality "before all" hook for "should be enriched alert with criticality_level"
  • [job] [logs] FTR Configs #58 / Indicator match execution logic API @ess @serverless @serverlessQA Threat match type rules with asset criticality "before all" hook for "should be enriched alert with criticality_level"
  • [job] [logs] FTR Configs #204 / Indicator match execution logic API @ess @serverless @serverlessQA Threat match type rules with asset criticality "before all" hook for "should be enriched alert with criticality_level"
  • [job] [logs] FTR Configs #45 / Machine learning rule execution logic API @ess @serverless Machine learning type rules with asset criticality should be enriched alert with criticality_level
  • [job] [logs] FTR Configs #170 / Machine learning rule execution logic API @ess @serverless Machine learning type rules with asset criticality should be enriched alert with criticality_level
  • [job] [logs] FTR Configs #45 / Machine learning rule execution logic API @ess @serverless Machine learning type rules with asset criticality should be enriched alert with criticality_level
  • [job] [logs] FTR Configs #170 / Machine learning rule execution logic API @ess @serverless Machine learning type rules with asset criticality should be enriched alert with criticality_level
  • [job] [logs] FTR Configs #23 / New terms rule execution logic API @ess @serverless @serverlessQA New terms type rules with asset criticality should be enriched alert with criticality_level
  • [job] [logs] FTR Configs #68 / New terms rule execution logic API @ess @serverless @serverlessQA New terms type rules with asset criticality should be enriched alert with criticality_level
  • [job] [logs] FTR Configs #68 / New terms rule execution logic API @ess @serverless @serverlessQA New terms type rules with asset criticality should be enriched alert with criticality_level
  • [job] [logs] FTR Configs #23 / New terms rule execution logic API @ess @serverless @serverlessQA New terms type rules with asset criticality should be enriched alert with criticality_level
  • [job] [logs] FTR Configs #26 / Query rule execution logic API @ess @serverless @serverlessQA Query type rules with host and user risk indices "before all" hook for "should have host and user risk score fields"
  • [job] [logs] FTR Configs #28 / Query rule execution logic API @ess @serverless @serverlessQA Query type rules with host and user risk indices "before all" hook for "should have host and user risk score fields"
  • [job] [logs] FTR Configs #26 / Query rule execution logic API @ess @serverless @serverlessQA Query type rules with host and user risk indices "before all" hook for "should have host and user risk score fields"
  • [job] [logs] FTR Configs #28 / Query rule execution logic API @ess @serverless @serverlessQA Query type rules with host and user risk indices "before all" hook for "should have host and user risk score fields"

Metrics [docs]

✅ unchanged

History

cc @abhishekbhatia1710

Copy link
Copy Markdown
Contributor

@rylnd rylnd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once the current CI is green!

abhishekbhatia1710 and others added 2 commits June 1, 2026 13:54
… race condition

Two root causes addressed for continuous CI failures in build 450343:

1. **"before all" hook failures** (Indicator Match elastic#58/elastic#204, Query Rule elastic#26/elastic#28):
   The setup was only returning `false` for HTTP 404/503 from the install API.
   Any other non-success status (e.g. 403, 400) caused it to throw, crashing
   the `before all` hook in non-MKI serverless environments.
   Fix: return `false` for ANY non-200/201 status code.

2. **Assertion failures due to race condition** (ML elastic#45/elastic#170, EQL suppression
   elastic#96/elastic#205, New Terms elastic#23/elastic#68): The previous commit changed `waitForWithTimeout`
   to accept `running || stopped`. This allowed entity seeding to happen while
   the initial engine scan was still `running`, causing the engine to overwrite
   test entities before the detection rule executed.
   Fix: revert to waiting strictly for `stopped`, and catch timeouts gracefully
   (returning `false` to skip rather than crashing).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport:skip This PR does not require backporting release_note:skip Skip the PR/issue when compiling release notes Team:Entity Analytics Security Entity Analytics Team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants