Commit 692454d
authored
[Entity Analytics][Leads generation] Lead generation EUID fixes (elastic#268666)
## Summary
Closes elastic/security-team#17246.
Four places inside the lead generation observation modules used the
entity display name as a unique identifier when querying Elasticsearch
and when joining results back to entities in memory. The display name is
not unique. The Entity Store EUID (entity.id) is.
In any deployment where two entities can share a name but have different
EUIDs, for example two local users called alice on different hosts, or
two tenants in different namespaces, the old code would either collapse
them into one bucket or fail to find a match for one of them. The test
fixtures all used unique names so nobody noticed.
The risk score path looked like it was working but only by accident. The
V2 risk score writer happens to put the EUID into host.name and
user.name on the time series index, and the lead generation lookup
happened to double prefix its key, so the two buggy strings matched.
Under the legacy writer, when entityAnalyticsEntityStoreV2 is off, the
same lookup silently returned nothing.
This change picks one rule and applies it everywhere. The EUID is the
identity. Queries filter on it, aggregations bucket by it, and the in
memory join key is just the EUID.
## What changed
1. LeadEntity now carries id as a first class field next to type and
name. entityRecordToLeadEntity reads it from entity.id. If a record
arrives without an EUID it is dropped at the boundary instead of being
downgraded to an unknown entity that would later collide with every
other unknown.
2. Behavioral analysis (alerts query) now computes the EUID as a
Painless runtime keyword field via euid.painless.getEuidRuntimeMapping
and filters and aggregates on that field. It no longer touches user.name
or host.name for identity. This mirrors the pattern already used in
calculate_esql_risk_scores.
3. Temporal state (history snapshot query) now filters and aggregates on
entity.id directly. The entity store history snapshot index already
carries entity.id, so the previous indirection through user.name and
host.name was not just wrong but also unnecessary.
4. Risk score (time series query) now filters on risk.id_field equal to
entity.id and aggregates by risk.id_value. This is the only filter that
authoritatively selects V2 shaped documents regardless of which writer
last ran. Legacy shaped documents are excluded from the lookback window,
which means the trend math is correct under both flag states and the
transitional data hazard goes away.
5. entityToKey now returns the EUID. The four call sites that used to
build inline strings like type colon name or type colon euid have been
updated to call entityToKey instead. The now dead getEntityId helper has
been removed along with its re export.
The LeadEntity type used to expose only type and name. The EUID lived
three property accesses deep inside record.entity.id. So the path of
least resistance for every new module was entity.name. The retrofit
helper getEntityId was added when the risk score module hit the EUID
boundary, but it was only adopted by that one module. The other two
modules carried on with entity.name. The tests confirmed the buggy code
rather than the requirement, because every fixture used unique names. By
adding id to LeadEntity as a peer of name, the natural ergonomic field
is now also the correct one.
## Automated checks already run
- [x] node scripts/jest
x-pack/solutions/security/plugins/security_solution/server/lib/entity_analytics/lead_generation
passes (17 suites, 163 tests).
- [x] node scripts/jest on
x-pack/solutions/security/plugins/security_solution/server/lib/entity_analytics/risk_score/risk_score_data_client.test.ts
passes (13 tests).
- [x] node scripts/eslint with fix on the diff reports no errors.
- [x] node scripts/type_check on the security solution tsconfig exits
clean.
New test coverage added for the failure modes the old tests masked:
- Two entities sharing a name but with distinct EUIDs receive separate
alert summaries (behavioral analysis).
- Same shape fixture for the temporal module proves only the correct
EUID is flagged for escalation, not its same named sibling.
- The data client filters on risk.id_field and excludes legacy shaped
documents from the trend.
- The risk score module produces current score observations even when no
historical buckets come back, which is the legacy writer environment.
## How to test this manually
These steps assume a local Kibana instance running against an
Elasticsearch with some Entity Store V2 data already present. If you
only have a tiny dataset, Test 1 alone is fine. Tests 2 and 3 are the
ones that actually exercise the bug class this PR fixes, so please do at
least one of them.
### Setup
1. Pull this branch and run yarn kbn bootstrap.
2. In your kibana.dev.yml make sure both flags are on:
```
xpack.securitySolution.enableExperimental: ['leadGenerationEnabled',
'entityAnalyticsEntityStoreV2']
```
3. Start Kibana and Elasticsearch and log in as a user with Security
Solution access.
4. Confirm the Entity Store has at least a handful of user and host
entities. The entity store list API or the entities UI is fine for this.
### Test 1: regression smoke test
Goal: confirm the pipeline still produces sensible leads after the
refactor.
1. Trigger lead generation. The simplest way is the API:
```
POST kbn:/internal/entity_analytics/leads/generate
```
This can be run from Dev Tools.
2. List the resulting leads:
```
GET kbn:/internal/entity_analytics/leads
```
3. Pick a few leads and confirm each one has at least one observation
attached and the observations look plausible (risk score level matches
the current risk score on the entity, alert counts look right for the
lookback, and so on).
4. Watch the Kibana server logs while the pipeline runs. There should be
no new warnings or errors that did not exist on main.
Pass criteria: leads come back, observations are populated, no new log
noise.
### Test 2: composite identity, the core scenario this PR fixes
Goal: prove that two entities that share a display name but have
different EUIDs no longer collide.
1. Pick or create two Entity Store records of the same type where
entity.name is identical but entity.id differs. For example two user
records both with entity.name set to alice but with entity.id set to
user:alice@corp.com and user:alice@partner.com respectively. The exact
way to create these depends on how your local cluster is wired. If you
have entity sources running, ingest two source documents that resolve to
different EUIDs but happen to share the display name. If you do not, you
can also write the two records directly into the entity store index for
the purposes of this test.
2. Make sure the two entities differ in at least one observable way. For
example put two alerts into the alerts index pointing at one of the two
EUIDs (use the runtime field resolution in mind: the alert needs
user.name plus enough identity fields to resolve to a specific EUID),
and leave the other entity with no alerts.
3. Trigger lead generation as in Test 1.
4. Look at the observations for both entities, either via the lead
documents or directly in the lead generation observation index.
Pass criteria: each entity gets its own observations attributed to its
own EUID. The entity with alerts gets the alert based observations. The
entity without alerts does not. Before this PR both entities would
either share one bucket or one of them would lose its own observations.
Negative version of this test if you want to verify with the previous
code: check out main, repeat the same setup, and you should see one
entity getting credit for the other entity's alerts, or both getting the
same combined observations.
### Test 3: risk score module under entityAnalyticsEntityStoreV2 off
Goal: prove that the risk score path no longer silently returns zero
data when the legacy writer is the one populating the time series index.
1. Turn the V2 flag off:
```
xpack.securitySolution.enableExperimental: ['leadGenerationEnabled']
```
leadGenerationEnabled stays on, entityAnalyticsEntityStoreV2 comes off.
2. Restart Kibana. On startup confirm in the logs that the legacy risk
scoring task was registered, not the V2 maintainer.
3. Make sure at least one entity has a current risk score on its entity
record. The current score lives on the entity record itself, not in the
time series index, so it should be there regardless of which writer is
active.
4. Trigger lead generation.
Pass criteria: current score observations such as high_risk_score,
moderate_risk_score, and privileged_high_risk are produced for entities
that qualify. Historical escalation observations (risk_escalation_24h,
risk_escalation_7d, risk_escalation_90d) are not produced, because there
are no V2 shaped time series documents in the lookback. No exceptions in
the logs. Specifically no warning about Failed to fetch time-series risk
scores.
Before this PR the same query would still run but match zero documents
(legacy shaped docs have risk.id_field set to host.name or user.name),
so the trend math would silently miss every escalation. After this PR
the same outcome is intentional and graceful rather than accidental.
### Test 4: missing EUID record is dropped at the boundary
Goal: prove that entities without an EUID no longer leak through as
unknown.
1. Manually create or temporarily edit an entity store record so
entity.id is missing or empty. Keep entity.name set.
2. Trigger lead generation with that entity included in the candidate
set.
3. Set the Kibana logger to debug level for the LeadGeneration namespace
and watch the logs.
Pass criteria: you see a single debug line that mentions skipped N
without EUID, with N at least one. The pipeline does not produce any
observations for that entity, and it does not error out.
### If you only want to do a code level review
The four spots worth focusing on:
1. observation_modules/utils.ts: entityToKey should return entity.id
only. No string concatenation.
2. The three observation modules
(behavioral_analysis_module/{data_access,module}.ts,
temporal_state_module.ts, risk_score_module.ts) should all use
entityToKey(entity) for in memory lookups. There should be no occurrence
of $\{type\}.name in any query body.
3. risk_score/risk_score_data_client.ts:
getDailyAverageRiskScoreNormSeries should filter on risk.id_field equal
to entity.id and aggregate by risk.id_value, and its returned map should
be keyed by the EUID directly (no type prefix on top of an already type
prefixed string).
4. lead_generation/entity_conversion.ts: entityRecordToLeadEntity
returns undefined when entity.id is missing, and fetchCandidateEntities
filters those out before returning the batch.1 parent 97cc238 commit 692454d
19 files changed
Lines changed: 642 additions & 138 deletions
File tree
- x-pack/solutions/security/plugins/security_solution/server/lib/entity_analytics
- lead_generation
- engine
- observation_modules
- behavioral_analysis_module
- risk_score
Lines changed: 10 additions & 6 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
17 | 17 | | |
18 | 18 | | |
19 | 19 | | |
20 | | - | |
21 | | - | |
22 | | - | |
23 | | - | |
24 | | - | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
25 | 29 | | |
26 | 30 | | |
27 | 31 | | |
28 | 32 | | |
29 | 33 | | |
30 | 34 | | |
31 | | - | |
| 35 | + | |
32 | 36 | | |
33 | 37 | | |
34 | 38 | | |
| |||
Lines changed: 2 additions & 3 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
16 | 16 | | |
17 | 17 | | |
18 | 18 | | |
19 | | - | |
20 | 19 | | |
21 | 20 | | |
22 | 21 | | |
| |||
177 | 176 | | |
178 | 177 | | |
179 | 178 | | |
180 | | - | |
| 179 | + | |
181 | 180 | | |
182 | 181 | | |
183 | 182 | | |
| |||
297 | 296 | | |
298 | 297 | | |
299 | 298 | | |
300 | | - | |
| 299 | + | |
301 | 300 | | |
302 | 301 | | |
303 | 302 | | |
| |||
Lines changed: 10 additions & 6 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
34 | 34 | | |
35 | 35 | | |
36 | 36 | | |
37 | | - | |
38 | | - | |
39 | | - | |
40 | | - | |
41 | | - | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
42 | 46 | | |
43 | 47 | | |
44 | 48 | | |
45 | 49 | | |
46 | 50 | | |
47 | | - | |
| 51 | + | |
48 | 52 | | |
49 | 53 | | |
50 | 54 | | |
| |||
Lines changed: 1 addition & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
10 | 10 | | |
11 | 11 | | |
12 | 12 | | |
13 | | - | |
14 | 13 | | |
15 | 14 | | |
16 | 15 | | |
| |||
60 | 59 | | |
61 | 60 | | |
62 | 61 | | |
63 | | - | |
| 62 | + | |
64 | 63 | | |
65 | 64 | | |
66 | 65 | | |
| |||
Lines changed: 120 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
Lines changed: 18 additions & 8 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
18 | 18 | | |
19 | 19 | | |
20 | 20 | | |
21 | | - | |
22 | | - | |
23 | | - | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
24 | 26 | | |
25 | | - | |
| 27 | + | |
26 | 28 | | |
27 | 29 | | |
28 | 30 | | |
29 | 31 | | |
| 32 | + | |
| 33 | + | |
30 | 34 | | |
31 | 35 | | |
| 36 | + | |
32 | 37 | | |
33 | | - | |
| 38 | + | |
34 | 39 | | |
35 | 40 | | |
36 | 41 | | |
| |||
53 | 58 | | |
54 | 59 | | |
55 | 60 | | |
56 | | - | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
57 | 65 | | |
58 | 66 | | |
59 | | - | |
| 67 | + | |
60 | 68 | | |
61 | | - | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
62 | 72 | | |
63 | 73 | | |
64 | 74 | | |
| |||
0 commit comments