Commit 95f416c

mergify[bot], ebarlas, and vishaangelova authored
[Metricbeat] Add elasticsearch/security_stats metricset (#50674) (#50883)
Adds a per-node security_stats metricset that scrapes the new GET /_security/stats endpoint introduced in Elasticsearch 9.2. The first metric exposed is the Document Level Security cache (entries, memory, hits, misses, evictions, hit/miss latency), giving Stack Monitoring fleet-wide visibility into DLS cache health for spotting cache thrash, oversized working sets, and unhealthy hit/miss ratios.

Each event is enriched with node name, roles, and stack version via a single filter-path-scoped /_nodes call per scrape, shared across all per-node events emitted in that scrape. This logic lives on the module's MetricSet as the new NodeEnrichment helper so future per-node metricsets can reuse it. node.version is also declared at the module level alongside id, name, roles, master, and mlockall.

The shared metricbeat/docker-compose.yml elasticsearch service now runs with xpack.security.enabled=true plus an anonymous superuser, since /_security/stats is only registered when security is enabled. Anonymous superuser keeps the rest of the elasticsearch integration test suite working without threading credentials through every metricset's setup.

* docs: register security_stats metricset page in toc.yml

  mage update regenerates per-metricset markdown but doesn't touch the navigation toc.yml. Add the missing entry so docs-build can locate the security_stats page in the Elasticsearch module section.

* docs: replace "e.g." with "for example" per Vale style guide

  Elastic.Latinisms forbids Latin abbreviations in docs. Replace the lone "e.g." in the new node.version field description and regenerate the affected files.

* metricbeat/elasticsearch: clean up pre-existing lint issues

  Two pre-existing lint findings in elasticsearch_integration_test.go became blocking once this branch touched the file (golangci-lint runs with --whole-files). Both fixes are mechanical:

  - Replace math/rand with math/rand/v2 in randString and drop the redundant per-call seeded local Rand.
  - Add the comma-ok form to the version.number type assertion in getElasticsearchVersion so errcheck (with check-type-assertions) is satisfied.

* metricbeat/elasticsearch: dedupe node.version field declaration

  The new module-level node.version added for security_stats collided with a pre-existing node.version in the node metricset's local fields.yml, breaking `metricbeat export index-pattern` with "field <elasticsearch.node.version> is duplicated". Drop the metricset-local declaration in favor of the shared module-level one, which carries a richer description and is the right scope for a field emitted by multiple per-node metricsets.

* metricbeat: provision file-realm users for secured ES test stack

  Enabling xpack.security on the shared elasticsearch service for security_stats coverage broke Kibana boot: Kibana 9.x's interactive setup plugin holds preboot when ES has security on without ELASTICSEARCH_USERNAME, and the existing Kibana healthchecks (curl -u beats:testing, curl -u myelastic:changeme) started actually validating against ES instead of being silently ignored. Provision the named users that the existing healthchecks expect via elasticsearch-users useradd in the startup command, and give Kibana real ES credentials. Anonymous=superuser is preserved so the integration tests' credential-less HTTP probes keep working without threading credentials through every metricset's setup.

* x-pack/metricbeat: give kibana credentials to secured ES

  The previous commit enabled xpack.security on the shared Elasticsearch service and gave the OSS metricbeat kibana service real credentials, but x-pack/metricbeat hand-copies its kibana stanza (depends_on can't be extended) so the env didn't propagate. With no ELASTICSEARCH_USERNAME, Kibana entered interactive setup, the Dockerfile healthcheck (curl -u myelastic:changeme /api/stats) never reached green, and the proxy_dep busybox blocked all integration tests from starting. Mirror the env vars into the x-pack kibana stanza and note the duplication contract in a comment so future secured-ES changes are applied in both places.

* metricbeat/elasticsearch: gate security_stats on xpack feature flag

  CI exposed that the previous PR commits enabled xpack.security on the shared metricbeat docker-compose stack to exercise /_security/stats, but that change rippled wider than fits in this PR: Kibana boot, healthcheck users, OTel test framework default credentials, and the Python `get_version` helper all assume an open ES. Revert both metricbeat and x-pack/metricbeat docker-compose.yml to their upstream/main shape and address the underlying problem in the metricset itself.

  `security_stats.checkAvailability` now mirrors the pattern used by ccr and ml_job: a free in-memory version comparison short-circuits old clusters first, then a proactive `GET /_xpack` probe checks `features.security.enabled` so we can emit a specific operator-facing log message and avoid hitting an endpoint we know would return 400. A new `Security` field is added to the shared `elasticsearch.XPack` struct to support the check.

  The elasticsearch_integration_test.go suite skips security_stats unconditionally for now, with a TODO pointing at a focused follow-up PR that migrates the metricbeat compose stack to an x-pack-security-enabled posture (file-realm users, Kibana credentials, test fixture auth). At that point the skip becomes vacuous and the metricset is exercised against a real /_security/stats response.

---------

(cherry picked from commit 7119b64)

Co-authored-by: Elliot Barlas <elliotbarlas@gmail.com>
Co-authored-by: Visha Angelova <91186315+vishaangelova@users.noreply.github.com>
1 parent b1557cf commit 95f416c
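
The two lint fixes named in the commit message are not shown in this part of the diff. As a rough illustration only, they might look something like the sketch below. The function bodies, the `letters` constant, and the `versionNumber` name are assumptions made for this example, not the code from `elasticsearch_integration_test.go`.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand/v2"
)

// letters is an assumed alphabet for the illustration.
const letters = "abcdefghijklmnopqrstuvwxyz"

// randString returns n random lowercase letters. The top-level functions in
// math/rand/v2 are seeded automatically, so no per-call local Rand is needed.
func randString(n int) string {
	b := make([]byte, n)
	for i := range b {
		b[i] = letters[rand.IntN(len(letters))]
	}
	return string(b)
}

// versionNumber (hypothetical name) reads version.number from a decoded JSON
// body. The comma-ok form turns a missing or mistyped field into an error
// instead of a panic.
func versionNumber(body map[string]any) (string, error) {
	v, ok := body["version"].(map[string]any)
	if !ok {
		return "", errors.New("version is missing or not an object")
	}
	num, ok := v["number"].(string)
	if !ok {
		return "", errors.New("version.number is missing or not a string")
	}
	return num, nil
}

func main() {
	fmt.Println(randString(8))
	num, err := versionNumber(map[string]any{
		"version": map[string]any{"number": "9.2.0"},
	})
	fmt.Println(num, err)
}
```

The comma-ok form is what a linter configured to check type assertions looks for: a bare `x.(T)` panics on a mismatch, while `v, ok := x.(T)` lets the caller decide how to fail.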

23 files changed

Lines changed: 855 additions & 22 deletions
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+kind: feature
+summary: Add `elasticsearch/security_stats` metricset for the Elasticsearch module
+description: |
+  Adds a new `security_stats` metricset to the Elasticsearch module that
+  scrapes the per-node `GET /_security/stats` endpoint (available since
+  Elasticsearch 9.2). The first metric exposed is the Document Level Security
+  cache (entries, memory, hits, misses, evictions, hit/miss latency), enabling
+  fleet-wide observability of DLS cache health from Stack Monitoring. Each
+  event is enriched with node name, roles, and version via a single
+  filter-path-scoped /_nodes call per scrape so consumers can slice by node,
+  role, or stack version without joining across data streams.
+component: metricbeat
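
The "single filter-path-scoped /_nodes call" mentioned in the description can be pictured with a small sketch. The `nodeInfo` type, the `nodesPath` constant, and `parseNodes` are hypothetical names chosen for this example. The actual NodeEnrichment helper is not shown in this part of the diff, and the exact `filter_path` value it uses is an assumption here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// nodesPath is an assumed request path: filter_path trims the /_nodes
// response down to the three fields used for enrichment.
const nodesPath = "/_nodes?filter_path=nodes.*.name,nodes.*.roles,nodes.*.version"

// nodeInfo holds the enrichment fields named in the changelog entry.
type nodeInfo struct {
	Name    string   `json:"name"`
	Roles   []string `json:"roles"`
	Version string   `json:"version"`
}

// parseNodes decodes a /_nodes response body into a map keyed by node ID.
func parseNodes(body []byte) (map[string]nodeInfo, error) {
	var resp struct {
		Nodes map[string]nodeInfo `json:"nodes"`
	}
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decoding /_nodes response: %w", err)
	}
	return resp.Nodes, nil
}

func main() {
	// Sample body shaped like a filtered /_nodes response.
	body := []byte(`{"nodes":{"f5i3v9hMT_q__q6B9WOo5A":{"name":"instance-0000000019","roles":["data_hot","ingest"],"version":"9.2.0"}}}`)
	nodes, err := parseNodes(body)
	if err != nil {
		panic(err)
	}
	fmt.Println("GET", nodesPath)
	// The same map can serve every per-node event emitted in one scrape.
	for id, n := range nodes {
		fmt.Println(id, n.Name, n.Roles, n.Version)
	}
}
```

Building the map once per scrape and looking up each node ID in it is what keeps the enrichment cost at one extra HTTP call, regardless of how many nodes report stats.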

docs/reference/metricbeat/exported-fields-elasticsearch.md

Lines changed: 66 additions & 6 deletions
@@ -982,6 +982,12 @@ Elasticsearch module
 type: boolean
 
 
+**`elasticsearch.node.version`**
+: Elasticsearch version reported by the node (for example, `9.2.0`).
+
+type: keyword
+
+
 ## ccr [_ccr]
 
 Cross-cluster replication stats
@@ -2096,12 +2102,6 @@ ml
 
 node
 
-**`elasticsearch.node.version`**
-: Node version.
-
-type: keyword
-
-
 ## jvm [_jvm]
 
 JVM Info.
@@ -2874,6 +2874,66 @@ File system summary
 type: long
 
 
+## security.stats [_security.stats]
+
+```{applies_to}
+stack: ga 9.2.0
+```
+
+Per-node security statistics collected from the Elasticsearch Security Stats API (`GET /_security/stats`). Available since Elasticsearch 9.2.
+
+## dls [_dls]
+
+Document Level Security (DLS) counters.
+
+## cache [_cache]
+
+Per-node cache used to materialize the bitsets that enforce DLS queries. Sourced from the `roles.dls.bit_set_cache` block in the `GET /_security/stats` response.
+
+**`elasticsearch.security.stats.dls.cache.entries.count`**
+: Current number of cached entries on this node.
+
+type: long
+
+
+**`elasticsearch.security.stats.dls.cache.memory.bytes`**
+: Current memory consumed by the DLS cache on this node, in bytes.
+
+type: long
+
+format: bytes
+
+
+**`elasticsearch.security.stats.dls.cache.hits.count`**
+: Number of cache lookups served from the cache since node startup.
+
+type: long
+
+
+**`elasticsearch.security.stats.dls.cache.misses.count`**
+: Number of cache lookups that had to materialize an entry since node startup.
+
+type: long
+
+
+**`elasticsearch.security.stats.dls.cache.evictions.count`**
+: Number of entries evicted (due to size limit or expiration) since node startup.
+
+type: long
+
+
+**`elasticsearch.security.stats.dls.cache.hits.time.ms`**
+: Cumulative time spent serving cache hits since node startup, in milliseconds.
+
+type: long
+
+
+**`elasticsearch.security.stats.dls.cache.misses.time.ms`**
+: Cumulative time spent materializing entries on cache misses since node startup, in milliseconds.
+
+type: long
+
+
 ## shard [_shard]
 
 shard fields
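
Because the hit, miss, and timing fields are described as cumulative since node startup, a consumer that wants a hit ratio for a recent interval would typically subtract two consecutive samples. The sketch below shows one way to do that. The `cacheCounters` type and both functions are hypothetical, and the sample numbers in `main` are made up for illustration.

```go
package main

import "fmt"

// cacheCounters is a hypothetical holder for the cumulative DLS cache counters.
type cacheCounters struct {
	Hits, Misses     int64
	HitsMs, MissesMs int64
}

// delta returns cur minus prev. It reports ok=false when any counter went
// backwards, which most likely means the node restarted between samples.
func delta(prev, cur cacheCounters) (cacheCounters, bool) {
	d := cacheCounters{
		Hits:     cur.Hits - prev.Hits,
		Misses:   cur.Misses - prev.Misses,
		HitsMs:   cur.HitsMs - prev.HitsMs,
		MissesMs: cur.MissesMs - prev.MissesMs,
	}
	if d.Hits < 0 || d.Misses < 0 || d.HitsMs < 0 || d.MissesMs < 0 {
		return cacheCounters{}, false
	}
	return d, true
}

// hitRatio returns hits / (hits + misses) for an interval. It reports
// ok=false when the interval saw no lookups at all.
func hitRatio(d cacheCounters) (float64, bool) {
	total := d.Hits + d.Misses
	if total == 0 {
		return 0, false
	}
	return float64(d.Hits) / float64(total), true
}

func main() {
	// Two illustrative samples taken one scrape apart.
	prev := cacheCounters{Hits: 8000, Misses: 100, HitsMs: 48, MissesMs: 160}
	cur := cacheCounters{Hits: 8421, Misses: 137, HitsMs: 51, MissesMs: 219}

	d, ok := delta(prev, cur)
	if !ok {
		fmt.Println("counter reset detected, skipping interval")
		return
	}
	if r, ok := hitRatio(d); ok {
		fmt.Printf("interval hit ratio: %.3f\n", r)
	}
}
```

Treating a backwards-moving counter as a reset avoids reporting a negative or wildly inflated ratio for the interval that spans a node restart.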
Lines changed: 85 additions & 0 deletions
@@ -0,0 +1,85 @@
+---
+mapped_pages:
+  - https://www.elastic.co/guide/en/beats/metricbeat/current/metricbeat-metricset-elasticsearch-security_stats.html
+applies_to:
+  stack: ga 9.2.0
+  serverless: ga
+---
+
+% This file is generated! See metricbeat/scripts/mage/docs_collector.go
+
+# Elasticsearch security_stats metricset [metricbeat-metricset-elasticsearch-security_stats]
+
+This is the `security_stats` metricset of the Elasticsearch module. It queries the Security Stats API endpoint (`GET /_security/stats`, available since Elasticsearch 9.2) to collect per-node security counters. The endpoint exposes Document Level Security (DLS) cache statistics, which are useful for spotting cache thrash, oversized working sets, and unhealthy hit/miss ratios across a fleet.
+
+Each emitted event is enriched with `node.{name,roles,version}` (alongside `node.id`) via a single side-channel `/_nodes` call per scrape, so consumers can slice by node, role, or stack version without joining across data streams.
+
+The `/_security/stats` endpoint is only served when the Elasticsearch security feature is enabled (`xpack.security.enabled: true`). The metricset checks `GET /_xpack` on each scrape. When security is disabled, it emits a throttled debug log, but no events.
+
+Authorization follows the same model as `/_cluster/stats` and `/_nodes/stats`: the caller needs the `monitor` cluster privilege.
+
+## Fields [_fields]
+
+For a description of each field in the metricset, see the [exported fields](/reference/metricbeat/exported-fields-elasticsearch.md) section.
+
+Here is an example document generated by this metricset:
+
+```json
+{
+    "@timestamp": "2026-04-27T20:00:00.000Z",
+    "elasticsearch": {
+        "cluster": {
+            "id": "WocBBA0QRma0sGpdQ7vLfQ",
+            "name": "docker-cluster"
+        },
+        "node": {
+            "id": "f5i3v9hMT_q__q6B9WOo5A",
+            "name": "instance-0000000019",
+            "roles": ["data_hot", "ingest"],
+            "version": "9.2.0"
+        },
+        "security": {
+            "stats": {
+                "dls": {
+                    "cache": {
+                        "entries": {
+                            "count": 12
+                        },
+                        "memory": {
+                            "bytes": 4096
+                        },
+                        "hits": {
+                            "count": 8421,
+                            "time": {
+                                "ms": 51
+                            }
+                        },
+                        "misses": {
+                            "count": 137,
+                            "time": {
+                                "ms": 219
+                            }
+                        },
+                        "evictions": {
+                            "count": 4
+                        }
+                    }
+                }
+            }
+        }
+    },
+    "event": {
+        "dataset": "elasticsearch.security.stats",
+        "duration": 115000,
+        "module": "elasticsearch"
+    },
+    "metricset": {
+        "name": "security_stats",
+        "period": 10000
+    },
+    "service": {
+        "address": "172.19.0.2:9200",
+        "type": "elasticsearch"
+    }
+}
+```
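
To see how the nested example document lines up with the dotted names in the exported-fields page, the following sketch flattens a decoded JSON object into dotted paths. The `flatten` helper is written for this illustration and is not part of Metricbeat. The input is a trimmed copy of the example document above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// flatten walks a decoded JSON value and records every non-object leaf in
// out under its dotted path, for example "a.b.c".
func flatten(prefix string, v any, out map[string]any) {
	m, ok := v.(map[string]any)
	if !ok {
		out[prefix] = v
		return
	}
	for k, child := range m {
		key := k
		if prefix != "" {
			key = prefix + "." + k
		}
		flatten(key, child, out)
	}
}

func main() {
	// Trimmed copy of the example document shown above.
	doc := []byte(`{"elasticsearch":{"security":{"stats":{"dls":{"cache":{
		"hits":{"count":8421,"time":{"ms":51}},
		"misses":{"count":137,"time":{"ms":219}}}}}}}}`)

	var decoded map[string]any
	if err := json.Unmarshal(doc, &decoded); err != nil {
		panic(err)
	}

	flat := map[string]any{}
	flatten("", decoded, flat)

	// Sort the keys so the listing is stable across runs.
	keys := make([]string, 0, len(flat))
	for k := range flat {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Println(k, "=", flat[k])
	}
}
```

Each printed key, such as `elasticsearch.security.stats.dls.cache.hits.count`, corresponds to one entry in the exported-fields reference.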

docs/reference/metricbeat/metricbeat-module-elasticsearch.md

Lines changed: 1 addition & 0 deletions
@@ -112,4 +112,5 @@ The following metricsets are available:
 * [node](/reference/metricbeat/metricbeat-metricset-elasticsearch-node.md)
 * [node_stats](/reference/metricbeat/metricbeat-metricset-elasticsearch-node_stats.md)
 * [pending_tasks](/reference/metricbeat/metricbeat-metricset-elasticsearch-pending_tasks.md)
+* [security_stats](/reference/metricbeat/metricbeat-metricset-elasticsearch-security_stats.md) {applies_to}`stack: ga 9.2.0`
 * [shard](/reference/metricbeat/metricbeat-metricset-elasticsearch-shard.md)

docs/reference/metricbeat/metricbeat-modules.md

Lines changed: 1 addition & 1 deletion
@@ -33,7 +33,7 @@ This section contains detailed information about the metric collecting modules c
 | [CouchDB](/reference/metricbeat/metricbeat-module-couchdb.md) | ![Prebuilt dashboards are available](images/icon-yes.png "") | [server](/reference/metricbeat/metricbeat-metricset-couchdb-server.md) |
 | [Docker](/reference/metricbeat/metricbeat-module-docker.md) | ![Prebuilt dashboards are available](images/icon-yes.png "") | [container](/reference/metricbeat/metricbeat-metricset-docker-container.md)<br>[cpu](/reference/metricbeat/metricbeat-metricset-docker-cpu.md)<br>[diskio](/reference/metricbeat/metricbeat-metricset-docker-diskio.md)<br>[event](/reference/metricbeat/metricbeat-metricset-docker-event.md)<br>[healthcheck](/reference/metricbeat/metricbeat-metricset-docker-healthcheck.md)<br>[image](/reference/metricbeat/metricbeat-metricset-docker-image.md)<br>[info](/reference/metricbeat/metricbeat-metricset-docker-info.md)<br>[memory](/reference/metricbeat/metricbeat-metricset-docker-memory.md)<br>[network](/reference/metricbeat/metricbeat-metricset-docker-network.md)<br>[network_summary](/reference/metricbeat/metricbeat-metricset-docker-network_summary.md) {applies_to}`stack: beta` |
 | [Dropwizard](/reference/metricbeat/metricbeat-module-dropwizard.md) | ![No prebuilt dashboards](images/icon-no.png "") | [collector](/reference/metricbeat/metricbeat-metricset-dropwizard-collector.md) |
-| [Elasticsearch](/reference/metricbeat/metricbeat-module-elasticsearch.md) | ![No prebuilt dashboards](images/icon-no.png "") | [ccr](/reference/metricbeat/metricbeat-metricset-elasticsearch-ccr.md)<br>[cluster_stats](/reference/metricbeat/metricbeat-metricset-elasticsearch-cluster_stats.md)<br>[enrich](/reference/metricbeat/metricbeat-metricset-elasticsearch-enrich.md)<br>[index](/reference/metricbeat/metricbeat-metricset-elasticsearch-index.md)<br>[index_recovery](/reference/metricbeat/metricbeat-metricset-elasticsearch-index_recovery.md)<br>[index_summary](/reference/metricbeat/metricbeat-metricset-elasticsearch-index_summary.md)<br>[ingest_pipeline](/reference/metricbeat/metricbeat-metricset-elasticsearch-ingest_pipeline.md) {applies_to}`stack: beta`<br>[ml_job](/reference/metricbeat/metricbeat-metricset-elasticsearch-ml_job.md)<br>[node](/reference/metricbeat/metricbeat-metricset-elasticsearch-node.md)<br>[node_stats](/reference/metricbeat/metricbeat-metricset-elasticsearch-node_stats.md)<br>[pending_tasks](/reference/metricbeat/metricbeat-metricset-elasticsearch-pending_tasks.md)<br>[shard](/reference/metricbeat/metricbeat-metricset-elasticsearch-shard.md) |
+| [Elasticsearch](/reference/metricbeat/metricbeat-module-elasticsearch.md) | ![No prebuilt dashboards](images/icon-no.png "") | [ccr](/reference/metricbeat/metricbeat-metricset-elasticsearch-ccr.md)<br>[cluster_stats](/reference/metricbeat/metricbeat-metricset-elasticsearch-cluster_stats.md)<br>[enrich](/reference/metricbeat/metricbeat-metricset-elasticsearch-enrich.md)<br>[index](/reference/metricbeat/metricbeat-metricset-elasticsearch-index.md)<br>[index_recovery](/reference/metricbeat/metricbeat-metricset-elasticsearch-index_recovery.md)<br>[index_summary](/reference/metricbeat/metricbeat-metricset-elasticsearch-index_summary.md)<br>[ingest_pipeline](/reference/metricbeat/metricbeat-metricset-elasticsearch-ingest_pipeline.md) {applies_to}`stack: beta`<br>[ml_job](/reference/metricbeat/metricbeat-metricset-elasticsearch-ml_job.md)<br>[node](/reference/metricbeat/metricbeat-metricset-elasticsearch-node.md)<br>[node_stats](/reference/metricbeat/metricbeat-metricset-elasticsearch-node_stats.md)<br>[pending_tasks](/reference/metricbeat/metricbeat-metricset-elasticsearch-pending_tasks.md)<br>[security_stats](/reference/metricbeat/metricbeat-metricset-elasticsearch-security_stats.md) {applies_to}`stack: ga 9.2.0`<br>[shard](/reference/metricbeat/metricbeat-metricset-elasticsearch-shard.md) |
 | [Envoyproxy](/reference/metricbeat/metricbeat-module-envoyproxy.md) | ![No prebuilt dashboards](images/icon-no.png "") | [server](/reference/metricbeat/metricbeat-metricset-envoyproxy-server.md) |
 | [Etcd](/reference/metricbeat/metricbeat-module-etcd.md) | ![No prebuilt dashboards](images/icon-no.png "") | [leader](/reference/metricbeat/metricbeat-metricset-etcd-leader.md)<br>[metrics](/reference/metricbeat/metricbeat-metricset-etcd-metrics.md) {applies_to}`stack: beta`<br>[self](/reference/metricbeat/metricbeat-metricset-etcd-self.md)<br>[store](/reference/metricbeat/metricbeat-metricset-etcd-store.md) |
 | [Google Cloud Platform](/reference/metricbeat/metricbeat-module-gcp.md) | ![Prebuilt dashboards are available](images/icon-yes.png "") | [billing](/reference/metricbeat/metricbeat-metricset-gcp-billing.md)<br>[carbon](/reference/metricbeat/metricbeat-metricset-gcp-carbon.md) {applies_to}`stack: beta`<br>[compute](/reference/metricbeat/metricbeat-metricset-gcp-compute.md)<br>[dataproc](/reference/metricbeat/metricbeat-metricset-gcp-dataproc.md)<br>[firestore](/reference/metricbeat/metricbeat-metricset-gcp-firestore.md)<br>[gke](/reference/metricbeat/metricbeat-metricset-gcp-gke.md)<br>[loadbalancing](/reference/metricbeat/metricbeat-metricset-gcp-loadbalancing.md)<br>[metrics](/reference/metricbeat/metricbeat-metricset-gcp-metrics.md)<br>[pubsub](/reference/metricbeat/metricbeat-metricset-gcp-pubsub.md)<br>[storage](/reference/metricbeat/metricbeat-metricset-gcp-storage.md)<br>[vertexai_logs](/reference/metricbeat/metricbeat-metricset-gcp-vertexai_logs.md) {applies_to}`stack: beta 9.2.0` |

docs/reference/toc.yml

Lines changed: 1 addition & 0 deletions
@@ -935,6 +935,7 @@ toc:
     - file: metricbeat/metricbeat-metricset-elasticsearch-node.md
     - file: metricbeat/metricbeat-metricset-elasticsearch-node_stats.md
     - file: metricbeat/metricbeat-metricset-elasticsearch-pending_tasks.md
+    - file: metricbeat/metricbeat-metricset-elasticsearch-security_stats.md
     - file: metricbeat/metricbeat-metricset-elasticsearch-shard.md
 - file: metricbeat/metricbeat-module-envoyproxy.md
   children:

metricbeat/include/list_common.go

Lines changed: 1 addition & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

metricbeat/module/elasticsearch/_meta/config-xpack.yml

Lines changed: 1 addition & 0 deletions
@@ -11,6 +11,7 @@
   # - node
   # - node_stats
   # - pending_tasks
+  # - security_stats
   # - shard
   xpack.enabled: true
   period: 10s

metricbeat/module/elasticsearch/_meta/fields.yml

Lines changed: 5 additions & 0 deletions
@@ -693,3 +693,8 @@
   type: boolean
   description: >
     Is mlockall enabled on the node?
+- name: version
+  type: keyword
+  description: >
+    Elasticsearch version reported by the node (for example,
+    `9.2.0`).

metricbeat/module/elasticsearch/elasticsearch.go

Lines changed: 8 additions & 0 deletions
@@ -50,6 +50,7 @@ func NewModule(base mb.BaseModule) (mb.Module, error) {
 		"index_summary",
 		"ml_job",
 		"node_stats",
+		"security_stats",
 		"shard",
 	}
 	optionalXpackMetricsets := []string{"ingest_pipeline"}
@@ -63,6 +64,10 @@ var (
 	// EnrichStatsAPIAvailableVersion is the version of Elasticsearch since when the Enrich stats API is available.
 	EnrichStatsAPIAvailableVersion = version.MustNew("7.5.0")
 
+	// SecurityStatsAPIAvailableVersion is the version of Elasticsearch since when the Security Stats API
+	// (GET /_security/stats) is available.
+	SecurityStatsAPIAvailableVersion = version.MustNew("9.2.0")
+
 	// BulkStatsAvailableVersion is the version since when bulk indexing stats are available
 	BulkStatsAvailableVersion = version.MustNew("8.0.0")
 
@@ -333,6 +338,9 @@ type XPack struct {
 		ML struct {
 			Enabled bool `json:"enabled"`
 		} `json:"ml"`
+		Security struct {
+			Enabled bool `json:"enabled"`
+		} `json:"security"`
 	} `json:"features"`
 }
 
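
The new version constant and the `Security` field are the two inputs to the availability check described in the commit message. The sketch below shows how such a two-step gate could be structured. It is self-contained and does not use the module's `version` package: `atLeast`, `parseVersion`, and `securityStatsAvailable` are hypothetical stand-ins, and the real `checkAvailability` code is not shown in this part of the diff.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// xPack mirrors only the part of the GET /_xpack response used by the check.
type xPack struct {
	Features struct {
		Security struct {
			Enabled bool `json:"enabled"`
		} `json:"security"`
	} `json:"features"`
}

// parseVersion splits "major.minor.patch" into integers. Parts that are not
// plain numbers (for example "0-SNAPSHOT") are treated as 0 in this sketch.
func parseVersion(v string) [3]int {
	var out [3]int
	for i, p := range strings.SplitN(v, ".", 3) {
		out[i], _ = strconv.Atoi(p)
	}
	return out
}

// atLeast reports whether version v is greater than or equal to floor.
func atLeast(v, floor string) bool {
	a, b := parseVersion(v), parseVersion(floor)
	for i := range a {
		if a[i] != b[i] {
			return a[i] > b[i]
		}
	}
	return true
}

// securityStatsAvailable runs the free in-memory version comparison first and
// only then inspects the /_xpack body for features.security.enabled.
func securityStatsAvailable(esVersion string, xpackBody []byte) (bool, string, error) {
	if !atLeast(esVersion, "9.2.0") {
		return false, "Elasticsearch " + esVersion + " predates the Security Stats API", nil
	}
	var xp xPack
	if err := json.Unmarshal(xpackBody, &xp); err != nil {
		return false, "", fmt.Errorf("decoding /_xpack response: %w", err)
	}
	if !xp.Features.Security.Enabled {
		return false, "security feature is disabled on this cluster", nil
	}
	return true, "", nil
}

func main() {
	body := []byte(`{"features":{"security":{"enabled":false}}}`)
	ok, reason, err := securityStatsAvailable("9.2.0", body)
	fmt.Println(ok, reason, err)
}
```

Ordering the checks this way means an old cluster is rejected without any HTTP request, and a cluster with security disabled gets a specific reason string rather than an opaque error from `/_security/stats`.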