Commit c8f2703

(Doc+) Flush out Slow Logs (#118518)
* (Doc+) Slow Logs

Co-authored-by: shainaraskas
1 parent b790628 commit c8f2703

3 files changed, +180 -100 lines changed

docs/reference/api-conventions.asciidoc

+2-2
@@ -28,7 +28,7 @@ You can pass an `X-Opaque-Id` HTTP header to track the origin of a request in
2828

2929
* Response of any request that includes the header
3030
* <<_identifying_running_tasks,Task management API>> response
31-
* <<_identifying_search_slow_log_origin,Slow logs>>
31+
* <<search-slow-log,Slow logs>>
3232
* <<deprecation-logging,Deprecation logs>>
3333

3434
For the deprecation logs, {es} also uses the `X-Opaque-Id` value to throttle
@@ -52,7 +52,7 @@ safely generate a unique `traceparent` header for each request.
5252
If provided, {es} surfaces the header's `trace-id` value as `trace.id` in the:
5353

5454
* <<logging,JSON {es} server logs>>
55-
* <<_identifying_search_slow_log_origin,Slow logs>>
55+
* <<search-slow-log,Slow logs>>
5656
* <<deprecation-logging,Deprecation logs>>
5757

5858
For example, the following `traceparent` value would produce the following

docs/reference/how-to/search-speed.asciidoc

+5
@@ -567,3 +567,8 @@ also possible to update the client-side logic in order to route queries to the
567567
relevant indices based on filters. However `constant_keyword` makes it
568568
transparently and allows to decouple search requests from the index topology in
569569
exchange of very little overhead.
570+
571+
[discrete]
572+
=== Default search timeout
573+
574+
By default, search requests don't time out. You can set a timeout using the <<search-timeout,`search.default_search_timeout`>> setting.
+173-98
@@ -1,15 +1,118 @@
11
[[index-modules-slowlog]]
2-
== Slow Log
2+
== Slow log
3+
4+
The slow log records database searching and indexing events that have execution durations above specified thresholds. You can use these logs to investigate, analyze, or troubleshoot your cluster's historical search and indexing performance.
5+
6+
Slow logs report task duration at the shard level for searches, and at the index level
7+
for indexing, but might not encompass the full task execution time observed on the client. For example, slow logs don't surface HTTP network delays or the impact of <<task-queue-backlog,task queues>>.
8+
9+
Events that meet the specified threshold are emitted into <<logging,{es} logging>> under the `fileset.name` of `slowlog`. These logs can be viewed in the following locations:
10+
11+
* If <<monitoring-overview,{es} monitoring>> is enabled, from
12+
{kibana-ref}/xpack-monitoring.html[Stack Monitoring]. Slow log events have a `logger` value of `index.search.slowlog` or `index.indexing.slowlog`.
13+
14+
* From the local {es} service logs directory. Slow log files have a suffix of `_index_search_slowlog.json` or `_index_indexing_slowlog.json`.
15+
16+
[discrete]
17+
[[slow-log-format]]
18+
=== Slow log format
19+
20+
The following is an example of a search event in the slow log:
21+
22+
TIP: If a call was initiated with an `X-Opaque-ID` header, then the ID is automatically included in Search slow logs in the **elasticsearch.slowlog.id** field. See <<x-opaque-id,X-Opaque-Id HTTP header>> for details and best practices.
23+
24+
[source,js]
25+
---------------------------
26+
{
27+
"@timestamp": "2024-12-21T12:42:37.255Z",
28+
"auth.type": "REALM",
29+
"ecs.version": "1.2.0",
30+
"elasticsearch.cluster.name": "distribution_run",
31+
"elasticsearch.cluster.uuid": "Ui23kfF1SHKJwu_hI1iPPQ",
32+
"elasticsearch.node.id": "JK-jn-XpQ3OsDUsq5ZtfGg",
33+
"elasticsearch.node.name": "node-0",
34+
"elasticsearch.slowlog.id": "tomcat-123",
35+
"elasticsearch.slowlog.message": "[index6][0]",
36+
"elasticsearch.slowlog.search_type": "QUERY_THEN_FETCH",
37+
"elasticsearch.slowlog.source": "{\"query\":{\"match_all\":{\"boost\":1.0}}}",
38+
"elasticsearch.slowlog.stats": "[]",
39+
"elasticsearch.slowlog.took": "747.3micros",
40+
"elasticsearch.slowlog.took_millis": 0,
41+
"elasticsearch.slowlog.total_hits": "1 hits",
42+
"elasticsearch.slowlog.total_shards": 1,
43+
"event.dataset": "elasticsearch.index_search_slowlog",
44+
"fileset.name" : "slowlog",
45+
"log.level": "WARN",
46+
"log.logger": "index.search.slowlog.query",
47+
"process.thread.name": "elasticsearch[runTask-0][search][T#5]",
48+
"service.name": "ES_ECS",
49+
"user.name": "elastic",
50+
"user.realm": "reserved"
51+
}
52+
53+
---------------------------
54+
// NOTCONSOLE
55+
56+
57+
The following is an example of an indexing event in the slow log:
58+
59+
[source,js]
60+
---------------------------
61+
{
62+
"@timestamp" : "2024-12-11T22:34:22.613Z",
63+
"auth.type": "REALM",
64+
"ecs.version": "1.2.0",
65+
"elasticsearch.cluster.name" : "41bd111609d849fc9bf9d25b5df9ce96",
66+
"elasticsearch.cluster.uuid" : "BZTn4I9URXSK26imlia0QA",
67+
"elasticsearch.index.id" : "3VfGR7wRRRKmMCEn7Ii58g",
68+
"elasticsearch.index.name": "my-index-000001",
69+
"elasticsearch.node.id" : "GGiBgg21S3eqPDHzQiCMvQ",
70+
"elasticsearch.node.name" : "instance-0000000001",
71+
"elasticsearch.slowlog.id" : "RCHbt5MBT0oSsCOu54AJ",
72+
"elasticsearch.slowlog.source": "{\"key\":\"value\"}",
73+
"elasticsearch.slowlog.took" : "0.01ms",
74+
"event.dataset": "elasticsearch.index_indexing_slowlog",
75+
"fileset.name" : "slowlog",
76+
"log.level" : "TRACE",
77+
"log.logger" : "index.indexing.slowlog.index",
78+
"service.name" : "ES_ECS",
79+
"user.name": "elastic",
80+
"user.realm": "reserved"
81+
}
82+
83+
---------------------------
84+
// NOTCONSOLE
85+
86+
[discrete]
87+
[[enable-slow-log]]
88+
=== Enable slow logging
89+
90+
You can enable slow logging at two levels:
91+
92+
* For all indices under the <<settings,{es} `log4j2.properties` configuration file>>. This method requires a node restart.
93+
* At the index level, using the <<indices-update-settings,update indices settings API>>
94+
95+
By default, all thresholds are set to `-1`, which results in no events being logged.
96+
97+
Slow log thresholds can be enabled for the four logging levels: `trace`, `debug`, `info`, and `warn`. You can mimic a log level by disabling the thresholds for the more verbose levels, that is, by setting them to `-1`.
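
As a rough illustration of mimicking a log level, the Python sketch below builds a settings body that keeps the thresholds up to a chosen level and sets every more verbose level to `-1`. The helper name `mimic_level` and its arguments are hypothetical and not part of {es}. The setting keys follow the `index.indexing.slowlog.threshold.index.*` pattern used elsewhere in this document.

```python
# Levels ordered from least to most verbose.
LEVELS = ["warn", "info", "debug", "trace"]


def mimic_level(prefix, thresholds, level):
    """Build a settings body that keeps thresholds up to `level`
    and disables (-1) every more verbose level.

    `prefix` is a setting prefix such as
    "index.indexing.slowlog.threshold.index" (hypothetical helper input).
    """
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level}")
    cutoff = LEVELS.index(level)
    body = {}
    for i, name in enumerate(LEVELS):
        key = f"{prefix}.{name}"
        # Keep the configured threshold for this level and less verbose ones;
        # "-1" disables a threshold.
        body[key] = thresholds.get(name, "-1") if i <= cutoff else "-1"
    return body


settings = mimic_level(
    "index.indexing.slowlog.threshold.index",
    {"warn": "10s", "info": "5s", "debug": "2s", "trace": "500ms"},
    "info",
)
```

The resulting dictionary could be sent as the body of an update indices settings request. Here it leaves `warn` and `info` in place and disables `debug` and `trace`.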
98+
99+
To view the current slow log settings, use the <<indices-get-settings,get index settings API>>:
100+
101+
[source,console]
102+
--------------------------------------------------
103+
GET _all/_settings?expand_wildcards=all&filter_path=*.settings.index.*.slowlog
104+
--------------------------------------------------
3105

4106
[discrete]
5107
[[search-slow-log]]
6-
=== Search Slow Log
108+
==== Enable slow logging for search events
109+
110+
Search slow logs emit per shard. They must be enabled separately for the shard's link:https://www.elastic.co/blog/understanding-query-then-fetch-vs-dfs-query-then-fetch[query and fetch search phases].
7111

8-
Shard level slow search log allows to log slow search (query and fetch
9-
phases) into a dedicated log file.
112+
You can use the `index.search.slowlog.include.user` setting to append `user.*` and `auth.type` fields to slow log entries. These fields contain information about the user who triggered the request.
10113

11-
Thresholds can be set for both the query phase of the execution, and
12-
fetch phase, here is a sample:
114+
The following snippet adjusts all available search slow log settings across all indices using the
115+
<<settings,`log4j2.properties` configuration file>>:
13116

14117
[source,yaml]
15118
--------------------------------------------------
@@ -22,10 +125,11 @@ index.search.slowlog.threshold.fetch.warn: 1s
22125
index.search.slowlog.threshold.fetch.info: 800ms
23126
index.search.slowlog.threshold.fetch.debug: 500ms
24127
index.search.slowlog.threshold.fetch.trace: 200ms
128+
129+
index.search.slowlog.include.user: true
25130
--------------------------------------------------
26131

27-
All of the above settings are _dynamic_ and can be set for each index using the
28-
<<indices-update-settings, update indices settings>> API. For example:
132+
The following snippet adjusts the same settings for a single index using the <<indices-update-settings,update indices settings API>>:
29133

30134
[source,console]
31135
--------------------------------------------------
@@ -38,138 +142,109 @@ PUT /my-index-000001/_settings
38142
"index.search.slowlog.threshold.fetch.warn": "1s",
39143
"index.search.slowlog.threshold.fetch.info": "800ms",
40144
"index.search.slowlog.threshold.fetch.debug": "500ms",
41-
"index.search.slowlog.threshold.fetch.trace": "200ms"
145+
"index.search.slowlog.threshold.fetch.trace": "200ms",
146+
"index.search.slowlog.include.user": true
42147
}
43148
--------------------------------------------------
44149
// TEST[setup:my_index]
45150

46-
By default thresholds are disabled (set to `-1`).
47151

48-
The logging is done on the shard level scope, meaning the execution of a
49-
search request within a specific shard. It does not encompass the whole
50-
search request, which can be broadcast to several shards in order to
51-
execute. Some of the benefits of shard level logging is the association
52-
of the actual execution on the specific machine, compared with request
53-
level.
152+
[discrete]
153+
[[index-slow-log]]
154+
==== Enable slow logging for indexing events
54155

156+
Indexing slow logs emit per indexed document.
55157

56-
The search slow log file is configured in the `log4j2.properties` file.
158+
You can use the `index.indexing.slowlog.include.user` setting to append `user.*` and `auth.type` fields to slow log entries. These fields contain information about the user who triggered the request.
57159

58-
[discrete]
59-
==== Identifying search slow log origin
160+
The following snippet adjusts all available indexing slow log settings across all indices using the
161+
<<settings,`log4j2.properties` configuration file>>:
60162

61-
It is often useful to identify what triggered a slow running query.
62-
To include information about the user that triggered a slow search,
63-
use the `index.search.slowlog.include.user` setting.
163+
[source,yaml]
164+
--------------------------------------------------
165+
index.indexing.slowlog.threshold.index.warn: 10s
166+
index.indexing.slowlog.threshold.index.info: 5s
167+
index.indexing.slowlog.threshold.index.debug: 2s
168+
index.indexing.slowlog.threshold.index.trace: 500ms
169+
170+
index.indexing.slowlog.source: 1000
171+
index.indexing.slowlog.reformat: true
172+
173+
index.indexing.slowlog.include.user: true
174+
--------------------------------------------------
175+
176+
177+
The following snippet adjusts the same settings for a single index using the <<indices-update-settings,update indices settings API>>:
64178

65179
[source,console]
66180
--------------------------------------------------
67181
PUT /my-index-000001/_settings
68182
{
69-
"index.search.slowlog.include.user": true
183+
"index.indexing.slowlog.threshold.index.warn": "10s",
184+
"index.indexing.slowlog.threshold.index.info": "5s",
185+
"index.indexing.slowlog.threshold.index.debug": "2s",
186+
"index.indexing.slowlog.threshold.index.trace": "500ms",
187+
"index.indexing.slowlog.source": "1000",
188+
"index.indexing.slowlog.reformat": true,
189+
"index.indexing.slowlog.include.user": true
70190
}
71191
--------------------------------------------------
72192
// TEST[setup:my_index]
73193

74-
This will result in user information being included in the slow log.
194+
[discrete]
195+
===== Logging the `_source` field
75196

76-
[source,js]
77-
---------------------------
78-
{
79-
"@timestamp": "2024-02-21T12:42:37.255Z",
80-
"log.level": "WARN",
81-
"auth.type": "REALM",
82-
"elasticsearch.slowlog.id": "tomcat-123",
83-
"elasticsearch.slowlog.message": "[index6][0]",
84-
"elasticsearch.slowlog.search_type": "QUERY_THEN_FETCH",
85-
"elasticsearch.slowlog.source": "{\"query\":{\"match_all\":{\"boost\":1.0}}}",
86-
"elasticsearch.slowlog.stats": "[]",
87-
"elasticsearch.slowlog.took": "747.3micros",
88-
"elasticsearch.slowlog.took_millis": 0,
89-
"elasticsearch.slowlog.total_hits": "1 hits",
90-
"elasticsearch.slowlog.total_shards": 1,
91-
"user.name": "elastic",
92-
"user.realm": "reserved",
93-
"ecs.version": "1.2.0",
94-
"service.name": "ES_ECS",
95-
"event.dataset": "elasticsearch.index_search_slowlog",
96-
"process.thread.name": "elasticsearch[runTask-0][search][T#5]",
97-
"log.logger": "index.search.slowlog.query",
98-
"elasticsearch.cluster.uuid": "Ui23kfF1SHKJwu_hI1iPPQ",
99-
"elasticsearch.node.id": "JK-jn-XpQ3OsDUsq5ZtfGg",
100-
"elasticsearch.node.name": "node-0",
101-
"elasticsearch.cluster.name": "distribution_run"
102-
}
197+
By default, {es} logs the first 1000 characters of the `_source` in the slow log. You can adjust how `_source` is logged using the `index.indexing.slowlog.source` setting. Set `index.indexing.slowlog.source` to `false` or `0` to skip logging the source entirely. Set `index.indexing.slowlog.source` to `true` to log the entire source regardless of size.
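
The values accepted by `index.indexing.slowlog.source` can be summarized with a small Python sketch. It mirrors only the behavior described above and is not {es}'s implementation. The function name `logged_source` is hypothetical, and the sketch assumes the setting arrives as a Python boolean, integer, or numeric string.

```python
def logged_source(source, setting=1000):
    """Return the part of `source` that would be logged, or None if skipped.

    Illustrates the documented behavior only; not Elasticsearch's code.
    """
    # `false` or `0` skips logging the source entirely.
    if setting is False or setting == 0:
        return None
    # `true` logs the entire source regardless of size.
    if setting is True:
        return source
    # Otherwise treat the value as a character limit (default 1000).
    return source[: int(setting)]
```

For example, `logged_source(doc, 3)` keeps the first three characters, while `logged_source(doc, True)` returns the document unchanged.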
103198

104-
---------------------------
105-
// NOTCONSOLE
199+
The original `_source` is reformatted by default to make sure that it fits on a single log line. If preserving the original document format is important, then you can turn off reformatting by setting `index.indexing.slowlog.reformat` to `false`. This causes source to be logged with the original formatting intact, potentially spanning multiple log lines.
106200

107-
If a call was initiated with an `X-Opaque-ID` header, then the ID is included
108-
in Search Slow logs in the **elasticsearch.slowlog.id** field. See
109-
<<x-opaque-id, X-Opaque-Id HTTP header>> for details and best practices.
201+
[discrete]
202+
[[slow-log-fields]]
110203

111204
[discrete]
112-
[[index-slow-log]]
113-
=== Index Slow log
205+
[[troubleshoot-slow-log]]
206+
=== Best practices for slow logging
114207

115-
The indexing slow log, similar in functionality to the search slow
116-
log. The log file name ends with `_index_indexing_slowlog.json`. Log and
117-
the thresholds are configured in the same way as the search slowlog.
118-
Index slowlog sample:
208+
Logging slow requests can be resource intensive for your {es} cluster, depending on the volume of qualifying traffic. For example, emitted logs might increase the index disk usage of your <<monitoring-overview,{es} monitoring>> cluster. To reduce the impact of slow logs, consider the following:
119209

120-
[source,yaml]
121-
--------------------------------------------------
122-
index.indexing.slowlog.threshold.index.warn: 10s
123-
index.indexing.slowlog.threshold.index.info: 5s
124-
index.indexing.slowlog.threshold.index.debug: 2s
125-
index.indexing.slowlog.threshold.index.trace: 500ms
126-
index.indexing.slowlog.source: 1000
127-
--------------------------------------------------
210+
* Enable slow logs against specific indices rather than across all indices.
211+
* Set high thresholds to reduce the number of logged events.
212+
* Enable slow logs only when troubleshooting.
128213

129-
All of the above settings are _dynamic_ and can be set for each index using the
130-
<<indices-update-settings, update indices settings>> API. For example:
214+
If you aren't sure how to start investigating traffic issues, consider setting only the `warn` threshold, with a high value such as `30s`, at the index level using the <<indices-update-settings,update indices settings API>>:
131215

216+
* Enable for search requests:
217+
+
132218
[source,console]
133219
--------------------------------------------------
134-
PUT /my-index-000001/_settings
220+
PUT /*/_settings
135221
{
136-
"index.indexing.slowlog.threshold.index.warn": "10s",
137-
"index.indexing.slowlog.threshold.index.info": "5s",
138-
"index.indexing.slowlog.threshold.index.debug": "2s",
139-
"index.indexing.slowlog.threshold.index.trace": "500ms",
140-
"index.indexing.slowlog.source": "1000"
222+
"index.search.slowlog.include.user": true,
223+
"index.search.slowlog.threshold.fetch.warn": "30s",
224+
"index.search.slowlog.threshold.query.warn": "30s"
141225
}
142226
--------------------------------------------------
143227
// TEST[setup:my_index]
144228

145-
To include information about the user that triggered a slow indexing event,
146-
use the `index.indexing.slowlog.include.user` setting.
147-
229+
* Enable for indexing requests:
230+
+
148231
[source,console]
149232
--------------------------------------------------
150-
PUT /my-index-000001/_settings
233+
PUT /*/_settings
151234
{
152-
"index.indexing.slowlog.include.user": true
235+
"index.indexing.slowlog.include.user": true,
236+
"index.indexing.slowlog.threshold.index.warn": "30s"
153237
}
154238
--------------------------------------------------
155239
// TEST[setup:my_index]
156240

157-
By default Elasticsearch will log the first 1000 characters of the _source in
158-
the slowlog. You can change that with `index.indexing.slowlog.source`. Setting
159-
it to `false` or `0` will skip logging the source entirely, while setting it to
160-
`true` will log the entire source regardless of size. The original `_source` is
161-
reformatted by default to make sure that it fits on a single log line. If preserving
162-
the original document format is important, you can turn off reformatting by setting
163-
`index.indexing.slowlog.reformat` to `false`, which will cause the source to be
164-
logged "as is" and can potentially span multiple log lines.
241+
Meeting a slow log threshold doesn't necessarily indicate a cluster performance issue. If you do notice symptoms, slow logs can provide data that helps you diagnose upstream traffic patterns or sources and resolve client-side issues. For example, you can use data included in `X-Opaque-ID`, the `_source` request body, or `user.*` fields to identify the source of your issue. This is similar to troubleshooting <<task-queue-backlog,live expensive tasks>>.
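
As one possible way to identify the source of slow searches, the Python sketch below counts slow log events by their `elasticsearch.slowlog.id` and `user.name` fields. It assumes the slow log file holds one JSON event per line, which is an assumption rather than something stated above. The function name `count_by_origin` and the sample events are hypothetical. In search slow logs, `elasticsearch.slowlog.id` carries the `X-Opaque-Id` value when one was sent.

```python
import json
from collections import Counter


def count_by_origin(lines):
    """Count slow log events per (slowlog id, user name) pair.

    `lines` is an iterable of JSON strings, one slow log event per line
    (assumed file layout).
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        event = json.loads(line)
        origin = (
            event.get("elasticsearch.slowlog.id"),  # X-Opaque-Id, if sent
            event.get("user.name"),  # present when include.user is true
        )
        counts[origin] += 1
    return counts


# Hypothetical, shortened events for illustration only.
sample = [
    '{"elasticsearch.slowlog.id": "tomcat-123", "user.name": "elastic"}',
    '{"elasticsearch.slowlog.id": "tomcat-123", "user.name": "elastic"}',
    '{"user.name": "elastic"}',
]
top = count_by_origin(sample).most_common(1)
```

In practice you might pass an open file handle for a `_index_search_slowlog.json` file instead of `sample`, then inspect the most frequent origins first.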
242+
243+
If you're experiencing search performance issues, then you might also consider investigating searches flagged for their query durations using the <<search-profile,profile API>>. You can then use the profiled query to investigate optimization options using the link:{kibana-ref}/xpack-profiler.html[query profiler]. This type of investigation should usually take place in a non-production environment.
165244

166-
The index slow log file is configured in the `log4j2.properties` file.
245+
Slow logging checks each event against the reporting threshold when the event is complete. This means that it can't report events that trigger <<circuit-breaker-errors,circuit breaker errors>>. If you suspect circuit breaker errors, then you should also consider enabling <<enable-audit-logging,audit logging>>, which logs events before they are executed.
167246

168247
[discrete]
169-
=== Slow log levels
248+
=== Learn more
170249

171-
You can mimic the search or indexing slow log level by setting appropriate
172-
threshold making "more verbose" loggers to be switched off.
173-
If for instance we want to simulate `index.indexing.slowlog.level: INFO`
174-
then all we need to do is to set
175-
`index.indexing.slowlog.threshold.index.debug` and `index.indexing.slowlog.threshold.index.trace` to `-1`.
250+
To learn about other ways to optimize your search and indexing requests, refer to <<tune-for-search-speed,tune for search speed>> and <<tune-for-indexing-speed,tune for indexing speed>>.
