feat(elasticsearch): Make the provider compatible with JSON logging

Make the Elasticsearch log file provider compatible with the recently
introduced JSON logging. This simplifies using the provider in any
environment where the default dynamic mapping is enabled. If JSON
logging is not enabled, Logstash must be configured to map the log lines
to an equivalent structure.

Also add support for a custom prefix for the ingested logs, which is
typically used to separate the application logs from other log entries
based on the environment. For the same reason, the property that contains
the Kubernetes namespace is also made configurable.

The log lines are rendered as `<timestamp> <level> <message>` with the
optional throwable appended. This avoids cluttering them with redundant
information or technical details that are not relevant for users. The
schema can later be extended based on demand.

Signed-off-by: Martin Nonnenmacher <martin.nonnenmacher@doubleopen.io>

| `namespace` | filter | `keyword` | Exact-match deployment namespace filter; must match `elasticsearchNamespace`. The field name is configurable via `elasticsearchNamespaceField`. |
| `mdc.component` | filter (`mdc.component.keyword`) | `text` with `keyword` subfield | Exact-match ORT component filter using ORT Server component names. |
| `mdc.ortRunId` | filter (`mdc.ortRunId.keyword`) | `text` with `keyword` subfield | Exact-match ORT run ID filter. |
| `level` | filter (`level.keyword`) and `_source` | `text` with `keyword` subfield | Exact-match log level filter; the level value is also prepended to each written log line. |
| `timestamp` | range filter, primary sort, `_source` | `date` (`epoch_millis`) | Log event timestamp; used for range queries, primary sorting, and rendered as the leading timestamp of each log line. |
| `sequenceNumber` | secondary sort | `long` | Stable tie-breaker for `search_after` pagination among hits that share the same `timestamp`. |
| `formattedMessage` | `_source` | `text` recommended | Rendered log line written to the downloaded log file. |
| `throwable` | `_source` | `text` recommended | Optional rendered throwable; appended after the message when present. |

Filtering and sorting use the indexed (`keyword`) fields, while the log content is read from `_source`. The provider
therefore requests only `level`, `formattedMessage`, `throwable`, and `timestamp` in `_source` and uses the indexed
fields for the remaining filters and sorts. Each downloaded log line is rendered as `<timestamp> <level> <message>`,
with the `throwable` (if present) appended on the following line.

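The rendering rule described above can be illustrated with a minimal Python sketch. The function name
`render_log_line` is hypothetical, and the ISO-8601 UTC timestamp format is an assumption of this sketch; the
provider's actual output format may differ.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def render_log_line(source: dict) -> str:
    """Render one hit's `_source` as `<timestamp> <level> <message>`."""
    # `timestamp` is stored as epoch milliseconds. Printing it as ISO-8601
    # in UTC is an assumption made for this illustration.
    timestamp = EPOCH + timedelta(milliseconds=int(source["timestamp"]))
    line = (
        f"{timestamp.isoformat(timespec='milliseconds')} "
        f"{source['level']} {source['formattedMessage']}"
    )
    # The throwable is optional; when present it follows on the next line.
    throwable = source.get("throwable")
    if throwable:
        line += "\n" + throwable
    return line
```

Only the four fields requested in `_source` are needed to produce a line, which is why the other fields can stay
index-only from the provider's point of view.
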
The `formattedMessage` field is written to the downloaded log file, one line per hit. It does not need to be indexed
for search by the provider, but it must be present in `_source`. Indexing `formattedMessage` as a `text` field is
recommended for Kibana, so users can search log lines, exceptions, request paths, and other free-form text. A
`.keyword` subfield can be useful for exact matches, sorting, or aggregations, but should usually have an
`ignore_above` limit to avoid indexing very large log lines as keyword terms.

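As an illustration of this recommendation, the mapping fragment for the two free-text fields might look like the
following sketch, written as a Python dictionary that serializes to the JSON body of a mapping request. The
`ignore_above` value of `1024` and the variable name are assumptions chosen for the example, not values taken from
the provider.

```python
import json

# Hypothetical mapping fragment: `text` for full-text search in Kibana, plus
# a `keyword` subfield that skips values longer than `ignore_above`.
MESSAGE_FIELD_MAPPINGS = {
    "properties": {
        "formattedMessage": {
            "type": "text",
            "fields": {
                # 1024 is an assumed limit; pick one that fits your log lines.
                "keyword": {"type": "keyword", "ignore_above": 1024}
            },
        },
        "throwable": {"type": "text"},
    }
}

if __name__ == "__main__":
    print(json.dumps(MESSAGE_FIELD_MAPPINGS, indent=2))
```

Values longer than the limit are still stored in `_source` and searchable through the `text` field; they are only
left out of the `keyword` subfield.
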
Although `mdc.ortRunId` contains numeric values, the provider treats it as an identifier and queries its `keyword`
subfield. This matches Elasticsearch's guidance for numeric-looking identifiers that are primarily used in term
queries.

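Putting the field table together, a search request body along these lines would filter on the indexed fields, sort
by `timestamp` and `sequenceNumber`, and page with `search_after`. This is a sketch rather than the provider's
actual request: the function name, its parameters, the ascending sort order, and the use of a `terms` query for the
level filter are all assumptions.

```python
def build_search_body(namespace, component, ort_run_id, levels,
                      start_millis, end_millis, page_size=1000,
                      search_after=None, namespace_field="namespace"):
    """Build a hypothetical Elasticsearch search body for one page of hits."""
    body = {
        "size": page_size,
        # Only the fields needed to render a log line are read from _source.
        "_source": ["level", "formattedMessage", "throwable", "timestamp"],
        "query": {"bool": {"filter": [
            {"term": {namespace_field: namespace}},
            {"term": {"mdc.component.keyword": component}},
            # The run ID is an identifier, so it is matched as a keyword string.
            {"term": {"mdc.ortRunId.keyword": str(ort_run_id)}},
            {"terms": {"level.keyword": list(levels)}},
            {"range": {"timestamp": {"gte": start_millis, "lte": end_millis,
                                     "format": "epoch_millis"}}},
        ]}},
        # sequenceNumber breaks ties between hits with the same timestamp.
        "sort": [{"timestamp": "asc"}, {"sequenceNumber": "asc"}],
    }
    if search_after is not None:
        # Sort values of the last hit on the previous page.
        body["search_after"] = search_after
    return body
```

For the next page, the `sort` array of the last hit in the previous response would be passed as `search_after`.
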
### Field prefix

If the indexing pipeline nests all log-line fields under a common prefix (for example, when Logstash is configured to
add a custom prefix to all fields during indexing), set `elasticsearchFieldPrefix`. The provider then prepends this
prefix to every field taken from the log line. The namespace field is *not* prefixed, because it is derived
from deployment metadata rather than from the log line. Prefixed fields are returned by Elasticsearch as nested
objects and resolved accordingly when reading values from `_source`.

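
The effect of the prefix can be sketched as follows. Both helper names are hypothetical, and joining the prefix and
the field name with a dot is an assumption of this sketch; check the provider's behavior before relying on it.

```python
def prefixed(field, prefix=None):
    """Return the field path with the optional prefix applied."""
    # Assumption: prefix and field name are joined with a dot.
    return f"{prefix}.{field}" if prefix else field


def read_source_value(source, path):
    """Resolve a dotted path against the nested objects in `_source`."""
    current = source
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None  # missing field, e.g. no throwable for this hit
        current = current[key]
    return current


# Example hit with log-line fields nested under the prefix "app", while the
# namespace field stays at the top level.
hit_source = {
    "namespace": "ort-server",
    "app": {"level": "INFO", "formattedMessage": "Run started."},
}
level = read_source_value(hit_source, prefixed("level", "app"))
```

With no prefix configured, `prefixed("level")` returns the plain field name, so the same lookup code covers both
setups.
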
| `elasticsearchServerUrl` | `ELASTICSEARCH_SERVER_URL` | Base URL of the Elasticsearch instance. | mandatory | no |
| `elasticsearchIndex` | `ELASTICSEARCH_INDEX` | Index or index pattern to query. | mandatory | no |
| `elasticsearchNamespace` | `ELASTICSEARCH_NAMESPACE` | Namespace label used to restrict queries. | mandatory | no |
| `elasticsearchNamespaceField` | `ELASTICSEARCH_NAMESPACE_FIELD` | Name of the field that holds the namespace value used by the namespace filter. | `namespace` | no |
| `elasticsearchFieldPrefix` | `ELASTICSEARCH_FIELD_PREFIX` | Optional prefix prepended to all log-line fields (everything except the namespace field). | undefined | no |
| `elasticsearchPageSize` | `ELASTICSEARCH_PAGE_SIZE` | Number of hits to fetch per search request. | `1000` | no |
| `elasticsearchUsername` | `ELASTICSEARCH_USERNAME` | Optional username for Basic Auth. Ignored when an API key is configured. | undefined | no |