feat: recognize OpenTelemetry flat JSON logs out of the box #1395
merlimat wants to merge 5 commits into
Add OTel field-name conventions to the default config so logs serialized via the OpenTelemetry Logs data model (both PascalCase OTLP-JSON and snake_case flattened form) render correctly without extra configuration:
- time: adds ObservedTimestamp / observed_timestamp
- message: adds Body / body
- level: new SeverityText (TRACE/DEBUG/INFO/WARN/ERROR/FATAL incl. numbered variants INFO2..FATAL4) and SeverityNumber (1..24) variants, with FATAL* mapped to `error` since hl has no fatal level

Also ships etc/defaults/config-otel.toml as a strict OTel preset for users who want only OTel semantics (modeled after config-ecs.toml / config-k8s.toml).

Refs:
https://opentelemetry.io/docs/specs/otel/logs/data-model/
https://opentelemetry.io/docs/specs/otel/logs/data-model/#field-severitytext
https://opentelemetry.io/docs/specs/otel/logs/data-model/#field-severitynumber
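For readers unfamiliar with the numeric scale, the SeverityNumber mapping described above can be sketched as follows. This is an illustration only, not hl's implementation: the function name is hypothetical and the level strings on the right are placeholders rather than hl's actual identifiers. The grouping in blocks of four (1-4 TRACE through 21-24 FATAL) follows the OTel Logs data model, and FATAL is folded into the error level as the description states.

```python
from typing import Optional

# Placeholder level names (assumption); FATAL (21-24) is folded into "error".
_LEVELS_BY_BLOCK = ["trace", "debug", "info", "warning", "error", "error"]


def level_from_severity_number(n: int) -> Optional[str]:
    """Map an OTel SeverityNumber (1..24) to a display level, or None if out of range."""
    if not 1 <= n <= 24:
        return None
    # Each severity name owns four consecutive numbers:
    # 1-4 TRACE, 5-8 DEBUG, 9-12 INFO, 13-16 WARN, 17-20 ERROR, 21-24 FATAL.
    return _LEVELS_BY_BLOCK[(n - 1) // 4]
```

Under these assumptions, `level_from_severity_number(9)` yields the info level and `level_from_severity_number(21)` yields the error level.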
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@           Coverage Diff           @@
##           master    #1395   +/-   ##
=======================================
  Coverage   90.05%   90.05%
=======================================
  Files          72       72
  Lines       12322    12322
=======================================
  Hits        11096    11096
  Misses       1226     1226
|
|
Thanks for the well-researched PR, @merlimat! I appreciate the effort and the thorough survey of peer tools. I like the idea of having a preset configuration file for OTel ( However, I'm not very keen on adding all the OTel fields to the default config out of the box. The main concern is field name collisions with existing formats. For example, Instead of expanding the default config to cover every format, I think a better direction would be:
Would you be interested in contributing the preset part ( |
Per review feedback on pamburus#1395, drop the additions to the default config (OTel field names in `time.names`/`message.names`/level variants) and keep only the new `etc/defaults/config-otel.toml` preset. The broader default-config change risked collisions (e.g., `body` is commonly used for HTTP payloads in many log formats) and is better pursued later via a sub-format preset/auto-detection mechanism. Tests that exercised the now-removed defaults change are dropped; the preset is still validated by `config::tests::test_load_otel`.
|
@pamburus Thanks for the thoughtful feedback! I've force-reverted the default-config additions and this branch now contains only |
|
Thanks! I have a question about the field name choices in The preset comment says it covers "OTLP JSON (PascalCase) and snake_case conventions", but according to the OTLP spec, the actual OTLP/JSON wire format uses lowerCamelCase keys (per proto3 JSON mapping rules). So a compliant OTLP/JSON exporter would produce fields like Could you clarify what specific log sources or exporters you had in mind when adding the PascalCase and snake_case variants? For example:
It would help to either document the supported sources more precisely in the config file comment, or align the field names with actual OTLP/JSON output ( |
|
The spec you linked is the part regarding OTLP (the OTel collector) with the protobuf and JSON formats. It's used as a transport protocol. For application log formats, here are:
|
Thanks for the clarification — you're right that the log data model defines PascalCase field names (ObservedTimestamp, SeverityText, etc.), and those are distinct from the OTLP wire format. However, I still don't see where the snake_case variants ( Also, if we're covering multiple conventions, shouldn't we include the lowerCamelCase variants ( |
Drop the snake_case / lowercase alternates from the OTel preset, keeping only the canonical OTLP-JSON field names. hl matches severity text values case-insensitively, so `info`/`info2` etc. continue to recognize the spec-mandated `INFO`/`INFO2`/... forms automatically.

Removed:
- time.names: `timestamp`, `observed_timestamp`
- logger.names: `scope.name`
- message.names: `body`
- SeverityText variant: `severity_text`
- SeverityNumber variant: `severity_number`

Kept (these are canonical OTel semconv attribute names, not snake_case alternates): `code.function`, `code.filepath`, `code.lineno`.
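The case-insensitive handling of severity text values mentioned above, including the numbered forms, could look roughly like this. It is a sketch under assumptions, not hl's code: the function name and the level strings are hypothetical, and the accepted suffixes (2 through 4) are taken from the `INFO2`..`FATAL4` range named in the PR description.

```python
import re
from typing import Optional

# Base severity names, optionally followed by a single digit 2-4 (e.g. INFO2, FATAL4).
_SEVERITY_TEXT = re.compile(r"(TRACE|DEBUG|INFO|WARN|ERROR|FATAL)([2-4])?", re.IGNORECASE)

# Placeholder level names (assumption); FATAL is folded into "error".
_LEVELS = {
    "TRACE": "trace",
    "DEBUG": "debug",
    "INFO": "info",
    "WARN": "warning",
    "ERROR": "error",
    "FATAL": "error",
}


def level_from_severity_text(text: str) -> Optional[str]:
    """Return a display level for a SeverityText value, ignoring letter case."""
    match = _SEVERITY_TEXT.fullmatch(text.strip())
    if match is None:
        return None  # not one of the recognized severity names
    return _LEVELS[match.group(1).upper()]
```

Only the value is normalized here. The field name itself (`SeverityText`) is a separate lookup and is not affected by this matching.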
|
Good point. I've removed anything other than the PascalCase field. |
time.names = ["Timestamp", "ObservedTimestamp"]
logger.names = ["InstrumentationScope.Name"]
message.names = ["Body"]
caller.names = ["code.function"]
It looks like OTel does not provide any standard for caller information.
So, using code.* field names in this configuration seems odd.
Wouldn't it be better to use TraceId in caller.names instead?
|
Could you please provide an example of a source of such logs? These examples do not match the configuration file, so I am confused. |
There are examples in the spec: |
|
Thanks for the pointer to OTEP 0097. I want to be upfront about a concern before I can move forward with this preset. The example records in OTEP 0097 § Example Log Records carry an explicit disclaimer immediately above them:
So that JSON shape is illustrative — it's documenting the data model, not a wire format. The actual encodings are defined elsewhere, and as far as I can tell none of them produce the flat PascalCase shape this preset targets:
Given that, I'd like to ask for one concrete thing before merging: a real producer that emits logs in the exact shape this preset matches — an SDK, logging library, exporter configuration, or framework you've actually seen output flat JSON with The reason I'm pressing on this: hl presets aren't just configuration, they're an implicit promise that "if you point hl at logs from $source, this preset will render them well." Without a real source, I can't review whether the field set, level mapping, and If the use case turns out to be "logs inspired by the OTel data model but not strictly OTLP" (e.g., an in-house logger someone wrote following the data model doc), that's fine — but then the preset's comment and filename should reflect that rather than implying OTLP/OTel-standard compatibility, so users aren't surprised when their actual OTLP/JSON logs don't match. Stepping back from the spec details for a moment — I'd like to understand the motivation behind this PR more concretely before going further. Could you share:
|
Summary
Adds an OpenTelemetry log preset (`etc/defaults/config-otel.toml`) alongside the existing `config-ecs.toml` / `config-k8s.toml`, so users can render OTLP-JSON logs in human-readable form via `--config etc/defaults/config-otel.toml` (or `HL_CONFIG=…`).

This is a scoped-down version of the original PR; see history below.
What's in the preset
Recognizes the canonical PascalCase OTel Logs Data Model field names (spec):
- time: `Timestamp`, `ObservedTimestamp`
- message: `Body`
- logger: `InstrumentationScope.Name`
- caller / caller-file / caller-line: `code.function` / `code.filepath` / `code.lineno` (OTel semconv)
- `SeverityText`: `TRACE` / `DEBUG` / `INFO` / `WARN` / `ERROR` / `FATAL` plus numbered forms `INFO2`, `WARN3`, `FATAL4`, … per § SeverityText. hl matches case-insensitively, so the uppercase spec forms are picked up automatically.
- `SeverityNumber`: 1–24 numeric mapping per § SeverityNumber.

`FATAL*` maps to `error` because hl has no `fatal` level.

Example
Input:
{"Timestamp":"2026-04-16T10:15:30.123Z","SeverityText":"INFO","SeverityNumber":9,"Body":"server started","service.name":"checkout-api","http.port":8080}
{"Timestamp":"2026-04-16T10:15:31.456Z","SeverityText":"WARN","SeverityNumber":13,"Body":"slow request","service.name":"checkout-api","duration_ms":1523}
{"Timestamp":"2026-04-16T10:15:32.000Z","SeverityText":"FATAL","SeverityNumber":21,"Body":"unrecoverable","service.name":"checkout-api"}
{"Timestamp":"2026-04-16T10:15:33.200Z","SeverityText":"INFO2","SeverityNumber":10,"Body":"heartbeat","service.name":"checkout-api"}

Rendered with `--config etc/defaults/config-otel.toml`:

When both `SeverityText` and `SeverityNumber` are present, `SeverityText` wins (matching the existing systemd `PRIORITY` behavior) and `SeverityNumber` falls through as a regular field. When only `SeverityNumber` is present, it becomes the level.

Scope changes from the original PR
The first commit on this branch also added the OTel field names to the default `etc/defaults/config.toml` so detection would be automatic. Per @pamburus's review, that was reverted to avoid collisions with non-OTel logs (e.g., `Body` is generic, `body` is commonly an HTTP payload field). The default config now stays untouched; only the preset is added.

A subsequent refactor also removed the `snake_case` alternates (`severity_text`, `body`, `observed_timestamp`, …) from the preset, keeping only the canonical OTLP-JSON PascalCase form. hl's case-insensitive matching applies to level values, not to field names, so consumers emitting `severity_text` would need to switch to `SeverityText` (or roll their own preset).

Test plan
- `cargo test`: `config::tests::test_load_otel` validates the preset parses correctly and asserts the predefined field names and both level variants.
- `cargo clippy --all-targets`: clean
- `cargo fmt --check`: clean
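As a reading aid for the precedence rule stated in this summary (`SeverityText` wins when both fields are present, `SeverityNumber` becomes the level only when it is alone), here is a minimal sketch. It is not hl's implementation: `resolve_level` and the level strings are hypothetical, and falling back to `SeverityNumber` when `SeverityText` holds an unrecognized value is an assumption the summary does not cover.

```python
import json
from typing import Any, Dict, Optional, Tuple

# Placeholder level names (assumption); FATAL is folded into "error".
TEXT_LEVELS = {"TRACE": "trace", "DEBUG": "debug", "INFO": "info",
               "WARN": "warning", "ERROR": "error", "FATAL": "error"}
NUMBER_LEVELS = ["trace", "debug", "info", "warning", "error", "error"]


def resolve_level(record: Dict[str, Any]) -> Tuple[Optional[str], Dict[str, Any]]:
    """Return (level, remaining fields) for one flat PascalCase record."""
    rest = dict(record)
    text = rest.get("SeverityText")
    if isinstance(text, str):
        key = text.upper()
        if key[-1:] in ("2", "3", "4"):  # numbered forms such as INFO2 or FATAL4
            key = key[:-1]
        if key in TEXT_LEVELS:
            del rest["SeverityText"]
            # SeverityNumber, if present, stays in `rest` as a regular field.
            return TEXT_LEVELS[key], rest
    number = rest.get("SeverityNumber")
    if isinstance(number, int) and not isinstance(number, bool) and 1 <= number <= 24:
        del rest["SeverityNumber"]
        return NUMBER_LEVELS[(number - 1) // 4], rest
    return None, rest


# Third record from the example input above.
line = ('{"Timestamp":"2026-04-16T10:15:32.000Z","SeverityText":"FATAL",'
        '"SeverityNumber":21,"Body":"unrecoverable","service.name":"checkout-api"}')
level, fields = resolve_level(json.loads(line))
```

For that record, `level` is the error level and `fields` still carries `SeverityNumber` as an ordinary field, which matches the behavior described in the summary.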