-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Otel trace groups #2
base: otel-trace-groups
Are you sure you want to change the base?
Otel trace groups #2
Conversation
Signed-off-by: Krishna Kondaka <[email protected]>
Provide dynamic templates for all attributes. Adds `ignore_above` to all key word fields. Use reasonable defaults for OTel data. Signed-off-by: Karsten Schnitter <[email protected]>
Provide dynamic templates for all attributes. Adds `ignore_above` to all key word fields. Use reasonable defaults for OTel data. Signed-off-by: Karsten Schnitter <[email protected]>
Provide dynamic templates for all attributes. Adds `ignore_above` to all key word fields. Use reasonable defaults for OTel data. Signed-off-by: Karsten Schnitter <[email protected]>
"dynamic_templates": [ | ||
{ | ||
"resource_attributes_map": { | ||
"integer_resource_attributes": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be named as long_resource_attributes
and similarly below as well
"droppedEventsCount": { | ||
"type": "integer" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be droppedAttributesCount
based on the OTel proto: https://github.com/open-telemetry/opentelemetry-proto/blob/d7770822d70c7bd47a6891fc9faacc66fc4af3d3/opentelemetry/proto/common/v1/common.proto#L80?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Integer is at 32bits but signed. The proto definition specifies an unsigned 32bit integer. Do you think this difference is significant? I find it hard to imagine there would be a scenario where that number of attributes is ever attached and dropped to a log/metric/span.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are just referring to the naming, right? I am going to correct that.
@@ -6,58 +6,129 @@ | |||
"_source": { | |||
"enabled": true | |||
}, | |||
"dynamic_templates": [ | |||
{ | |||
"integer_resource_attributes": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't this be long_resource_attributes
I think the question is already raised in other comments. can we address that in every signal type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WIll do.
"droppedEventsCount": { | ||
"type": "integer" | ||
} | ||
"droppedEventsCount": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't this be droppedEventsCount
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am confused. What do want to change? This is the name of the field. Only indentation seems to be off.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am going to change this to droppedAttributesCount
.
"ignore_above": 256, | ||
"type": "keyword" | ||
"type": "keyword", | ||
"ignore_above": 16 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we limiting to 16 chars here and 32 in traceId?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First, this is not a limitation of storage, just of the indexing. W3C Trace Context specifies the trace id to be an array of 16 bytes and the span id to be an array of 8 bytes. With hex-encoding this produces 32 and 16 characters respectively. Using B3 context would provide the same length. So it seems to be a reasonable choice.
} | ||
"serviceName": { | ||
"type": "alias", | ||
"path": "resource.attributes.service.name" | ||
}, | ||
"event": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need event here?
https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/logs/v1/logs.proto
We don't have anything in logs proto model.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a question I am wondering myself. This comes out of the Simple Schema for Observability. I kept it, since it was in the original file proposed by @kkondaka. I know this is not in the OpenTelemetry standard. The same is true for the data_stream
. How do we want to go about this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will remove this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I take it out in my PR to your branch.
@@ -6,58 +6,129 @@ | |||
"_source": { | |||
"enabled": true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you also add eventName which is missing: https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/logs/v1/logs.proto#L226
We have done some analysis on logs mapping here: opensearch-project/opensearch-catalog#217
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will add the field. I am going to take a look at the analysis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@KarstenSchnitter It looks like we need to bump up the version number to 3 (at least for spans). Otherwise new template mappings are not being sent to the OpenSearch server.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does this limitation come from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have we added this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, this could look much nicer, if I could use component templates. That way, I could use the same dynamic mappings on all three templates. Is there a way to do that with Data Prepper?
}, | ||
"flags": { | ||
"type": "integer" | ||
"type": "long" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/logs/v1/logs.proto#L178
Can we also add dropped_attributes_count in log
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is defined as fixed32
. Since this is interpreted as a bit field, I decided to use long
to not change representation, when the highest bit is set. Do you prefer using integer
, since only the lowest bits are currently in use?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am going to check, that for very attribute group there is a dropped attributes count. Thanks for pointing this out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I renamed all the droppedEventsCount
to droppedAttributesCount
.
"type": "keyword", | ||
"ignore_above": 2048 | ||
} | ||
} | ||
}, | ||
"stacktrace": { | ||
"type": "text" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
```
"@timestamp": {
"type": "date_nanos"
},
"time": {
"type": "date_nanos"
},
"observedTimestamp": {
"type": "date_nanos"
},
"observedTime": {
"type": "alias",
"path": "observedTimestamp"
},
```
We have four fields related to timestamp.
We have two in otel proto files, can we consolidate? If they are required why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was in the original file. The OTel specification only guarantees, that at least one of time
and observedTime
is set. It is not telling which one. I think, @timestamp
is supposed to always contain a timestamp for each message. Furthermore, it provides a common timestamp field for all three signals. The alias seems to be there for compatibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @timestamp
is opensearch specific. Correct me if I am wrong. And observedTime
is incorrect. I think observedTimestamp
is correct. I do not think my new code generates "@timestamp" and "observedTime". The mappings have "observedTime" as alias for "observedTimestamp", which may be removed but I think it's ok to keep it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Time
vs Timestamp
is essentially OpenTelemetry proto-files vs Logs Data Model. I am fine with either naming as long as it is consistent.
"properties": { | ||
"droppedEventsCount": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dropperAttributesCount
. Also can you validate the existing mapping with
https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/trace/v1/trace.proto
I see a lot of missing pieces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we did some analysis here: opensearch-project/opensearch-catalog#218
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am going to do that.
- renaming of droppedEventsCount to droppedAttributesCount - streamlined timestamps to include OTel proto defined times and on @timestamp for index ordering in OS - removed datastream attributes which are covered by dynamic template anyways - removed schemaUrl from logs and metrics but kept in scope and resource - added details on exemplars, links and events Signed-off-by: Karsten Schnitter <[email protected]>
I addressed the comments by @ps48 and @vamsimanohar. There are some open questions:
|
Signed-off-by: Karsten Schnitter <[email protected]>
} | ||
}, | ||
{ | ||
"default_resource_attributes": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@KarstenSchnitter Is this correct? Default attributes can be objects also, right? It looks like, here you are assuming them to be strings. I think it is possible for attributes to be nested like
"resource": {
"attributes" : {
"data" : {
"key1": "value1"
"key2": 2
},
"key3": "value3"
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to the path_match
this is applied to all nested fields. With the three dynamic mappings your example would have the values of key1 indexed as keyword, of key2 as log, and of key3 as keyword.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@KarstenSchnitter I am seeing this error with the mappings you provided -
WARN org.opensearch.dataprepper.plugins.sink.opensearch.BulkRetryStrategy - index = otel-v1-apm-span-000001, operation = Index, status = 400, error = object mapping for [resource.attributes.host.image.id] tried to parse field [host.image.id] as object, but found a concrete value
Any idea why this error is occurring?
The mappings from the server
dynamic_templates" : [
{
"long_resource_attributes" : {
"path_match" : "resource.attributes.*",
"mapping" : {
"type" : "long"
},
"match_mapping_type" : "long"
}
},
{
"double_resource_attributes" : {
"path_match" : "resource.attributes.*",
"mapping" : {
"type" : "double"
},
"match_mapping_type" : "double"
}
},
{
"default_resource_attributes" : {
"path_match" : "resource.attributes.*",
"mapping" : {
"type" : "object"
}
}
},
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have deleted the "default_resource_attributes", "default_scope_attributes" and "default_attributes" and then I was able to store the data and see the data using opensearch dashboards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a typical type mismatch in the error message. Note, that the default_resource_attributes
is not what I provided. It should have type keyword
. Do you have a procedure to reproduce this issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@KarstenSchnitter sorry, that was copy-paste mistake. Even with the correct mappings with "keyword", I am still seeing the error. For now, I am going to take out the default attributes section in all three and updated the PR. These changes are working fine for traces,logs and metrics. Please submit another PR separately (on top of this) for any changes you want to make after I merge these change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please do not take it out. Mapping to keywords is the essential default we need. Can you try with an additional "match_mapping_type": "string"
instead? I have not seen this problem during my testing. Can you provide the events you were testing with?
} | ||
}, | ||
{ | ||
"default_attributes": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment here. General attributes may also have nested attributes, right?
} | ||
"serviceName": { | ||
"type": "alias", | ||
"path": "resource.attributes.service.name" | ||
}, | ||
"event": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will remove this.
"type": "keyword", | ||
"ignore_above": 2048 | ||
} | ||
} | ||
}, | ||
"stacktrace": { | ||
"type": "text" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @timestamp
is opensearch specific. Correct me if I am wrong. And observedTime
is incorrect. I think observedTimestamp
is correct. I do not think my new code generates "@timestamp" and "observedTime". The mappings have "observedTime" as alias for "observedTimestamp", which may be removed but I think it's ok to keep it.
@@ -6,58 +6,129 @@ | |||
"_source": { | |||
"enabled": true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@KarstenSchnitter It looks like we need to bump up the version number to 3 (at least for spans). Otherwise new template mappings are not being sent to the OpenSearch server.
Events are not part of the OTel Logs Data Model [1]. The mapping for the SS4O fields has been removed. [1] https://opentelemetry.io/docs/specs/otel/logs/data-model/ Signed-off-by: Karsten Schnitter <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the event field from the logs mapping and answered some of the review comments. Timestamp names remain an open discussion.
} | ||
}, | ||
{ | ||
"default_resource_attributes": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to the path_match
this is applied to all nested fields. With the three dynamic mappings your example would have the values of key1 indexed as keyword, of key2 as log, and of key3 as keyword.
} | ||
"serviceName": { | ||
"type": "alias", | ||
"path": "resource.attributes.service.name" | ||
}, | ||
"event": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I take it out in my PR to your branch.
"type": "keyword", | ||
"ignore_above": 2048 | ||
} | ||
} | ||
}, | ||
"stacktrace": { | ||
"type": "text" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Time
vs Timestamp
is essentially OpenTelemetry proto-files vs Logs Data Model. I am fine with either naming as long as it is consistent.
}, | ||
"flags": { | ||
"type": "integer" | ||
"type": "long" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I renamed all the droppedEventsCount
to droppedAttributesCount
.
@@ -6,58 +6,129 @@ | |||
"_source": { | |||
"enabled": true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does this limitation come from?
@KarstenSchnitter ignore my comment about version being 3. Will keep it as 1. If there is an existing template with version 1 already present, it needs to be removed. Otherwise, we have to bump this version 2. |
ecdd704
to
076047f
Compare
Description
Adjustment of the mapping templates for the standard OTel format.
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.