
Added parsing otel json trace to span from kafka topic #5590

Open · wants to merge 5 commits into base: main
Conversation

@Mamol27 commented Apr 9, 2025

Description

Parsing of OpenTelemetry traces in JSON format into Span has been implemented to support trace handling via Kafka.
We had to implement this because the client requires all data to be sent exclusively through Kafka.
To support this, a new MessageFormat was added.

Issues Resolved

Resolves #5446

Check List

  • New functionality includes testing.
  • New functionality has a documentation issue. Please link to it in this PR.
    • New functionality has javadoc added
  • Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@kkondaka (Collaborator) commented

@Mamol27 Thank you for your submission. I am wondering whether it makes sense to use a "JSON" message format (because the input is JSON) and then have something like a "codec" that converts the JSON to an OTel trace.
Secondly, I am not sure you need JsonOtelTracePojo.java. That should be the default behavior when using OTelProtoStandardCodec.OTelProtoEncoder.convertToResourceSpans.

@KarstenSchnitter (Collaborator) left a comment

I think there is an easier way leveraging the protobuf JsonFormat. This can lead to more code reuse in the mapping of the OTel data, which would be desirable for the project, as it lowers the maintenance effort.

I can only recommend using the protobuf binary representation over JSON for storage in Kafka. Serialization and deserialization of protobuf have much higher throughput than JSON.
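A minimal sketch of the suggested approach, assuming protobuf-java-util and the opentelemetry-proto bindings are on the classpath (the class and method names here are illustrative, not from the PR):

```java
import java.io.IOException;

import com.google.protobuf.util.JsonFormat;
import io.opentelemetry.proto.trace.v1.ResourceSpans;

// Illustrative sketch: parse OTLP/JSON directly into the generated
// protobuf message instead of a hand-written POJO.
public class OtelJsonParsingSketch {

    public static ResourceSpans parse(String json) throws IOException {
        ResourceSpans.Builder builder = ResourceSpans.newBuilder();
        // JsonFormat maps lowerCamelCase OTLP/JSON field names onto the
        // proto fields; unknown fields are skipped rather than failing.
        JsonFormat.parser().ignoringUnknownFields().merge(json, builder);
        return builder.build();
    }
}
```

The resulting ResourceSpans could then flow through the same mapping the OTel trace source already uses, which is the code-reuse point made above.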

@@ -0,0 +1,201 @@
package org.opensearch.dataprepper.plugins.kafka.parcer;
Collaborator:

Probably a typo parcer -> parser

@Mamol27 (Author) commented Apr 14, 2025

Thank you


import java.util.List;

public class JsonOtelTracePojo {
Collaborator:

Why do you need a new class for the mapping?

Have you tried parsing the JSON via protobuf?

        ResourceSpans.Builder builder = ResourceSpans.newBuilder();
        JsonFormat.parser().merge(json, builder);

Author:

Thank you, I couldn’t find a standard parser.


private List<Span> parseSpans(String inputJson, Instant receivedTimeStamp) throws IOException {

ObjectMapper objectMapper = new ObjectMapper();
Collaborator:

Is it necessary to recreate the ObjectMapper on each call? Have a look at my comment in the JsonOtelTracePojo for an alternative. Basically, I think you could do this instead:

        ResourceSpans.Builder builder = ResourceSpans.newBuilder();
        JsonFormat.parser().merge(json, builder);

You can now use the builder to access all fields from the data. Essentially, this then requires the same mapping as in the trace source.

Furthermore, the protobuf approach would make it easy to consume protobuf binary data in a future extension. Replacing JSON with protobuf would provide considerably higher throughput.
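A sketch of that idea, reusing a single parser instance instead of constructing a mapper on every call (JsonFormat.Parser is immutable, so sharing it is safe; the class and method names are illustrative, not from the PR):

```java
import java.io.IOException;

import com.google.protobuf.util.JsonFormat;
import io.opentelemetry.proto.trace.v1.ResourceSpans;

// Illustrative sketch: hold one shared parser rather than creating a new
// ObjectMapper (or parser) on each parse call.
public class ReusableOtelParser {

    // JsonFormat.Parser is immutable and thread-safe, so one instance
    // can serve all invocations.
    private static final JsonFormat.Parser PARSER =
            JsonFormat.parser().ignoringUnknownFields();

    public ResourceSpans parseResourceSpans(String inputJson) throws IOException {
        ResourceSpans.Builder builder = ResourceSpans.newBuilder();
        PARSER.merge(inputJson, builder);
        return builder.build();
    }
}
```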

Author:

Unfortunately, when sending a protobuf message to Kafka, it gets split into multiple parts. Even after merging the parts back together and using the OTelTraceDecoder.parse method, I still received an error indicating an invalid format. Therefore, we decided to send traces through Kafka in JSON format instead.


Successfully merging this pull request may close these issues.

OTel Trace from kafka source with otel_trace_raw
3 participants