Skip to content

[DO NOT MERGE] profiles: more efficient and flexible original_payload handling.#786

Open
jhalliday wants to merge 2 commits intoopen-telemetry:mainfrom
jhalliday:profiles-t
Open

[DO NOT MERGE] profiles: more efficient and flexible original_payload handling.#786
jhalliday wants to merge 2 commits intoopen-telemetry:mainfrom
jhalliday:profiles-t

Conversation

@jhalliday
Copy link
Copy Markdown
Contributor

Move original_payload data to a dictionary table to avoid duplication when a single payload encodes to multiple Profiles.

@jhalliday
Copy link
Copy Markdown
Contributor Author

Placing original_payload_format within the Profile message is inefficient in the case where the original format contains multiple streams of data, i.e. converts into multiple OTel Profiles. This is the case when e.g. JFR captures CPU and memory use data concurrently. Likewise it is possible, though less common, that a single Profile combines data from multiple original files, such as where files for consecutive time periods are combined by batching. Since the payload can be large, it is desirable to ensure no unnecessary duplication of data on the wire.

Note that there are related issues not addressed here: loss of the original file name, and lack of any way to identify which data streams within a file correspond to which Profiles. These issues are out of scope for this change and more likely to be addressed by semantic conventions than by protocol format changes e.g. defining attribute keys that can be used for these purposes.

Potentially breaking changes to the [Alpha] protocol are deferred pending more usage testing and feedback, to avoid interoperability issues at this time. This change is [DO NOT MERGE] for now and will be batched up with any other breaking changes at a future time.

@jhalliday
Copy link
Copy Markdown
Contributor Author

@open-telemetry/profiling-maintainers

@aalexand
Copy link
Copy Markdown
Member

aalexand commented Apr 8, 2026

A bit unusual to have such a large value in the dictionary but makes sense overall.

@jhalliday
Copy link
Copy Markdown
Contributor Author

Switched to using OriginalDataSource, a new message type that allows each payload content to be accompanied by attributes. These can be used via semconv to e.g. associate a file name with the data.

Comment on lines +355 to +362
// The format(s) of each file referenced by the corresponding original_payload_indices.
// Indexes into ProfilesDictionary.string_table.
repeated int32 original_payload_format_stridx = 12;
// The original payload bytes. See also original_payload_format_stridx.
// Optional, but format and the bytes must be set or unset together.
// i.e. original_payload_format_stridx and original_payload_indices must be arrays of equal length.
// Indexes into ProfilesDictionary.original_payload_table.
repeated int32 original_payload_indices = 13;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there is a SemConv attribute dedicated to file formats, file.extension, I'm wondering if we can drop this field here.

Suggested change
// The format(s) of each file referenced by the corresponding original_payload_indices.
// Indexes into ProfilesDictionary.string_table.
repeated int32 original_payload_format_stridx = 12;
// The original payload bytes. See also original_payload_format_stridx.
// Optional, but format and the bytes must be set or unset together.
// i.e. original_payload_format_stridx and original_payload_indices must be arrays of equal length.
// Indexes into ProfilesDictionary.original_payload_table.
repeated int32 original_payload_indices = 13;
// The original payload bytes.
// Indexes into ProfilesDictionary.original_payload_table.
repeated int32 original_payload_indices = 12;

// The file or buffer content
bytes content = 1;
// References to attributes in ProfilesDictionary.attribute_table. [optional]
repeated int32 attribute_indices = 2;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we make file.extension and/or file.name a required attribute?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants