chore(traces): Add RFC for typed internal trace data model#25329
chore(traces): Add RFC for typed internal trace data model#25329bruceg wants to merge 4 commits into
Conversation
8490c1e to
492e304
Compare
There was a problem hiding this comment.
check-spelling found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cdae2cc44a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
492e304 to
22953b6
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 22953b6585
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
aaffd17 to
1a2edc5
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1a2edc5d4f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
1a2edc5 to
154f9c1
Compare
950d6c7 to
fe697cc
Compare
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fe697cc61b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
fe697cc to
08848d0
Compare
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 08848d06b0
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
08848d0 to
c6d5ea1
Compare
288c2a4 to
5a29737
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5a297373e8
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
5a29737 to
5179bba
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5179bbaf83
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
5179bba to
6c4a5e1
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6c4a5e1bf0
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
6c4a5e1 to
818ee94
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 818ee943c8
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
afc6bde to
c72e53b
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c72e53b786
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
c72e53b to
7e414c7
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7e414c714d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
7e414c7 to
ed66b4c
Compare
ed66b4c to
f057f0c
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f057f0c5d2
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
f057f0c to
cd6fe59
Compare
|
Since the document was getting huge and over-specific, I have broken the RFC down into three documents. The primary RFC specifies the data formats and primary plan of attack, and two sub-documents specify Datadog and OTLP ingress and egress concerns. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cd6fe59383
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c822695b90
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 202d6b9c50
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| VRL accesses an `Attributes` map through the existing `Value` API; the `VrlTarget` | ||
| boundary inspects `Value::Bytes` writes for UTF-8 to choose between `AttrValue::String` | ||
| and `AttrValue::Bytes`, so a VRL read-and-unchanged-write preserves the discriminator | ||
| whenever the bytes are valid UTF-8. Other conversions fall out of the `Value` |
There was a problem hiding this comment.
Preserve bytes discriminator on VRL rewrites
For an OTLP attribute that ingests as bytes_value but happens to contain valid UTF-8, a VRL read-and-unchanged-write cannot preserve the discriminator with this rule: the read surfaces only Value::Bytes, and the write side then chooses AttrValue::String based on UTF-8 because Value has no separate string/bytes provenance. A remap like assigning an attributes map back to itself would therefore re-emit string_value instead of the original bytes_value; the RFC needs either hidden provenance/Cow semantics for unchanged reads or a write rule that does not infer String from valid UTF-8 bytes.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
This reads like a little too much magic, where assigning the same string is somehow different behavior than any modification, even trivial ones. I'm not sure this is worth it.
| continue to decode. The per-component shim functions persist alongside the deprecated | ||
| proto as wire decoders -- dispatched on the `vector.trace_legacy_layout` hint (which | ||
| travels in `EventMetadata.value` inside `LegacyTrace`) -- so that decoded legacy records |
There was a problem hiding this comment.
Handle pre-hint legacy buffer records
For disk buffers or vector traffic written before the legacy-layout hint precursor ships, the decoded LegacyTrace will not contain vector.trace_legacy_layout, yet this deprecation-window path dispatches the only remaining shims solely on that hint after the Rust Legacy arm is removed. Those older buffered records would fail to materialize as typed traces even though this paragraph promises events written by older Vector instances continue to decode; please add a fallback for pre-hint records where possible or require draining/segregating buffers before the collapse step.
Useful? React with 👍 / 👎.
tessneau
left a comment
There was a problem hiding this comment.
couple questions but nothing blocking, this already feels quite thorough 👍
| Drop granularity is structural and uniform across sources: a zero `SpanLink.span_id` or | ||
| `SpanLink.trace_id` drops only the affected link (`Span.dropped_links_count` is incremented, a | ||
| counter is incremented, and a warning log identifies the drop); a zero `Span.trace_id` or | ||
| `Span.span_id` drops the enclosing span; a candidate `TraceEvent` whose every span was rejected is |
There was a problem hiding this comment.
detail -- should we have telemetry for span drops as well?
| well-formed `TraceEvent`, so consumers do not need to discriminate between `None` and a | ||
| present-but-empty value at the typed surface. Every construction site normalizes an | ||
| empty input to `None`; the per-format applications are specified in the corresponding | ||
| sub-RFC. |
There was a problem hiding this comment.
is the hidden normalization worth it vs just passing these through? if the DD and OTLP specs explicitly set or do not set these fields then could there be reason for a field to be an empty string and we should just pass this through? Or at least have the observability that it was the producer which set this to empty and not vector
| - **`Vec` element** (`.spans[i]`, `.spans[i].events[j]`, `.spans[i].links[j]`): `del()` | ||
| removes the i-th / j-th element; the vector shrinks and subsequent indices renumber. | ||
| - **Required typed sub-field with a representable default value**: `del()` is | ||
| equivalent to writing the sub-field's default value. Examples: |
There was a problem hiding this comment.
I'm tempted to handle these required fields the same as the rest -- with a VRL runtime error and users could still write a different value. It feels easier to me to think about over del() being an implicit reset, and the defaults don't seem ambiguous, besides unset perhaps.
| (`priority` / `origin` / `droppedTrace` / `tags`, `spans[]`). | ||
| - [`span.proto`](https://github.com/DataDog/datadog-agent/blob/main/pkg/proto/datadog/trace/span.proto) | ||
| -- the per-span shape (`service`, `name`, `resource`, `traceID`, `spanID`, | ||
| `parentID`, `start`, `duration`, `error`, `meta`, `metrics`, `type`, `meta_struct`). |
There was a problem hiding this comment.
| `parentID`, `start`, `duration`, `error`, `meta`, `metrics`, `type`, `meta_struct`). | |
| `parentID`, `start`, `duration`, `error`, `meta`, `metrics`, `type`, `meta_struct`, `spanEvents`, `spanLinks`). |
Summary
Sets up a comprehensive RFC documenting a proposed strongly-typed trace data model for Vector, including semantically-lossless conversions to and from OLTP and Datadog APM formats.
This is a different path than #20170 which explicitly takes into account the Datadog-Vector-Datadog path rather than just adopting one format or another.
Rendered:
Vector configuration
N/A
How did you test this PR?
N/A
Change Type
Is this a breaking change?
Does this PR include user facing changes?
no-changeloglabel to this PR.References
opentelemetrysource and sink #1444datadog_agentincorrectly castsu64values from the Agent asi64in order to fit into theValueenum. #14687Notes
@vectordotdev/vectorto reach out to us regarding this PR.pre-pushhook, please see this template.make fmtmake check-clippy(if there are failures it's possible some of them can be fixed withmake clippy-fix)make testgit merge origin masterandgit push.Cargo.lock), pleaserun
make build-licensesto regenerate the license inventory and commit the changes (if any). More details on the dd-rust-license-tool.