Skip to content

Resume HTTP/2 traceparent scan after invalid candidate#2519

Open
MrAlias wants to merge 3 commits into
open-telemetry:mainfrom
MrAlias:fix-http/2-traceparent-candidate-validation
Open

Resume HTTP/2 traceparent scan after invalid candidate#2519
MrAlias wants to merge 3 commits into
open-telemetry:mainfrom
MrAlias:fix-http/2-traceparent-candidate-validation

Conversation

@MrAlias

@MrAlias MrAlias commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Motivation

  • The HTTP/2 HPACK detector previously stopped at the first name/value-length fingerprint match and validated only that candidate.
  • When that first candidate looked like traceparent but had an invalid value, OBI created and injected a new traceparent instead of continuing to a later valid one.
  • Continuing after malformed candidates prevents duplicate traceparent injection while still adopting an existing valid context when one is present later in the HPACK block.

Description

  • Initialize the per-frame HPACK candidate scan offset when starting detection for a HEADERS frame.
  • Retry malformed HPACK traceparent candidates by advancing past the full literal entry before scanning again:
    • k_h2_tp_hpack_size for plain traceparent names.
    • k_h2_tp_hpack_huffman_size for Huffman-encoded traceparent names.
  • Count malformed-candidate retries against the existing HTTP/2 per-packet tail-call budget, preserving room for create/write tail calls.
  • Keep the retry path verifier-friendly by shifting the pulled HPACK window for each retry instead of making the 192-byte scan loop skip to a saved scalar offset.
  • Validate a found candidate from its pulled candidate window directly, so packet access stays at fixed offsets and avoids verifier state explosion on arm64.

Testing

  • clang-format --dry-run --Werror bpf/tpinjector/tpinjector.c
  • make generate

@MrAlias MrAlias requested a review from a team as a code owner June 26, 2026 17:01
@MrAlias MrAlias added bug Something isn't working ebpf Issues or PRs that primarily require eBPF program changes area: tracing Trace context, span construction, and trace attribute behavior labels Jun 26, 2026
@codecov

codecov Bot commented Jun 26, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 69.12%. Comparing base (914ffa4) to head (cf6e1ff).
⚠️ Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2519      +/-   ##
==========================================
- Coverage   69.24%   69.12%   -0.13%     
==========================================
  Files         345      345              
  Lines       46747    46795      +48     
==========================================
- Hits        32372    32348      -24     
- Misses      12330    12413      +83     
+ Partials     2045     2034      -11     
Flag Coverage Δ
integration-test 50.35% <ø> (-0.57%) ⬇️
integration-test-arm 27.27% <ø> (-1.04%) ⬇️
integration-test-vm-5.15-lts 27.79% <ø> (-0.04%) ⬇️
integration-test-vm-6.18-lts 29.28% <ø> (+2.03%) ⬆️
k8s-integration-test 35.73% <ø> (+0.03%) ⬆️
oats-test 35.68% <ø> (+0.32%) ⬆️
unittests 63.36% <ø> (+0.04%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@MrAlias

This comment was marked as resolved.

MrAlias added 2 commits June 26, 2026 10:23
Scan retries now shift the HPACK pull window instead of teaching the
candidate finder to skip a saved offset inside its 192-byte loop. That
keeps packet reads at fixed loop offsets and avoids the verifier state
explosion seen on arm64.

Validate candidates from a pulled candidate window as well, preserving
the malformed traceparent retry behavior while removing another
scalar-target walk.
Clarify that h2_tp_candidate_pos carries either the next HPACK
retry scan start or the found candidate offset that validation should
use. The old wording only described the scan side of the state.
@MrAlias

MrAlias commented Jun 26, 2026

Copy link
Copy Markdown
Contributor Author

@rafaelroquetto please take a look at this one. I think this should address the issue, but not sure if my workaround the verifier limitations is optimal.

@grcevski grcevski left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! This looks great!

I originally thought that finding invalid TP when we see 'traceparent' would be very rare and questionable on what we should do, I realised though that this infrastructure you added will help us what we want to do next, find traceparents when they are hpack encoded. In those situations, we will likely need to loop further with failed validation. So I think this is pretty cool.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: tracing Trace context, span construction, and trace attribute behavior bug Something isn't working ebpf Issues or PRs that primarily require eBPF program changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants