Issue #2689: Unsatisfactory behavior of snapshot selector by PeterF778 · Pull Request #2697 · signalfx/splunk-otel-java

PeterF778 · 2026-03-10T21:16:20Z

Change the snapshot profiling selection algorithm to be based exclusively on the trace ID. Remove the concepts of snapshot Volume, SnapshotVolumePropagator, and ProbabilisticSnapshotSelector. Update the unit tests.
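
For readers unfamiliar with the approach, here is a minimal sketch of what a trace-ID-based selection can look like. It is not the code from this PR: the class name `TraceIdSelectorSketch`, the method `shouldSnapshot`, and the choice of the lower 64 bits of the trace ID as the random part are all assumptions made for illustration.

```java
public final class TraceIdSelectorSketch {
  private TraceIdSelectorSketch() {}

  /**
   * Hypothetical helper: decides whether a trace is selected for snapshot
   * profiling, using only the trace ID and a configured probability.
   */
  public static boolean shouldSnapshot(String traceId, double probability) {
    if (traceId == null || traceId.length() != 32) {
      return false; // not a 32-character hex trace ID
    }
    if (probability <= 0.0) {
      return false;
    }
    if (probability >= 1.0) {
      return true;
    }
    long randomPart;
    try {
      // Assumption: treat the last 16 hex characters as the random part and
      // drop one bit so the value is a non-negative 63-bit number.
      randomPart = Long.parseUnsignedLong(traceId.substring(16), 16) >>> 1;
    } catch (NumberFormatException e) {
      return false; // not valid hex
    }
    // Select the trace when its random part falls below the scaled threshold.
    long threshold = (long) (probability * Long.MAX_VALUE);
    return randomPart < threshold;
  }

  public static void main(String[] args) {
    String traceId = "0123456789abcdef4000000000000000";
    // The decision is a pure function of the trace ID and the probability,
    // so repeated calls always agree.
    System.out.println(shouldSnapshot(traceId, 0.5));
    System.out.println(shouldSnapshot(traceId, 0.5));
  }
}
```

Because the decision depends only on the trace ID and the configured probability, no per-request state has to be carried between services in this sketch.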

github-actions · 2026-03-10T21:16:32Z

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

PeterF778 · 2026-03-10T21:25:17Z

recheck

breedx-splk

Wow thanks, this is sooooo much nicer! 🏆

...r/src/main/java/com/splunk/opentelemetry/profiler/snapshot/TraceIdBasedSnapshotSelector.java

laurit · 2026-03-13T14:08:29Z

profiler/src/main/java/com/splunk/opentelemetry/profiler/snapshot/SnapshotVolumePropagator.java

-   * capable agents can make the same snapshotting decision, if necessary.
-   */
-  @Override
-  public <C> Context extract(Context context, C carrier, TextMapGetter<C> getter) {


My understanding is that previously the decision whether to profile was propagated from the upstream service using the volume baggage entry. So the first called service decided in some way whether to profile, and the other services followed that decision.
After these changes, every service will decide independently whether to profile. If they use the same algorithm with the same probability, they'll reach the same decision. If they use a different probability, they may reach a different decision. Do we need to confirm with someone that it is OK to remove the behavior where the first service decides whether to profile?
Second, if we remove it now and replace the selection algorithm, services running old and new code will not reach the same profiling decision. Is this OK?
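
To make the first concern concrete, here is a small sketch of two services deciding independently from the same trace ID. It is an illustration only: `IndependentDecisionSketch`, `decide`, and `agree` are hypothetical names, and the threshold comparison on the lower 64 bits of the trace ID is an assumed algorithm, not necessarily the one in this PR.

```java
public final class IndependentDecisionSketch {
  private IndependentDecisionSketch() {}

  /** Hypothetical per-service decision based only on trace ID and probability. */
  public static boolean decide(String traceId, double probability) {
    // Assumption: the last 16 hex characters act as the random part;
    // shifting by one bit keeps the value non-negative.
    long randomPart = Long.parseUnsignedLong(traceId.substring(16), 16) >>> 1;
    long threshold = (long) (probability * Long.MAX_VALUE);
    return randomPart < threshold;
  }

  /** True when two services with the given probabilities reach the same decision. */
  public static boolean agree(String traceId, double probabilityA, double probabilityB) {
    return decide(traceId, probabilityA) == decide(traceId, probabilityB);
  }

  public static void main(String[] args) {
    String traceId = "0123456789abcdef4000000000000000";
    // Same algorithm, same probability: the services always agree.
    System.out.println("same probability agrees: " + agree(traceId, 0.5, 0.5));
    // Different probabilities: this trace falls under the 0.5 threshold
    // but not under the 0.1 threshold, so the services disagree.
    System.out.println("0.5 vs 0.1 agrees: " + agree(traceId, 0.5, 0.1));
  }
}
```

With a threshold scheme like this one, any trace selected at the lower probability is also selected at the higher one, so services configured differently disagree only on traces that fall between the two thresholds.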

PeterF778 · Mar 16, 2026

One important detail is that, while the original design intended to propagate the profiling decision from upstream, it did not actually work. See the results of my testing quoted in the ticket. So I do not think we need to worry about "breaking" that behavior, because it was already broken.

Furthermore, I do not think the intention of the old design was appropriate. I do not see a use case for profiling downstream services because an upstream service said so. It looks to me like that design was copied from the AppD agent, which does send a correlation token downstream asking for a snapshot to be taken. But that was different. The reason for taking such snapshots in AppD was to preserve information about a particular transaction instance (request); normally the AppD agent sends summaries only. In OTel, there's no need for that, as this functionality comes free with every request/trace.

laurit · Mar 17, 2026

Thanks for the explanation.

…ector
# Conflicts:
#   profiler/src/test/java/com/splunk/opentelemetry/profiler/snapshot/SnapshotProfilingConfigurationCustomizerProviderTest.java
#   profiler/src/test/java/com/splunk/opentelemetry/profiler/snapshot/SnapshotVolumePropagatorComponentProviderTest.java

robsunday · 2026-03-19T09:43:52Z

This PR cannot be merged because one of the commits is not signed.

Issue signalfx#2689: Unsatisfactory behavior of snapshot selector

c6e81e6

Change the algorithm for snapshot profiling selection to be exclusively based on trace-id. Removing the concepts of snapshot Volume, SnapshotVolumePropagator, and ProbabilisticSnapshotSelector. Updating unit tests.

PeterF778 requested review from a team as code owners March 10, 2026 21:16

PeterF778 mentioned this pull request Mar 10, 2026

Unsatisfactory behavior of snapshot selector #2689

Open

breedx-splk approved these changes Mar 11, 2026

robsunday approved these changes Mar 12, 2026

PeterF778 and others added 2 commits March 12, 2026 11:48

Update profiler/src/main/java/com/splunk/opentelemetry/profiler/snaps…

5233166

…hot/TraceIdBasedSnapshotSelector.java Co-authored-by: jason plumb <75337021+breedx-splk@users.noreply.github.com>

Update profiler/src/main/java/com/splunk/opentelemetry/profiler/snaps…

f2a1392

…hot/TraceIdBasedSnapshotSelector.java Co-authored-by: jason plumb <75337021+breedx-splk@users.noreply.github.com>

laurit reviewed Mar 13, 2026

laurit approved these changes Mar 17, 2026

robsunday mentioned this pull request Mar 19, 2026

Issue #2689: Unsatisfactory behavior of snapshot selector [recreated] #2717

Open

robsunday closed this Mar 19, 2026

github-actions bot locked and limited conversation to collaborators Mar 19, 2026

Issue #2689: Unsatisfactory behavior of snapshot selector #2697
PeterF778 wants to merge 4 commits into signalfx:main from PeterF778:2689_Unsatisfactory_behavior_of_snapshot_selector

4 participants
