
Export Snapshot Profiling Stack Traces As OpenTelemetry Log Messages#2237

Merged
laurit merged 32 commits into signalfx:main from tduncan:port-profile-log-export
Mar 25, 2025

Conversation

@tduncan (Contributor) commented Mar 13, 2025:

This PR plugs in the log exporting functionality of the snapshot profiler which completes the end-to-end extension data flow. The full extension sequence is now:

  1. Trace is selected for snapshotting
  2. Start of root span of trace is detected
  3. Profiling for trace thread starts
  4. Collected stack traces are staged in the AccumulatingStagingArea
  5. End of root span of trace is detected
  6. Profiling of trace thread stops
  7. Stack traces for trace are emptied from StagingArea to StackTraceExporter
  8. Stack traces are converted into the pprof data format
  9. Otel LogRecordData is created using pprof as log body
  10. Otel LogRecordData is emitted to the configured Otel LogRecordProcessor
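The handoff in steps 4 through 10 can be sketched roughly as follows. The class and method names here are simplified stand-ins for the PR's AccumulatingStagingArea and StackTraceExporter, not the actual signatures:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal stand-ins for the PR's types; real signatures may differ.
class PipelineSketch {
  record StackTrace(String traceId, String frames) {}

  interface StackTraceExporter {
    // Steps 8-10: convert to pprof and emit as an Otel log record.
    void export(List<StackTrace> stackTraces);
  }

  // Steps 4 and 7: stack traces accumulate per trace id until the root span ends.
  static class StagingArea {
    private final Map<String, List<StackTrace>> staged = new ConcurrentHashMap<>();
    private final StackTraceExporter exporter;

    StagingArea(StackTraceExporter exporter) {
      this.exporter = exporter;
    }

    void stage(StackTrace stackTrace) {
      staged.computeIfAbsent(stackTrace.traceId(), id -> new ArrayList<>()).add(stackTrace);
    }

    // Called when the end of the trace's root span is detected (step 5).
    void empty(String traceId) {
      List<StackTrace> batch = staged.remove(traceId);
      if (batch != null) {
        exporter.export(batch);
      }
    }
  }
}
```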

The configuration of the Otel Logger required adding an additional AgentListener to obtain the Resource, which in turn necessitated a 2-step initialization and an additional abstraction layer within AccumulatingStagingArea.

The need to use the AgentListener callback also required further simulation of the agent startup process. This was done by updating OpenTelemetrySdkExtension to support adding AgentListeners and calling into them before a test runs. Doing so required OpenTelemetrySdkExtension to extend AutoConfiguredOpenTelemetrySdk and move into the shadow package so the abstract methods could be implemented, along with renaming the builder function to configure so as to not conflict with the builder function already defined.

@tduncan tduncan requested review from a team as code owners March 13, 2025 18:21
annotationProcessor("com.google.auto.service:auto-service")
compileOnly("com.google.auto.service:auto-service")

testImplementation(project(":custom"))
tduncan (Contributor Author) commented Mar 13, 2025:

I needed access to the AutoConfigureUtil class defined in this subproject. It is used in StackTraceExporterActivator to access the ConfigProperties and Resource fields of the AutoConfiguredOpenTelemetrySdk when configuring the Otel Logger.

The compileOnly access already defined did not allow me to test that the configuration was taking place.


public static boolean isSnapshotProfilingEnabled(ConfigProperties properties) {
  return properties.getBoolean(CONFIG_KEY_ENABLE_SNAPSHOT_PROFILER, false);
}
tduncan (Contributor Author):

A second use was added in StackTraceExporterActivator, so I thought it made sense to roll this into the Configuration class.

tduncan (Contributor Author):

Adapted from code in JfrActivator. I intentionally didn't modify the existing profiling code to minimize the scope of this PR.

tduncan (Contributor Author):

Very similar to the existing Pprof class, mostly QoL/DX changes. I decided to port the code as is, duplication and all, due to timeline constraints. Once the extension functionality is in place, my intention is to figure out how to unify the two slightly different strategies used by the continuous profiler and the snapshot profiler.

tduncan (Contributor Author):

Similar to PprofCpuEventExporter though slightly simpler due to guarantees that thread and trace information will be available.

tduncan (Contributor Author):

Sets up the StackTraceExporter instance so that collected stack traces can be exported as Otel logs. If this step doesn't happen, traces may be profiled but the data will be dropped rather than exported.

tduncan (Contributor Author):

This class acts as a middleman between AccumulatingStagingArea and the StackTraceExporter. It's necessary because the StagingArea is instantiated during the SDK configuration phase (see updates in SnapshotProfilingSdkCustomizer) but the Otel Logger can't be configured until later in the agent startup process (see the newly added StackTraceExporterActivator).

To insulate the AccumulatingStagingArea from this 2-step initialization requirement it uses this Supplier to obtain a StackTraceExporter, always receiving a no-op implementation if the initialization process has yet to complete.
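A minimal sketch of that two-step pattern, with hypothetical names: the staging area always dereferences the supplier, so it transparently picks up the real exporter once activation completes and silently drops data before that.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

class TwoStepInit {
  interface StackTraceExporter {
    void export(List<String> stackTraces);

    // Drops all data; used until the activator installs the real exporter.
    StackTraceExporter NOOP = stackTraces -> {};
  }

  // Starts as no-op; swapped once the Resource is available late in startup.
  static final AtomicReference<StackTraceExporter> EXPORTER =
      new AtomicReference<>(StackTraceExporter.NOOP);

  // The staging area is constructed early with this supplier and never needs
  // to know whether activation has happened yet.
  static final Supplier<StackTraceExporter> SUPPLIER = EXPORTER::get;

  // Called by the activator once the Otel Logger can be configured.
  static void activate(StackTraceExporter real) {
    EXPORTER.set(real);
  }
}
```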

tduncan (Contributor Author) commented Mar 13, 2025:

Component test of the extension's ability to export logs. Selects a trace for snapshotting, starts and stops a root span, and verifies that a log record lands in the configured Otel LogRecordExporter. Utilizes the additional simulation of an agent startup in OpenTelemetrySdkExtension.

tduncan (Contributor Author):

The primary change in this class is the ability to attach AgentListener instances and simulate that portion of the agent startup process. Doing this requires an AutoConfiguredOpenTelemetrySdk instance, which is difficult to obtain on a per-test basis. The easiest approach was to subclass AutoConfiguredOpenTelemetrySdk, which unfortunately required moving the entire JUnit extension into this shadow package so two of the abstract methods could be implemented.
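The listener-replay idea can be illustrated without the JUnit and SDK machinery; names here are illustrative stand-ins, and the real extension works against AutoConfiguredOpenTelemetrySdk rather than a plain Object:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: the extension collects listeners up front and replays the
// agent-startup callback before each test runs.
class SdkExtensionSketch {
  interface AgentListener {
    void afterAgent(Object autoConfiguredSdk);
  }

  private final List<AgentListener> listeners = new ArrayList<>();

  // Mirrors the builder-style registration the PR adds to the extension.
  SdkExtensionSketch with(AgentListener listener) {
    listeners.add(listener);
    return this;
  }

  // What the JUnit beforeEach callback would do: simulate agent startup
  // by invoking every registered listener with the configured SDK.
  void beforeEach(Object sdk) {
    for (AgentListener listener : listeners) {
      listener.afterAgent(sdk);
    }
  }
}
```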

tduncan (Contributor Author) commented Mar 13, 2025:

There is obviously a lot of overlap between code added in this PR and existing code for the continuous profiler. My initial aim was to unify the two approaches by adapting the new snapshot profiling code to use the existing constructs, however as I started down that path it became clear there were enough nuances between the implementations that there wasn't a straightforward way to reconcile the differences. Given that I decided the safest thing to do is port the snapshot profiling log exporting code more-or-less as is and attempt the pprof translation and log data export unification effort later. Also looming is a fairly tight timeline for making this functionality available for demos.

* which is itself adapted from Google's Bazel build system <a
* href=https://github.com/bazelbuild/bazel/blob/master/src/main/java/com/google/devtools/build/lib/profiler/memory/AllocationTracker.java>here</a>.
*/
class Pprof {
Collaborator:

To me this copy paste does not make sense. Have a look at https://github.com/signalfx/splunk-otel-java/pull/1306/files#diff-ca57673d0f9c0ab8a63ac33abeb7aa277d52fce7a74774eb6e2ace458c01c88e Would it be complicated to use a similar approach? If it helps, I guess we could consider merging that PR (needs approval from @breedx-splk) or you could copy from that PR. Note that the gdi spec dictates the values for missing stack trace items (line numbers, method, class, and file names).

tduncan (Contributor Author):

I tried to explain here and here why I went this way.

The longer version is that I was tinkering for a few hours on how to reconcile the differences between the StackTrace defined in the snapshot package with the StackTrace defined in StackTraceParser used in PprofCpuEventExporter. The two data classes are very similar with the primary differences being the osThreadId field and how the stack frames are represented (StackTraceLine vs the built-in StackTraceElement). The approach I was exploring was to define StackTrace and StackFrame interfaces with the relevant accessor methods.

The next obstacle was around the building and serialization of the profiling data. PprofCpuEventExporter maintains a mutable Pprof instance that it resets after being flushed, whereas the design of the snapshot profiling extension does not maintain any mutable state during the log export process. The location of the serialization code also differs: it is found in the Pprof class itself vs in the AsyncLogExporter for snapshots.

There's also a minor DX change in the snapshot profiling Pprof version, where label creation and adding labels to Samples are separated into two distinct operations and pushed up a level into the PprofTranslator, as well as the addition of a few convenience methods that accept a StackTraceElement. Ex:

// Always On
...
if (stackTrace.getThreadId() != 0) {
  pprof.addLabel(sample, THREAD_ID, stackTrace.getThreadId());
  pprof.addLabel(sample, THREAD_NAME, stackTrace.getThreadName());
}
...
for (StackTraceParser.StackTraceLine stl : stackTrace.getStackTraceLines()) {
  sample.addLocationId(
      pprof.getLocationId(
          stl.getLocation(), stl.getClassName(), stl.getMethod(), stl.getLineNumber()));
  pprof.incFrameCount();
}

// Snapshot Profiling
...
sample.addLabel(pprof.newLabel(THREAD_ID, stackTrace.getThreadId()));
sample.addLabel(pprof.newLabel(THREAD_NAME, stackTrace.getThreadName()));
...
for (StackTraceElement stackFrame : stackTrace.getStackFrames()) {
  sample.addLocationId(pprof.getLocationId(stackFrame));
  pprof.incFrameCount();
}

Finally, I also thought it was a better idea to tackle the unification of the two stack trace data models and their translation into the pprof data format separately so as to not unintentionally break the existing Always On profiling extension while the snapshot profiling extension is being ported.

tduncan (Contributor Author):

Also see here for another place where I intentionally did not modify the existing profiling code to minimize the scope of this PR.

Collaborator:

why not something like tduncan#1

tduncan (Contributor Author):

@laurit, thanks for doing this! I've merged that PR into the branch, added tests, and also added new configuration properties to allow control over both the stack depth and sampling period of the snapshot profiler.

@@ -93,10 +97,12 @@ public void run() {
Instant now = Instant.now();
try {
ThreadInfo threadInfo = threadMXBean.getThreadInfo(threadId, MAX_ENTRY_DEPTH);
Collaborator:

In the non-snapshot profiler there is a configuration option for this

public static int getStackDepth(ConfigProperties config) {
One thing you need to handle is that for stack traces that are truncated you need to set an extra attribute. For this to be possible you may need to set a slightly larger limit here and manually truncate later.
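One way to implement that suggestion, sample one frame deeper than the configured depth and truncate manually, might look like this. The names are illustrative, not the PR's API:

```java
import java.util.Arrays;

class TruncationSketch {
  record Sampled(StackTraceElement[] frames, boolean truncated) {}

  // Caller asks the ThreadMXBean for maxDepth + 1 frames. If we received more
  // than maxDepth, the real stack was deeper, so we keep maxDepth frames and
  // flag the sample so the truncated attribute can be set on export.
  static Sampled capture(StackTraceElement[] raw, int maxDepth) {
    boolean truncated = raw.length > maxDepth;
    StackTraceElement[] kept = truncated ? Arrays.copyOf(raw, maxDepth) : raw;
    return new Sampled(kept, truncated);
  }
}
```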

import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

class AccumulatingStagingArea implements StagingArea {
Collaborator:

Here you collect stack traces per trace id and export them when the trace is done. To me the staging area abstraction doesn't seem ideal. When there is a long request you end up accumulating many stack traces; when the request is short you export a small batch with only one or two traces. When there are multiple concurrent short requests you'll export many small batches. With the non-snapshot profiler we already get the input data in batches and the interval between the batches is large, so we don't need to handle this.

Did you consider alternative solutions, for example what is used in https://github.com/open-telemetry/opentelemetry-java/blob/main/sdk/trace/src/main/java/io/opentelemetry/sdk/trace/export/BatchSpanProcessorBuilder.java? Place data in a queue; if the queue has batch-size items (there it is 512) export them, otherwise wait for a bit (there 5s) and export whatever is in the queue. If the queue grows too large, drop data.

tduncan (Contributor Author):

This sounds like a very good idea. Would you be against doing this as a followup (this PR is a partial blocker to adding span correlation)?

I don't think a time-based approach is necessarily right, however. What I'm envisioning is a BufferingStagingArea with some maximum number of stack traces before it automatically empties itself, in addition to the forced empty upon detecting the entry span ending. I don't know what that number should be; maybe 20?

This would control how many stack traces are bundled into a single log record. The frequency of log record exporting is separately controlled by the LogRecordProcessor configured in OtelLoggerFactory. Right now it wires up a SimpleLogRecordProcessor, which I believe immediately exports the log record. We could use the BatchLogRecordProcessor instead, though I've had issues getting this to work in the past.
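A size-triggered variant of that idea could look roughly like this. BufferingStagingArea is hypothetical and the threshold is the placeholder number under discussion, not a settled value:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Flushes automatically whenever the buffer reaches maxStaged, plus on an
// explicit flush() when the entry span ends. Coarse synchronization for brevity.
class BufferingStagingArea {
  private final int maxStaged;
  private final Consumer<List<String>> exporter;
  private List<String> buffer = new ArrayList<>();

  BufferingStagingArea(int maxStaged, Consumer<List<String>> exporter) {
    this.maxStaged = maxStaged;
    this.exporter = exporter;
  }

  synchronized void stage(String stackTrace) {
    buffer.add(stackTrace);
    if (buffer.size() >= maxStaged) {
      flush();
    }
  }

  synchronized void flush() {
    if (!buffer.isEmpty()) {
      List<String> batch = buffer;
      buffer = new ArrayList<>();
      exporter.accept(batch);
    }
  }
}
```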

Collaborator:

Would you be against doing this as a followup

sure

I don't think a time-based approach is necessarily right, however. What I'm envisioning is a BufferingStagingArea with some maximum number of stack traces before it automatically empties itself, in addition to the forced empty upon detecting the entry span ending. I don't know what that number should be; maybe 20?

Flushing when there have been no new stack traces for some time isn't that hard. When you write stacks from all ongoing traces into the same pprof, flushing on ending the entry span doesn't make sense. 20 is imo too low; 2000 or even larger would be better. I think that determining the batch size requires some experimentation.

This would control how many stack traces are bundled into a single log record. The frequency of log record exporting is separately controlled by the LogRecordProcessor configured in OtelLoggerFactory. Right now it wires up a SimpleLogRecordProcessor, which I believe immediately exports the log record. We could use the BatchLogRecordProcessor instead, though I've had issues getting this to work in the past.

Using BatchLogRecordProcessor here isn't ideal because pprof works best when you put lots of stacks into the same batch. When the same string is used multiple times then the shared string pool will help in reducing the transfer size. We need to manage creating these batches ourself to not export too often so we really don't gain much from using the BatchLogRecordProcessor.
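The string-pool effect mentioned here comes from pprof interning every distinct string once in a table and referencing it by index thereafter; a toy illustration of the mechanism:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy version of pprof's string table: each distinct string is stored once
// and every later use is just a small integer index, which is why large
// batches that repeat class and method names shrink the transfer size.
class StringTable {
  private final Map<String, Integer> index = new HashMap<>();
  private final List<String> strings = new ArrayList<>();

  StringTable() {
    intern(""); // pprof reserves index 0 for the empty string
  }

  int intern(String s) {
    return index.computeIfAbsent(s, k -> {
      strings.add(k);
      return strings.size() - 1;
    });
  }

  int size() {
    return strings.size();
  }
}
```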

@breedx-splk (Contributor):

Building an implementation of OpenTelemetry for testing is wild. Without having read this code yet, my instinct is that it's a mistake or at least a misstep.

private static final java.util.logging.Logger logger =
    java.util.logging.Logger.getLogger(AsyncStackTraceExporter.class.getName());

private final ExecutorService executor = Executors.newSingleThreadScheduledExecutor();
Collaborator:

I think you meant to use newSingleThreadExecutor here.

tduncan (Contributor Author):

Thanks. Updated!

}
cpuEventExporter.flush();
} catch (Exception e) {
logger.log(Level.SEVERE, "an exception was thrown", e);
Collaborator:

Perhaps change to something like An exception was thrown while exporting profiling snapshots.

tduncan (Contributor Author):

Updated!

stackTrace.getStackFrames(),
stackTrace.getTimestamp(),
stackTrace.getTraceId(),
null);
Collaborator:

I'm not sure how the profiling UI will handle the missing span id here. Perhaps it would be better to use the local root span id here?

tduncan (Contributor Author):

The next PR adding span correlation will fill this detail in. To your concern, a different set of UI screens is being created for snapshot profiling data, and the two sources of call stacks will not overlap at any point in the UI.


try {
sampler.start(spanContext);
await().atMost(HALF_SECOND).until(() -> !staging.allStackTraces().isEmpty());
Collaborator:

It is best not to use short wait times like atMost(HALF_SECOND), as this can result in sporadic test failures.

tduncan (Contributor Author):

I removed all of the explicit timeouts from these tests so the 10 second default is used.

@laurit laurit merged commit 61d2d25 into signalfx:main Mar 25, 2025
26 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Mar 25, 2025
@tduncan tduncan deleted the port-profile-log-export branch April 8, 2025 16:05