[crashtracking] Allow runtimes to register runtime stack collection callbacks #1252
✅ Tests: 🎉 All green! ❄️ No new flaky tests detected. 🔗 Commit SHA: 2e9dcc8
Benchmarks
Comparison
Benchmark execution time: 2025-11-03 15:31:04
Comparing candidate commit 2e9dcc8 in PR branch.
Found 12 performance improvements and 4 performance regressions! Performance is the same for 39 metrics, 2 unstable metrics.
scenario:benching serializing traces from their internal representation to msgpack
scenario:credit_card/is_card_number/37828224631000521389798
scenario:credit_card/is_card_number/x371413321323331
scenario:credit_card/is_card_number_no_luhn/ 378282246310005
scenario:credit_card/is_card_number_no_luhn/378282246310005
scenario:credit_card/is_card_number_no_luhn/37828224631000521389798
scenario:credit_card/is_card_number_no_luhn/x371413321323331
scenario:normalization/normalize_service/normalize_service/[empty string]
scenario:redis/obfuscate_redis_string
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #1252 +/- ##
==========================================
- Coverage 72.11% 71.70% -0.42%
==========================================
Files 368 373 +5
Lines 58111 58987 +876
==========================================
+ Hits 41907 42295 +388
- Misses 16204 16692 +488
Artifact Size Benchmark Report
aarch64-alpine-linux-musl
aarch64-unknown-linux-gnu
libdatadog-x64-windows
libdatadog-x86-windows
x86_64-alpine-linux-musl
x86_64-unknown-linux-gnu
TODO: Fix Frame emitting allocation issue
TODO: add a
backtrace = "=0.3.74"
chrono = {version = "0.4", default-features = false, features = ["std", "clock", "serde"]}
ddcommon = {path = "../ddcommon" }
ddcommon-ffi = {path = "../ddcommon-ffi" }
Non-ffi crates should not depend on ffi crates. Actually, I don't see that you're even using that dep in this crate? Leftover maybe?
Ah I am. I was defining the runtime stack frame struct in the non ffi crate and exporting it in the ffi crate. I should probably have an FFI specific version?
Yes.
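One way to read the suggestion in this thread, as a minimal sketch rather than the PR's actual code: the non-ffi crate owns a plain Rust frame type, and the ffi crate defines a separate C-compatible mirror plus a conversion, so the dependency only points from ffi to non-ffi. The type names and fields below (`RuntimeStackFrame`, `FfiRuntimeStackFrame`, `function`, `file`, `line`) are assumptions for illustration; the real layout of `ddog_RuntimeStackFrame` is not shown on this page.

```rust
use std::ffi::{c_char, CStr};

// --- Would live in the non-ffi crate (hypothetical fields) ---
/// Plain Rust frame type with no FFI concerns.
#[derive(Debug, PartialEq, Eq)]
pub struct RuntimeStackFrame<'a> {
    pub function: &'a str,
    pub file: &'a str,
    pub line: u32,
}

// --- Would live in the ffi crate (hypothetical mirror type) ---
/// C-compatible mirror that foreign runtimes fill in.
#[repr(C)]
pub struct FfiRuntimeStackFrame {
    pub function: *const c_char,
    pub file: *const c_char,
    pub line: u32,
}

impl FfiRuntimeStackFrame {
    /// Borrows the C strings as `&str` without allocating.
    /// Returns `None` for null pointers or non-UTF-8 input.
    ///
    /// # Safety
    /// Non-null pointers must reference NUL-terminated strings that stay
    /// valid for as long as the returned frame is used.
    pub unsafe fn to_core(&self) -> Option<RuntimeStackFrame<'_>> {
        if self.function.is_null() || self.file.is_null() {
            return None;
        }
        let function = unsafe { CStr::from_ptr(self.function) }.to_str().ok()?;
        let file = unsafe { CStr::from_ptr(self.file) }.to_str().ok()?;
        Some(RuntimeStackFrame { function, file, line: self.line })
    }
}
```

With this split, the `ddcommon-ffi` dependency could be dropped from the non-ffi crate's manifest, since only the ffi crate needs to know about the C layout.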

What does this PR do?
This PR allows runtimes to register a callback that extracts the runtime stack. They can choose either to emit frames one at a time or to dump a whole stack trace string. The important contract is that the logic retrieving the runtime stack runs within a fork of the crashing process, from a signal handler, so it must be async-signal-safe.
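The frame-by-frame mode described above could look roughly like the following sketch. It is not the PR's API: `Frame`, `RuntimeStackCallback`, `register_runtime_stack_callback`, `FixedBuf`, and `collect_runtime_stack` are hypothetical names, and the output format is made up. The point it illustrates is the contract: the callback is registered ahead of time, and the collection path writes into a fixed-size buffer and formats integers by hand, so it takes no locks and performs no heap allocation.

```rust
use std::sync::OnceLock;

/// Hypothetical frame type; the PR's FFI type is `ddog_RuntimeStackFrame`,
/// whose real fields are not reproduced here.
#[derive(Clone, Copy, Debug)]
pub struct Frame<'a> {
    pub function: &'a str,
    pub file: &'a str,
    pub line: u32,
}

/// Hypothetical callback shape: the runtime walks its own stack and calls
/// `emit` once per frame.
pub type RuntimeStackCallback = fn(emit: &mut dyn FnMut(Frame<'_>));

static CALLBACK: OnceLock<RuntimeStackCallback> = OnceLock::new();

/// Registers the callback at startup; returns false if one is already set.
pub fn register_runtime_stack_callback(cb: RuntimeStackCallback) -> bool {
    CALLBACK.set(cb).is_ok()
}

/// Fixed-capacity byte buffer so that collection never touches the heap.
pub struct FixedBuf {
    bytes: [u8; 256],
    len: usize,
}

impl FixedBuf {
    pub const fn new() -> Self {
        FixedBuf { bytes: [0; 256], len: 0 }
    }

    fn push(&mut self, data: &[u8]) -> bool {
        let end = self.len + data.len();
        if end > self.bytes.len() {
            return false;
        }
        self.bytes[self.len..end].copy_from_slice(data);
        self.len = end;
        true
    }

    /// Writes `n` in decimal without using `format!` (which would allocate).
    fn push_u32(&mut self, mut n: u32) -> bool {
        let mut digits = [0u8; 10]; // u32::MAX has 10 decimal digits
        let mut i = digits.len();
        loop {
            i -= 1;
            digits[i] = b'0' + (n % 10) as u8;
            n /= 10;
            if n == 0 {
                break;
            }
        }
        self.push(&digits[i..])
    }

    /// Appends "function (file:line)\n"; rolls back if the frame does not fit.
    pub fn push_frame(&mut self, f: &Frame<'_>) -> bool {
        let start = self.len;
        let ok = self.push(f.function.as_bytes())
            && self.push(b" (")
            && self.push(f.file.as_bytes())
            && self.push(b":")
            && self.push_u32(f.line)
            && self.push(b")\n");
        if !ok {
            self.len = start;
        }
        ok
    }

    pub fn as_str(&self) -> &str {
        std::str::from_utf8(&self.bytes[..self.len]).unwrap_or("")
    }
}

/// What the crash path would do: invoke the registered callback and record
/// each emitted frame. Returns the number of frames stored.
pub fn collect_runtime_stack(out: &mut FixedBuf) -> usize {
    let Some(&cb) = CALLBACK.get() else { return 0 };
    let mut count = 0;
    cb(&mut |frame: Frame<'_>| {
        if out.push_frame(&frame) {
            count += 1;
        }
    });
    count
}

/// Stand-in for a runtime's stack walker (hypothetical data).
pub fn fake_python_stack(emit: &mut dyn FnMut(Frame<'_>)) {
    emit(Frame { function: "handler", file: "app.py", line: 42 });
    emit(Frame { function: "main", file: "app.py", line: 7 });
}
```

A runtime would call `register_runtime_stack_callback(fake_python_stack)` once during initialization; the crash path then calls `collect_runtime_stack` with a preallocated `FixedBuf`. A real signal-handler path would also have to ensure that the runtime's own stack walker is async-signal-safe, which this sketch cannot guarantee on its behalf.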
Currently, we add runtime stacks as a new runtime_stacks field in the Experimental field. If runtimes choose to emit frames one by one using ddog_RuntimeStackFrame, the runtime_stacks field will be nicely propagated. If they choose to dump the whole traceback string, additional parsing will have to be implemented on the Receiver side, appropriate for each runtime's traceback style and syntax.
Motivation
Current crash tracking captures only native stack traces, which are insufficient for applications using interpreted languages. When a Python/Ruby/PHP application crashes, developers need visibility into both the native stack and the runtime stack.
Without runtime stack traces, debugging crashes in interpreted languages is significantly hampered as the native stack only shows interpreter internals and native extension modules, not the actual application code execution path.
Additional Notes
Anything else we should know when reviewing?
How to test the change?
Unit tests.
There is a dummy implementation of dd-trace-py consuming this API in this experimental PR: DataDog/dd-trace-py#14765
By triggering a crash with the tracer and agent attached, we can see the emitted Experimental fields: