Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Instrument dd-trace-api #5145

Open
wants to merge 12 commits into
base: master
Choose a base branch
from
Open

Instrument dd-trace-api #5145

wants to merge 12 commits into from

Conversation

bengl
Copy link
Collaborator

@bengl bengl commented Jan 22, 2025

This is the initial PR that adds support for the (soon to be released) dd-trace-api module.

The dd-trace-api module works by sending all API calls through diagnostics channels, including the name of the function being called, all the arguments, and the this object. appropriate functions are passed through for any return values that need to be proxied.

What's proxying? Here, it refers to having a facade object presented by dd-trace-api that corresponds to a "real" object created by dd-trace. The facade allows dd-trace-api to never expose an internal object to the caller.

Testing is done primarily by checking that diagnostics channels map correctly to the internal function calls. Additional testing will take place in dd-trace-api to ensure the whole system works end-to-end.

@bengl bengl requested review from a team as code owners January 22, 2025 20:30
Copy link

github-actions bot commented Jan 22, 2025

Overall package size

Self size: 8.56 MB
Deduped: 94.95 MB
No deduping: 95.47 MB

Dependency sizes | name | version | self size | total size | |------|---------|-----------|------------| | @datadog/libdatadog | 0.4.0 | 29.44 MB | 29.44 MB | | @datadog/native-appsec | 8.4.0 | 19.25 MB | 19.26 MB | | @datadog/native-iast-taint-tracking | 3.2.0 | 13.9 MB | 13.91 MB | | @datadog/pprof | 5.5.1 | 9.79 MB | 10.17 MB | | protobufjs | 7.2.5 | 2.77 MB | 5.16 MB | | @datadog/native-iast-rewriter | 2.6.1 | 2.59 MB | 2.73 MB | | @opentelemetry/core | 1.14.0 | 872.87 kB | 1.47 MB | | @datadog/native-metrics | 3.1.0 | 1.06 MB | 1.46 MB | | @opentelemetry/api | 1.8.0 | 1.21 MB | 1.21 MB | | import-in-the-middle | 1.11.2 | 112.74 kB | 826.22 kB | | source-map | 0.7.4 | 226 kB | 226 kB | | opentracing | 0.14.7 | 194.81 kB | 194.81 kB | | lru-cache | 7.18.3 | 133.92 kB | 133.92 kB | | pprof-format | 2.1.0 | 111.69 kB | 111.69 kB | | @datadog/sketches-js | 2.1.0 | 109.9 kB | 109.9 kB | | semver | 7.6.3 | 95.82 kB | 95.82 kB | | lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB | | ignore | 5.3.1 | 51.46 kB | 51.46 kB | | shell-quote | 1.8.1 | 44.96 kB | 44.96 kB | | istanbul-lib-coverage | 3.2.0 | 29.34 kB | 29.34 kB | | rfdc | 1.3.1 | 25.21 kB | 25.21 kB | | @isaacs/ttlcache | 1.4.1 | 25.2 kB | 25.2 kB | | tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB | | limiter | 1.1.5 | 23.17 kB | 23.17 kB | | dc-polyfill | 0.1.4 | 23.1 kB | 23.1 kB | | retry | 0.13.1 | 18.85 kB | 18.85 kB | | jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB | | crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB | | ttl-set | 1.0.0 | 4.61 kB | 9.69 kB | | path-to-regexp | 0.1.12 | 6.6 kB | 6.6 kB | | koalas | 1.0.2 | 6.47 kB | 6.47 kB | | module-details-from-path | 1.0.3 | 4.47 kB | 4.47 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe

@pr-commenter
Copy link

pr-commenter bot commented Jan 22, 2025

Benchmarks

Benchmark execution time: 2025-01-30 20:07:42

Comparing candidate commit 09d3fa1 in PR branch bengl/instrument-dd-trace-api with baseline commit 6f79a86 in branch master.

Found 1 performance improvements and 0 performance regressions! Performance is the same for 821 metrics, 26 unstable metrics.

scenario:plugin-graphql-with-depth-off-18

  • 🟩 max_rss_usage [-117.935MB; -69.017MB] or [-12.396%; -7.254%]

const counter = apiMetrics.count('dd_trace_api.called', [
`name:${name.replace(':', '.')}`,
'api_version:v1',
`injection_enabled:${process.env.DD_INJECTION_ENABLED ? 'yes' : 'no'}`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the process.env.DD_INJECTION_ENABLED check be hoisted to the outermost scope? I'm assuming it doesn't change throughout the lifetime of a process and it doesn't seem to be set during tests.

Also hopefully future config changes will make the config file easier to read in files like this one without needing to grab env vars everywhere.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This particular env var is typically only used when config hasn't been initialized yet, so it's never been part of our config system IIRC. I'll hoist it as requested.

Copy link

codecov bot commented Jan 28, 2025

Codecov Report

Attention: Patch coverage is 85.71429% with 11 lines in your changes missing coverage. Please review.

Project coverage is 81.09%. Comparing base (6f79a86) to head (09d3fa1).

Files with missing lines Patch % Lines
packages/datadog-plugin-dd-trace-api/src/index.js 86.66% 10 Missing ⚠️
packages/dd-trace/src/plugins/index.js 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5145      +/-   ##
==========================================
+ Coverage   80.94%   81.09%   +0.14%     
==========================================
  Files         477      480       +3     
  Lines       21183    21419     +236     
==========================================
+ Hits        17146    17369     +223     
- Misses       4037     4050      +13     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@bengl bengl force-pushed the bengl/instrument-dd-trace-api branch from a5ecfb7 to 09d3fa1 Compare January 30, 2025 19:57
@datadog-datadog-prod-us1
Copy link

Datadog Report

Branch report: bengl/instrument-dd-trace-api
Commit report: 4ba936f
Test service: dd-trace-js-integration-tests

✅ 0 Failed, 615 Passed, 0 Skipped, 14m 14.66s Total Time

Copy link
Collaborator

@sabrenner sabrenner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome stuff, almost all lgtm! i think having some context before reviewing helped (when we talked a few weeks ago), and looking along with the dd-trace-api-js helped a bunch. started a review because i thought i'd have more questions but really just the one 😅

Comment on lines +51 to +69
for (let i = 0; i < args.length; i++) {
if (objectMap.has(args[i])) {
args[i] = objectMap.get(args[i])
}
if (typeof args[i] === 'function') {
const orig = args[i]
args[i] = (...fnArgs) => {
for (let j = 0; j < fnArgs.length; j++) {
if (revProxy && revProxy[j]) {
const proxyVal = revProxy[j]()
objectMap.set(proxyVal, fnArgs[j])
fnArgs[j] = proxyVal
}
}
// TODO do we need to apply(this, ...) here?
return orig(...fnArgs)
}
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this block trying to do here?

i have been reviewing this in the context of the dd-trace-api-js implementation, and everything in these handlers around this block makes sense to me (in terms of setting and grabbing the proxy obj value from self, and calling the mapped function on the tracer), but I'm unsure what this block is doing.

in what case would we check the arguments to the function against the objectMap? what is revProxy (i see on the publisher side & tests it's just defaulted to an empty list, which doesn't help here 😅)? why do we need to wrap a function argument (and, maybe it's arguments)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's preserving the object map across all function calls that happen through our message bus (i.e. diagnostics channel). That means objects on the API side will never be real objects returned from real API calls.

We want to map these to "real" objects from the internal/dd-trace API so that a "real" value is never exposed to the user via dd-trace-api. I sometimes call this concept "the firewall". We need the firewall because we never want to expose an internal API to the user by accident, or else the whole premise of keeping the API separate just falls apart.

The object map has "dummy" objects as keys, and "real" objects as values. To maintain the firewall, we map them later on (see what we do later on in code with ret.value). When an API call is made the proxy property contains a function that returns a "dummy" object to map to. This is how we populate the object map.

Now, suppose one of these dummy objects, having already been mapped, is passed in to one of our APIs (take tracer.inject(span, carrier) as an example, where span was returned from one of our APIs so it's a mapped object). In order to make sure that the underlying "real API" only receives "real" objects, we need to map all of these through the object map. This would give us some nice code that looks something like this:

for (let i = 0; i < args.length; i++) {
  if (objectMap.has(args[i])) {
    args[i] = objectMap.get(args[i])
  }
}

That would preserve our firewall with respect to arguments passed to these API calls, and indeed, constitutes the first few lines you reference here.

Some of our API calls can take in callbacks. In this case, we need to do everything in reverse; instead of mapping return values, we need to map function arguments. A simple way to think of why this is necessary is to consider that callback arguments are effectively return values (google "continuation passing style"). An easy example API here is tracer.trace('name', options, () => { /* do stuff to trace */ }).

So, to deal with this in our loop, we first check if any given argument is a function. That means it's a callback so we have to jump into this reverse-mapping procedure. We replace the function with a wrapper function that passes any relevant args through relevant revProxy array functions, which need to give us dummy objects to pass through, much like proxy does for regular return values. We then map then in the object map, again, just like we do for return values. Then we're good to call the original/wrapped function.

In summary (and as a tl;dr, if you want), we have to map return values, and we have to unmap them when they're passed in as arguments. For callbacks, we have to do the reverse of those two operations. And that's about it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually makes a lot of sense!! I think the overarching idea i wasn't connecting here was the whole "firewall" approach, which i get now in obfuscating the actual return values of things like span, scope, etc. from the tracer core api. i actually find that tracer.inject example a simple and understandable one for this, WRT passing these objects back to other tracer APIs!

it further helped seeing how it's actually implemented using defaultFun to signal this as the proxy value, and then through stuff like getSpan returns a dummy span with only the methods/properties we want to expose, and so on in dd-trace-api-js.

i think maybe a small comment here could help future readers like myself - maybe just linking to the explanation above?

Copy link
Collaborator

@sabrenner sabrenner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the approach and philosophy behind what's going on here makes sense to me!


const handleEvent = (name) => {
const counter = apiMetrics.count('dd_trace_api.called', [
`name:${name.replace(':', '.')}`,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this intentionally not .replaceAll()?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants