[DI] Improve path matching algorithm for probe file paths #5166

Merged
watson merged 4 commits into master from watson/DEBUG-3408/unknown-path-prefix
Jan 31, 2025

Conversation

@watson watson commented Jan 29, 2025

What does this PR do?

Change the path matching algorithm used to match a probe file path against the list of loaded module paths. The new algorithm supports Windows paths and path prefixes and has fewer false positives.
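
A minimal sketch of what such matching could look like is shown below. It is an illustration under stated assumptions, not the code from this PR: `normalizePath` and `matchesLoadedPath` are hypothetical names, and loaded module paths are treated as plain file paths. The idea is to normalize Windows separators first and then accept a match only when the probe path lines up with the loaded path at a directory boundary, which avoids false positives such as `myserver/index.js` matching `server/index.js`.

```javascript
'use strict'

// Hypothetical helper: convert Windows separators to forward slashes and
// drop a drive letter such as "C:" so both path styles compare equally.
function normalizePath (path) {
  return path.replace(/\\/g, '/').replace(/^[a-zA-Z]:/, '')
}

// Hypothetical helper: true if `probePath` equals `loadedPath`, or is a
// suffix of it that starts at a directory boundary.
function matchesLoadedPath (loadedPath, probePath) {
  const loaded = normalizePath(loadedPath)
  const probe = normalizePath(probePath)
  if (probe === '') return false
  if (loaded === probe) return true
  const suffix = probe.startsWith('/') ? probe : '/' + probe
  return loaded.endsWith(suffix)
}

console.log(matchesLoadedPath('/app/src/server/index.js', 'server/index.js')) // true
console.log(matchesLoadedPath('/app/src/myserver/index.js', 'server/index.js')) // false
console.log(matchesLoadedPath('C:\\app\\src\\index.js', 'src/index.js')) // true
```

Requiring the boundary check, rather than a bare `endsWith` on the raw probe path, is what rules out the second case.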

Motivation

Increase the chance of finding a match for the file path provided via RC.

The problem with path prefixes can occur if the user-provided file path contains a base directory that isn't part of the deployed code base. This would, for example, be the case if the files are stored in a sub-directory in the git repo (which is common for mono-repos) but deployed into a different directory in production. It can also occur if the user copy-pastes the full local path from their development machine.

Plugin Checklist


watson commented Jan 29, 2025

This stack of pull requests is managed by Graphite. Learn more about stacking.


github-actions bot commented Jan 29, 2025

Overall package size

Self size: 8.55 MB
Deduped: 94.95 MB
No deduping: 95.46 MB

Dependency sizes

| name | version | self size | total size |
|------|---------|-----------|------------|
| @datadog/libdatadog | 0.4.0 | 29.44 MB | 29.44 MB |
| @datadog/native-appsec | 8.4.0 | 19.25 MB | 19.26 MB |
| @datadog/native-iast-taint-tracking | 3.2.0 | 13.9 MB | 13.91 MB |
| @datadog/pprof | 5.5.1 | 9.79 MB | 10.17 MB |
| protobufjs | 7.2.5 | 2.77 MB | 5.16 MB |
| @datadog/native-iast-rewriter | 2.6.1 | 2.59 MB | 2.73 MB |
| @opentelemetry/core | 1.14.0 | 872.87 kB | 1.47 MB |
| @datadog/native-metrics | 3.1.0 | 1.06 MB | 1.46 MB |
| @opentelemetry/api | 1.8.0 | 1.21 MB | 1.21 MB |
| import-in-the-middle | 1.11.2 | 112.74 kB | 826.22 kB |
| source-map | 0.7.4 | 226 kB | 226 kB |
| opentracing | 0.14.7 | 194.81 kB | 194.81 kB |
| lru-cache | 7.18.3 | 133.92 kB | 133.92 kB |
| pprof-format | 2.1.0 | 111.69 kB | 111.69 kB |
| @datadog/sketches-js | 2.1.0 | 109.9 kB | 109.9 kB |
| semver | 7.6.3 | 95.82 kB | 95.82 kB |
| lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB |
| ignore | 5.3.1 | 51.46 kB | 51.46 kB |
| shell-quote | 1.8.1 | 44.96 kB | 44.96 kB |
| istanbul-lib-coverage | 3.2.0 | 29.34 kB | 29.34 kB |
| rfdc | 1.3.1 | 25.21 kB | 25.21 kB |
| @isaacs/ttlcache | 1.4.1 | 25.2 kB | 25.2 kB |
| tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB |
| limiter | 1.1.5 | 23.17 kB | 23.17 kB |
| dc-polyfill | 0.1.4 | 23.1 kB | 23.1 kB |
| retry | 0.13.1 | 18.85 kB | 18.85 kB |
| jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB |
| crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB |
| ttl-set | 1.0.0 | 4.61 kB | 9.69 kB |
| path-to-regexp | 0.1.12 | 6.6 kB | 6.6 kB |
| koalas | 1.0.2 | 6.47 kB | 6.47 kB |
| module-details-from-path | 1.0.3 | 4.47 kB | 4.47 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe


codecov bot commented Jan 29, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.07%. Comparing base (6f79a86) to head (1274ab1).
Report is 2 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #5166   +/-   ##
=======================================
  Coverage   81.07%   81.07%           
=======================================
  Files         479      479           
  Lines       21342    21342           
=======================================
  Hits        17303    17303           
  Misses       4039     4039           


@watson watson force-pushed the watson/DEBUG-3408/unknown-path-prefix branch from ea0dcd1 to c5883ec Compare January 29, 2025 13:21

pr-commenter bot commented Jan 29, 2025

Benchmarks

Benchmark execution time: 2025-01-31 11:14:09

Comparing candidate commit 1274ab1 in PR branch watson/DEBUG-3408/unknown-path-prefix with baseline commit 6f79a86 in branch master.

Found 0 performance improvements and 1 performance regressions! Performance is the same for 909 metrics, 23 unstable metrics.

scenario:plugin-graphql-with-async-hooks-22

  • 🟥 max_rss_usage [+109.628MB; +121.180MB] or [+20.555%; +22.721%]

If no known script matches the probe file path provided via RC, remove
the base directory and try to match again. Repeat this until there are
no more directories in the path.

This can happen if the user-provided file path contains a base
directory that isn't part of the deployed code base. This would, for
example, be the case if the files are stored in a sub-directory in the
git repo but deployed into a different directory in production.
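
The retry approach described in this commit message could be sketched as follows. This is an assumption-laden illustration rather than the merged implementation: `endsWithPath` and `findLoadedPath` are hypothetical names, paths are assumed to use forward slashes already, and the first loaded path that matches is returned.

```javascript
'use strict'

// Hypothetical helper: does `loadedPath` end with `candidate` at a
// directory boundary (or equal it exactly)?
function endsWithPath (loadedPath, candidate) {
  return loadedPath === candidate || loadedPath.endsWith('/' + candidate)
}

// Try the full probe path first; if nothing matches, drop the leading
// directory and retry until no segments are left.
function findLoadedPath (loadedPaths, probePath) {
  const segments = probePath.split('/').filter(Boolean)
  while (segments.length > 0) {
    const candidate = segments.join('/')
    const match = loadedPaths.find(loaded => endsWithPath(loaded, candidate))
    if (match !== undefined) return match
    segments.shift() // remove the base directory and try again
  }
  return null
}

const loaded = ['/opt/app/lib/handler.js', '/opt/app/lib/util.js']
console.log(findLoadedPath(loaded, 'services/api/lib/handler.js')) // /opt/app/lib/handler.js
console.log(findLoadedPath(loaded, '/Users/me/repo/lib/util.js')) // /opt/app/lib/util.js
console.log(findLoadedPath(loaded, 'lib/missing.js')) // null
```

Because the longest candidate is tried first, the most specific match wins. Note that this sketch keeps stripping down to the bare file name, which a stricter implementation may want to treat more carefully when several loaded files share that name.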
@watson watson force-pushed the watson/DEBUG-3408/unknown-path-prefix branch 2 times, most recently from 8f2feb8 to 60f46f9 Compare January 31, 2025 09:56

datadog-datadog-prod-us1 bot commented Jan 31, 2025

Datadog Report

Branch report: watson/DEBUG-3408/unknown-path-prefix
Commit report: f29a384
Test service: dd-trace-js-integration-tests

✅ 0 Failed, 616 Passed, 0 Skipped, 14m 48.67s Total Time

@watson watson force-pushed the watson/DEBUG-3408/unknown-path-prefix branch from 60f46f9 to b3ea372 Compare January 31, 2025 10:01
@watson watson changed the title from "[DI] Handle unknown path prefixes in probe file paths" to "[DI] Improve path matching algorithm for probe file paths" Jan 31, 2025
@watson watson requested a review from juan-fernandez January 31, 2025 11:35
- * @param {string} path
- * @returns {[string, string] | undefined}
+ * @param {string} path - Partial or absolute path to match against loaded scripts
+ * @returns {[string, string, string | undefined] | null} - Array containing [url, scriptId, sourceMapURL]

The change to the return value is partly unrelated to this PR, but the JSDoc return value was out of date, so I fixed it while I was here.

@watson watson merged commit ccf1292 into master Jan 31, 2025
299 of 301 checks passed
@watson watson deleted the watson/DEBUG-3408/unknown-path-prefix branch January 31, 2025 12:36
watson added a commit that referenced this pull request Jan 31, 2025
Change the path matching algorithm used to match a probe file path against the
list of loaded module paths. The new algorithm supports Windows paths and path
prefixes and has fewer false positives.
@watson watson mentioned this pull request Jan 31, 2025

@shatzi shatzi left a comment


looks good.


it('should abort if the path is undefined', testPathNoMatch(undefined))

it('should abort if the path is an empty string', testPathNoMatch(''))

Maybe also check '/', '\', and 'c:' to ensure there are no errors.
