@Yun-Kim Yun-Kim commented Oct 16, 2025

Description

(public change) Adds reasoning as an argument to submit_evaluation_for() and submit_evaluation(). This argument carries an explanation of the evaluation result (e.g. why was the span marked as toxic?).
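For illustration only, here is a minimal sketch of how an optional reasoning string could travel alongside an evaluation result. It is not the ddtrace implementation: build_evaluation is a hypothetical stand-in, and every field name other than reasoning (label, metric_type, the value key) is an assumption made for the example.

```python
from typing import Any, Dict, Optional


def build_evaluation(
    label: str,
    metric_type: str,
    value: Any,
    reasoning: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical stand-in for an evaluation payload builder (not ddtrace code)."""
    evaluation: Dict[str, Any] = {
        "label": label,
        "metric_type": metric_type,
        # Assumed naming scheme for the value key; illustrative only.
        f"{metric_type}_value": value,
    }
    # The explanation is optional, so it is attached only when provided.
    if reasoning is not None:
        evaluation["reasoning"] = reasoning
    return evaluation


# With an explanation of why the span was marked as toxic.
toxic_eval = build_evaluation(
    label="toxicity",
    metric_type="categorical",
    value="toxic",
    reasoning="The response contains a direct insult aimed at the user.",
)

# Without an explanation, the payload simply omits the field.
plain_eval = build_evaluation("toxicity", "categorical", "not_toxic")
```

Keeping the argument optional means existing callers that never pass reasoning produce the same payload as before.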

(internal change, not user-facing) Also changes how the assessment field is stored on the evaluation object: #14792 added it inside a nested success_criteria object, and this PR moves it to a top-level field on the evaluation object. This is neither a breaking change (it has not been officially released on our product backend) nor a user-facing one.
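As a rough sketch of the shape change described above, assuming the evaluation object is a plain dict: assessment moves from inside success_criteria to the top level. The helper lift_assessment, the label key, and the "fail" value are hypothetical and exist only to show the before and after; the real writer code may differ.

```python
from typing import Any, Dict


def lift_assessment(evaluation: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy with a nested assessment moved to the top level (illustrative)."""
    # Copy everything except the nested container; the input is left unchanged.
    result = {k: v for k, v in evaluation.items() if k != "success_criteria"}
    nested = evaluation.get("success_criteria") or {}
    if "assessment" in nested:
        result["assessment"] = nested["assessment"]
    return result


# Shape attributed to #14792: assessment nested under success_criteria.
nested_form = {"label": "toxicity", "success_criteria": {"assessment": "fail"}}

# Shape described in this PR: assessment as a top-level field.
flat_form = lift_assessment(nested_form)
```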

Testing

Risks

Additional Notes

@Yun-Kim Yun-Kim requested review from a team as code owners October 16, 2025 18:26
CODEOWNERS have been resolved as:

releasenotes/notes/feat-llmobs-submit-eval-reasoning-3f825d5dd257d4f8.yaml  @DataDog/apm-python
ddtrace/llmobs/_llmobs.py                                               @DataDog/ml-observability
ddtrace/llmobs/_writer.py                                               @DataDog/ml-observability
tests/llmobs/_utils.py                                                  @DataDog/ml-observability
tests/llmobs/test_llmobs_service.py                                     @DataDog/ml-observability

github-actions bot commented Oct 16, 2025

Bootstrap import analysis

Comparison of import times between this PR and base.

Summary

The average import time from this PR is: 243 ± 5 ms.

The average import time from base is: 247 ± 3 ms.

The import time difference between this PR and base is: -3.3 ± 0.2 ms.

Import time breakdown

The following import paths have shrunk:

ddtrace.auto 2.185 ms (0.90%)
ddtrace.bootstrap.sitecustomize 1.513 ms (0.62%)
ddtrace.bootstrap.preload 1.513 ms (0.62%)
ddtrace.internal.remoteconfig.client 0.715 ms (0.29%)
ddtrace 0.672 ms (0.28%)
ddtrace.internal._unpatched 0.026 ms (0.01%)
json 0.026 ms (0.01%)
json.decoder 0.026 ms (0.01%)
re 0.026 ms (0.01%)
enum 0.026 ms (0.01%)
types 0.026 ms (0.01%)

pr-commenter bot commented Oct 16, 2025

Performance SLOs

Comparing candidate yunkim/llmobs-evals-assessment (dcaec5a) with baseline main (23fe9e1)
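A note on reading the figures in this report: the percentage printed next to each SLO appears to be the measured value's relative distance from the SLO limit, so a negative number means the measurement is under the limit. This is inferred from the numbers themselves, not from the bot's source; slo_margin_percent is a hypothetical helper written for this check.

```python
def slo_margin_percent(measured: float, slo_limit: float) -> float:
    """Relative distance of a measurement from its SLO limit, in percent.

    Assumed formula (inferred from the report, not taken from the bot):
    (measured - limit) / limit * 100, rounded to one decimal place.
    """
    return round((measured - slo_limit) / slo_limit * 100, 1)


# add_aspect from the iastaspects suite: 0.403µs against an SLO of <10.000µs.
add_aspect_time = slo_margin_percent(0.403, 10.000)

# Same benchmark, memory: 37.631MB against an SLO of <39.000MB.
add_aspect_memory = slo_margin_percent(37.631, 39.000)
```

Both results match the percentages the report prints for add_aspect (-96.0% for time and -3.5% for memory).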

📈 Performance Regressions (1 suite)
📈 iastaspects - 118/118

✅ add_aspect

Time: ✅ 0.403µs (SLO: <10.000µs 📉 -96.0%) vs baseline: -1.2%

Memory: ✅ 37.631MB (SLO: <39.000MB -3.5%) vs baseline: +4.7%


✅ add_inplace_aspect

Time: ✅ 0.405µs (SLO: <10.000µs 📉 -95.9%) vs baseline: -1.1%

Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +5.0%


✅ add_inplace_noaspect

Time: ✅ 0.314µs (SLO: <10.000µs 📉 -96.9%) vs baseline: ~same

Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +5.0%


✅ add_noaspect

Time: ✅ 0.278µs (SLO: <10.000µs 📉 -97.2%) vs baseline: +0.1%

Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +5.0%


✅ bytearray_aspect

Time: ✅ 1.325µs (SLO: <10.000µs 📉 -86.7%) vs baseline: -1.1%

Memory: ✅ 37.631MB (SLO: <39.000MB -3.5%) vs baseline: +4.7%


✅ bytearray_extend_aspect

Time: ✅ 1.455µs (SLO: <10.000µs 📉 -85.4%) vs baseline: +0.9%

Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +4.7%


✅ bytearray_extend_noaspect

Time: ✅ 0.620µs (SLO: <10.000µs 📉 -93.8%) vs baseline: +0.9%

Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +4.9%


✅ bytearray_noaspect

Time: ✅ 0.485µs (SLO: <10.000µs 📉 -95.1%) vs baseline: +0.4%

Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +5.0%


✅ bytes_aspect

Time: ✅ 1.303µs (SLO: <10.000µs 📉 -87.0%) vs baseline: +1.6%

Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.7%


✅ bytes_noaspect

Time: ✅ 0.494µs (SLO: <10.000µs 📉 -95.1%) vs baseline: -0.4%

Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.9%


✅ bytesio_aspect

Time: ✅ 1.362µs (SLO: <10.000µs 📉 -86.4%) vs baseline: -0.2%

Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.7%


✅ bytesio_noaspect

Time: ✅ 0.497µs (SLO: <10.000µs 📉 -95.0%) vs baseline: +0.5%

Memory: ✅ 37.729MB (SLO: <39.000MB -3.3%) vs baseline: +5.1%


✅ capitalize_aspect

Time: ✅ 0.729µs (SLO: <10.000µs 📉 -92.7%) vs baseline: -1.2%

Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +5.0%


✅ capitalize_noaspect

Time: ✅ 0.434µs (SLO: <10.000µs 📉 -95.7%) vs baseline: -0.9%

Memory: ✅ 37.611MB (SLO: <39.000MB -3.6%) vs baseline: +4.8%


✅ casefold_aspect

Time: ✅ 0.735µs (SLO: <10.000µs 📉 -92.6%) vs baseline: ~same

Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.9%


✅ casefold_noaspect

Time: ✅ 0.369µs (SLO: <10.000µs 📉 -96.3%) vs baseline: +0.2%

Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +5.0%


✅ decode_aspect

Time: ✅ 0.723µs (SLO: <10.000µs 📉 -92.8%) vs baseline: -1.3%

Memory: ✅ 37.631MB (SLO: <39.000MB -3.5%) vs baseline: +4.7%


✅ decode_noaspect

Time: ✅ 0.421µs (SLO: <10.000µs 📉 -95.8%) vs baseline: -1.2%

Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.9%


✅ encode_aspect

Time: ✅ 0.713µs (SLO: <10.000µs 📉 -92.9%) vs baseline: +0.9%

Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.9%


✅ encode_noaspect

Time: ✅ 0.405µs (SLO: <10.000µs 📉 -95.9%) vs baseline: -0.2%

Memory: ✅ 37.749MB (SLO: <39.000MB -3.2%) vs baseline: +5.1%


✅ format_aspect

Time: ✅ 3.459µs (SLO: <10.000µs 📉 -65.4%) vs baseline: +2.1%

Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.7%


✅ format_map_aspect

Time: ✅ 4.146µs (SLO: <10.000µs 📉 -58.5%) vs baseline: 📈 +13.9%

Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.9%


✅ format_map_noaspect

Time: ✅ 0.776µs (SLO: <10.000µs 📉 -92.2%) vs baseline: -0.2%

Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +5.0%


✅ format_noaspect

Time: ✅ 0.595µs (SLO: <10.000µs 📉 -94.1%) vs baseline: -0.4%

Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.7%


✅ index_aspect

Time: ✅ 0.353µs (SLO: <10.000µs 📉 -96.5%) vs baseline: -1.2%

Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.9%


✅ index_noaspect

Time: ✅ 0.278µs (SLO: <10.000µs 📉 -97.2%) vs baseline: -0.3%

Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.9%


✅ join_aspect

Time: ✅ 1.387µs (SLO: <10.000µs 📉 -86.1%) vs baseline: -1.1%

Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +5.0%


✅ join_noaspect

Time: ✅ 0.496µs (SLO: <10.000µs 📉 -95.0%) vs baseline: +1.2%

Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.7%


✅ ljust_aspect

Time: ✅ 2.486µs (SLO: <20.000µs 📉 -87.6%) vs baseline: -0.7%

Memory: ✅ 37.611MB (SLO: <39.000MB -3.6%) vs baseline: +4.6%


✅ ljust_noaspect

Time: ✅ 0.405µs (SLO: <10.000µs 📉 -96.0%) vs baseline: ~same

Memory: ✅ 37.729MB (SLO: <39.000MB -3.3%) vs baseline: +5.0%


✅ lower_aspect

Time: ✅ 2.234µs (SLO: <10.000µs 📉 -77.7%) vs baseline: +0.5%

Memory: ✅ 37.611MB (SLO: <39.000MB -3.6%) vs baseline: +4.7%


✅ lower_noaspect

Time: ✅ 0.368µs (SLO: <10.000µs 📉 -96.3%) vs baseline: -1.2%

Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.7%


✅ lstrip_aspect

Time: ✅ 2.242µs (SLO: <20.000µs 📉 -88.8%) vs baseline: +0.2%

Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.9%


✅ lstrip_noaspect

Time: ✅ 0.378µs (SLO: <10.000µs 📉 -96.2%) vs baseline: -1.1%

Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.9%


✅ modulo_aspect

Time: ✅ 0.995µs (SLO: <10.000µs 📉 -90.0%) vs baseline: -1.3%

Memory: ✅ 37.729MB (SLO: <39.000MB -3.3%) vs baseline: +4.9%


✅ modulo_aspect_for_bytearray_bytearray

Time: ✅ 1.546µs (SLO: <10.000µs 📉 -84.5%) vs baseline: +0.2%

Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.9%


✅ modulo_aspect_for_bytes

Time: ✅ 0.988µs (SLO: <10.000µs 📉 -90.1%) vs baseline: +0.8%

Memory: ✅ 37.591MB (SLO: <39.000MB -3.6%) vs baseline: +4.6%


✅ modulo_aspect_for_bytes_bytearray

Time: ✅ 1.204µs (SLO: <10.000µs 📉 -88.0%) vs baseline: +0.1%

Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +5.0%


✅ modulo_noaspect

Time: ✅ 0.629µs (SLO: <10.000µs 📉 -93.7%) vs baseline: -0.1%

Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.9%


✅ replace_aspect

Time: ✅ 4.793µs (SLO: <10.000µs 📉 -52.1%) vs baseline: -0.1%

Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.7%


✅ replace_noaspect

Time: ✅ 0.462µs (SLO: <10.000µs 📉 -95.4%) vs baseline: -0.1%

Memory: ✅ 37.572MB (SLO: <39.000MB -3.7%) vs baseline: +4.6%


✅ repr_aspect

Time: ✅ 0.903µs (SLO: <10.000µs 📉 -91.0%) vs baseline: +0.1%

Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.9%


✅ repr_noaspect

Time: ✅ 0.416µs (SLO: <10.000µs 📉 -95.8%) vs baseline: +0.8%

Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +5.0%


✅ rstrip_aspect

Time: ✅ 1.907µs (SLO: <20.000µs 📉 -90.5%) vs baseline: -0.8%

Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +5.0%


✅ rstrip_noaspect

Time: ✅ 0.380µs (SLO: <10.000µs 📉 -96.2%) vs baseline: +0.3%

Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.8%


✅ slice_aspect

Time: ✅ 0.495µs (SLO: <10.000µs 📉 -95.1%) vs baseline: -0.2%

Memory: ✅ 37.611MB (SLO: <39.000MB -3.6%) vs baseline: +4.7%


✅ slice_noaspect

Time: ✅ 0.447µs (SLO: <10.000µs 📉 -95.5%) vs baseline: -0.6%

Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.9%


✅ stringio_aspect

Time: ✅ 1.529µs (SLO: <10.000µs 📉 -84.7%) vs baseline: ~same

Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.9%


✅ stringio_noaspect

Time: ✅ 0.718µs (SLO: <10.000µs 📉 -92.8%) vs baseline: -0.7%

Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.9%


✅ strip_aspect

Time: ✅ 2.205µs (SLO: <20.000µs 📉 -89.0%) vs baseline: -0.5%

Memory: ✅ 37.631MB (SLO: <39.000MB -3.5%) vs baseline: +4.9%


✅ strip_noaspect

Time: ✅ 0.387µs (SLO: <10.000µs 📉 -96.1%) vs baseline: +1.0%

Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +4.8%


✅ swapcase_aspect

Time: ✅ 2.526µs (SLO: <10.000µs 📉 -74.7%) vs baseline: +4.7%

Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +4.7%


✅ swapcase_noaspect

Time: ✅ 0.538µs (SLO: <10.000µs 📉 -94.6%) vs baseline: +0.8%

Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.8%


✅ title_aspect

Time: ✅ 2.441µs (SLO: <10.000µs 📉 -75.6%) vs baseline: +3.8%

Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.7%


✅ title_noaspect

Time: ✅ 0.505µs (SLO: <10.000µs 📉 -95.0%) vs baseline: +0.7%

Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.8%


✅ translate_aspect

Time: ✅ 3.321µs (SLO: <10.000µs 📉 -66.8%) vs baseline: +2.8%

Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.6%


✅ translate_noaspect

Time: ✅ 1.041µs (SLO: <10.000µs 📉 -89.6%) vs baseline: -0.3%

Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +5.0%


✅ upper_aspect

Time: ✅ 2.236µs (SLO: <10.000µs 📉 -77.6%) vs baseline: -1.0%

Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.9%


✅ upper_noaspect

Time: ✅ 0.369µs (SLO: <10.000µs 📉 -96.3%) vs baseline: -0.7%

Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.9%

🟡 Near SLO Breach (5 suites)
🟡 djangosimple - 30/30

✅ appsec

Time: ✅ 20.475ms (SLO: <22.300ms -8.2%) vs baseline: ~same

Memory: ✅ 65.447MB (SLO: <67.000MB -2.3%) vs baseline: +4.8%


✅ exception-replay-enabled

Time: ✅ 1.348ms (SLO: <1.450ms -7.0%) vs baseline: -0.1%

Memory: ✅ 64.640MB (SLO: <67.000MB -3.5%) vs baseline: +4.9%


✅ iast

Time: ✅ 20.428ms (SLO: <22.250ms -8.2%) vs baseline: -0.3%

Memory: ✅ 65.476MB (SLO: <67.000MB -2.3%) vs baseline: +5.0%


✅ profiler

Time: ✅ 15.300ms (SLO: <16.550ms -7.6%) vs baseline: -0.3%

Memory: ✅ 53.732MB (SLO: <54.500MB 🟡 -1.4%) vs baseline: +5.0%


✅ resource-renaming

Time: ✅ 20.547ms (SLO: <21.750ms -5.5%) vs baseline: +0.1%

Memory: ✅ 65.482MB (SLO: <67.000MB -2.3%) vs baseline: +4.9%


✅ span-code-origin

Time: ✅ 26.198ms (SLO: <28.200ms -7.1%) vs baseline: -0.2%

Memory: ✅ 67.633MB (SLO: <69.500MB -2.7%) vs baseline: +4.8%


✅ tracer

Time: ✅ 20.527ms (SLO: <21.750ms -5.6%) vs baseline: ~same

Memory: ✅ 65.363MB (SLO: <67.000MB -2.4%) vs baseline: +4.8%


✅ tracer-and-profiler

Time: ✅ 22.041ms (SLO: <23.500ms -6.2%) vs baseline: ~same

Memory: ✅ 66.542MB (SLO: <67.500MB 🟡 -1.4%) vs baseline: +4.7%


✅ tracer-dont-create-db-spans

Time: ✅ 19.311ms (SLO: <21.500ms 📉 -10.2%) vs baseline: -0.2%

Memory: ✅ 65.460MB (SLO: <66.000MB 🟡 -0.8%) vs baseline: +4.9%


✅ tracer-minimal

Time: ✅ 16.654ms (SLO: <17.500ms -4.8%) vs baseline: +0.1%

Memory: ✅ 65.397MB (SLO: <66.000MB 🟡 -0.9%) vs baseline: +4.8%


✅ tracer-native

Time: ✅ 20.462ms (SLO: <21.750ms -5.9%) vs baseline: -0.2%

Memory: ✅ 71.360MB (SLO: <72.500MB 🟡 -1.6%) vs baseline: +4.8%


✅ tracer-no-caches

Time: ✅ 18.471ms (SLO: <19.650ms -6.0%) vs baseline: -0.4%

Memory: ✅ 65.406MB (SLO: <67.000MB -2.4%) vs baseline: +4.8%


✅ tracer-no-databases

Time: ✅ 18.773ms (SLO: <20.100ms -6.6%) vs baseline: ~same

Memory: ✅ 65.258MB (SLO: <67.000MB -2.6%) vs baseline: +4.7%


✅ tracer-no-middleware

Time: ✅ 20.155ms (SLO: <21.500ms -6.3%) vs baseline: -0.4%

Memory: ✅ 65.431MB (SLO: <67.000MB -2.3%) vs baseline: +4.8%


✅ tracer-no-templates

Time: ✅ 20.293ms (SLO: <22.000ms -7.8%) vs baseline: -0.3%

Memory: ✅ 65.457MB (SLO: <67.000MB -2.3%) vs baseline: +5.0%


🟡 errortrackingdjangosimple - 6/6

✅ errortracking-enabled-all

Time: ✅ 18.052ms (SLO: <19.850ms -9.1%) vs baseline: -0.2%

Memory: ✅ 65.254MB (SLO: <66.500MB 🟡 -1.9%) vs baseline: +4.9%


✅ errortracking-enabled-user

Time: ✅ 18.114ms (SLO: <19.400ms -6.6%) vs baseline: +0.5%

Memory: ✅ 65.274MB (SLO: <66.500MB 🟡 -1.8%) vs baseline: +4.9%


✅ tracer-enabled

Time: ✅ 18.245ms (SLO: <19.450ms -6.2%) vs baseline: +0.8%

Memory: ✅ 65.235MB (SLO: <66.500MB 🟡 -1.9%) vs baseline: +4.9%


🟡 flasksimple - 18/18

✅ appsec-get

Time: ✅ 4.590ms (SLO: <4.750ms -3.4%) vs baseline: ~same

Memory: ✅ 62.030MB (SLO: <65.000MB -4.6%) vs baseline: +5.3%


✅ appsec-post

Time: ✅ 6.606ms (SLO: <6.750ms -2.1%) vs baseline: -0.2%

Memory: ✅ 61.991MB (SLO: <65.000MB -4.6%) vs baseline: +4.5%


✅ appsec-telemetry

Time: ✅ 4.584ms (SLO: <4.750ms -3.5%) vs baseline: -0.5%

Memory: ✅ 61.991MB (SLO: <65.000MB -4.6%) vs baseline: +4.8%


✅ debugger

Time: ✅ 1.860ms (SLO: <2.000ms -7.0%) vs baseline: +0.4%

Memory: ✅ 45.554MB (SLO: <47.000MB -3.1%) vs baseline: +5.1%


✅ iast-get

Time: ✅ 1.865ms (SLO: <2.000ms -6.8%) vs baseline: ~same

Memory: ✅ 42.389MB (SLO: <49.000MB 📉 -13.5%) vs baseline: +5.1%


✅ profiler

Time: ✅ 1.914ms (SLO: <2.100ms -8.8%) vs baseline: -0.2%

Memory: ✅ 46.475MB (SLO: <47.000MB 🟡 -1.1%) vs baseline: +4.7%


✅ resource-renaming

Time: ✅ 3.369ms (SLO: <3.650ms -7.7%) vs baseline: -0.4%

Memory: ✅ 52.180MB (SLO: <53.500MB -2.5%) vs baseline: +4.7%


✅ tracer

Time: ✅ 3.364ms (SLO: <3.650ms -7.8%) vs baseline: ~same

Memory: ✅ 52.278MB (SLO: <53.500MB -2.3%) vs baseline: +5.0%


✅ tracer-native

Time: ✅ 3.370ms (SLO: <3.650ms -7.7%) vs baseline: +0.2%

Memory: ✅ 58.326MB (SLO: <60.000MB -2.8%) vs baseline: +5.3%


🟡 otelspan - 22/22

✅ add-event

Time: ✅ 41.982ms (SLO: <47.150ms 📉 -11.0%) vs baseline: +2.3%

Memory: ✅ 44.211MB (SLO: <47.000MB -5.9%) vs baseline: +5.1%


✅ add-metrics

Time: ✅ 315.746ms (SLO: <344.800ms -8.4%) vs baseline: -0.8%

Memory: ✅ 617.214MB (SLO: <630.000MB -2.0%) vs baseline: +4.8%


✅ add-tags

Time: ✅ 292.387ms (SLO: <314.000ms -6.9%) vs baseline: +1.0%

Memory: ✅ 618.885MB (SLO: <630.000MB 🟡 -1.8%) vs baseline: +4.8%


✅ get-context

Time: ✅ 80.819ms (SLO: <92.350ms 📉 -12.5%) vs baseline: -0.3%

Memory: ✅ 39.737MB (SLO: <46.500MB 📉 -14.5%) vs baseline: +4.8%


✅ is-recording

Time: ✅ 39.061ms (SLO: <44.500ms 📉 -12.2%) vs baseline: +1.8%

Memory: ✅ 43.639MB (SLO: <47.500MB -8.1%) vs baseline: +4.8%


✅ record-exception

Time: ✅ 59.562ms (SLO: <67.650ms 📉 -12.0%) vs baseline: +1.8%

Memory: ✅ 40.126MB (SLO: <47.000MB 📉 -14.6%) vs baseline: +4.9%


✅ set-status

Time: ✅ 44.382ms (SLO: <50.400ms 📉 -11.9%) vs baseline: +0.5%

Memory: ✅ 43.578MB (SLO: <47.000MB -7.3%) vs baseline: +4.8%


✅ start

Time: ✅ 37.949ms (SLO: <43.450ms 📉 -12.7%) vs baseline: +0.9%

Memory: ✅ 43.591MB (SLO: <47.000MB -7.3%) vs baseline: +4.8%


✅ start-finish

Time: ✅ 83.017ms (SLO: <88.000ms -5.7%) vs baseline: +0.2%

Memory: ✅ 34.524MB (SLO: <46.500MB 📉 -25.8%) vs baseline: +4.7%


✅ start-finish-telemetry

Time: ✅ 85.732ms (SLO: <89.000ms -3.7%) vs baseline: +1.9%

Memory: ✅ 34.544MB (SLO: <46.500MB 📉 -25.7%) vs baseline: +5.0%


✅ update-name

Time: ✅ 39.373ms (SLO: <45.150ms 📉 -12.8%) vs baseline: +0.4%

Memory: ✅ 43.889MB (SLO: <47.000MB -6.6%) vs baseline: +4.8%


🟡 span - 26/26

✅ add-event

Time: ✅ 20.554ms (SLO: <22.500ms -8.6%) vs baseline: +0.1%

Memory: ✅ 49.678MB (SLO: <53.000MB -6.3%) vs baseline: +4.9%


✅ add-metrics

Time: ✅ 90.681ms (SLO: <93.500ms -3.0%) vs baseline: +0.7%

Memory: ✅ 689.975MB (SLO: <961.000MB 📉 -28.2%) vs baseline: +4.8%


✅ add-tags

Time: ✅ 149.119ms (SLO: <155.000ms -3.8%) vs baseline: +0.7%

Memory: ✅ 690.822MB (SLO: <962.500MB 📉 -28.2%) vs baseline: +4.8%


✅ get-context

Time: ✅ 18.965ms (SLO: <20.500ms -7.5%) vs baseline: +1.5%

Memory: ✅ 48.436MB (SLO: <53.000MB -8.6%) vs baseline: +5.0%


✅ is-recording

Time: ✅ 18.927ms (SLO: <20.500ms -7.7%) vs baseline: -0.2%

Memory: ✅ 48.474MB (SLO: <53.000MB -8.5%) vs baseline: +4.9%


✅ record-exception

Time: ✅ 37.919ms (SLO: <40.000ms -5.2%) vs baseline: +0.3%

Memory: ✅ 42.453MB (SLO: <53.000MB 📉 -19.9%) vs baseline: +4.7%


✅ set-status

Time: ✅ 20.773ms (SLO: <22.000ms -5.6%) vs baseline: -0.2%

Memory: ✅ 48.456MB (SLO: <53.000MB -8.6%) vs baseline: +4.9%


✅ start

Time: ✅ 18.685ms (SLO: <20.500ms -8.9%) vs baseline: -0.2%

Memory: ✅ 48.403MB (SLO: <53.000MB -8.7%) vs baseline: +4.7%


✅ start-finish

Time: ✅ 51.572ms (SLO: <52.500ms 🟡 -1.8%) vs baseline: ~same

Memory: ✅ 32.185MB (SLO: <34.000MB -5.3%) vs baseline: +4.8%


✅ start-finish-telemetry

Time: ✅ 52.788ms (SLO: <54.500ms -3.1%) vs baseline: +0.3%

Memory: ✅ 32.204MB (SLO: <34.000MB -5.3%) vs baseline: +5.0%


✅ start-finish-traceid128

Time: ✅ 54.787ms (SLO: <57.000ms -3.9%) vs baseline: -0.2%

Memory: ✅ 32.126MB (SLO: <34.000MB -5.5%) vs baseline: +4.9%


✅ start-traceid128

Time: ✅ 19.013ms (SLO: <22.500ms 📉 -15.5%) vs baseline: -0.2%

Memory: ✅ 48.390MB (SLO: <53.000MB -8.7%) vs baseline: +4.9%


✅ update-name

Time: ✅ 19.558ms (SLO: <22.000ms 📉 -11.1%) vs baseline: +1.7%

Memory: ✅ 49.122MB (SLO: <53.000MB -7.3%) vs baseline: +4.8%

⚠️ Unstable Tests (1 suite)
⚠️ coreapiscenario - 10/10 (1 unstable)

⚠️ context_with_data_listeners

Time: ⚠️ 13.334µs (SLO: <20.000µs 📉 -33.3%) vs baseline: +0.7%

Memory: ✅ 32.165MB (SLO: <33.500MB -4.0%) vs baseline: +4.9%


✅ context_with_data_no_listeners

Time: ✅ 3.274µs (SLO: <10.000µs 📉 -67.3%) vs baseline: -0.2%

Memory: ✅ 32.126MB (SLO: <33.500MB -4.1%) vs baseline: +4.9%


✅ get_item_exists

Time: ✅ 0.583µs (SLO: <10.000µs 📉 -94.2%) vs baseline: -0.1%

Memory: ✅ 32.047MB (SLO: <33.500MB -4.3%) vs baseline: +4.8%


✅ get_item_missing

Time: ✅ 0.642µs (SLO: <10.000µs 📉 -93.6%) vs baseline: -1.2%

Memory: ✅ 32.145MB (SLO: <33.500MB -4.0%) vs baseline: +5.0%


✅ set_item

Time: ✅ 24.491µs (SLO: <30.000µs 📉 -18.4%) vs baseline: +1.4%

Memory: ✅ 32.126MB (SLO: <33.500MB -4.1%) vs baseline: +5.1%

✅ All Tests Passing (17 suites)
errortrackingflasksqli - 6/6

✅ errortracking-enabled-all

Time: ✅ 2.077ms (SLO: <2.300ms -9.7%) vs baseline: +0.1%

Memory: ✅ 52.081MB (SLO: <53.500MB -2.7%) vs baseline: +4.8%


✅ errortracking-enabled-user

Time: ✅ 2.076ms (SLO: <2.250ms -7.7%) vs baseline: +0.2%

Memory: ✅ 52.140MB (SLO: <53.500MB -2.5%) vs baseline: +4.8%


✅ tracer-enabled

Time: ✅ 2.091ms (SLO: <2.300ms -9.1%) vs baseline: +1.0%

Memory: ✅ 52.101MB (SLO: <53.500MB -2.6%) vs baseline: +4.7%


flasksqli - 6/6

✅ appsec-enabled

Time: ✅ 3.955ms (SLO: <4.200ms -5.8%) vs baseline: +0.4%

Memory: ✅ 62.325MB (SLO: <66.000MB -5.6%) vs baseline: +4.9%


✅ iast-enabled

Time: ✅ 2.430ms (SLO: <2.800ms 📉 -13.2%) vs baseline: -0.7%

Memory: ✅ 58.629MB (SLO: <60.000MB -2.3%) vs baseline: +4.8%


✅ tracer-enabled

Time: ✅ 2.059ms (SLO: <2.250ms -8.5%) vs baseline: ~same

Memory: ✅ 52.101MB (SLO: <54.500MB -4.4%) vs baseline: +4.6%


httppropagationextract - 60/60

✅ all_styles_all_headers

Time: ✅ 81.238µs (SLO: <100.000µs 📉 -18.8%) vs baseline: -0.9%

Memory: ✅ 32.204MB (SLO: <33.500MB -3.9%) vs baseline: +5.1%


✅ b3_headers

Time: ✅ 14.288µs (SLO: <20.000µs 📉 -28.6%) vs baseline: +1.0%

Memory: ✅ 32.145MB (SLO: <33.500MB -4.0%) vs baseline: +4.7%


✅ b3_single_headers

Time: ✅ 13.315µs (SLO: <20.000µs 📉 -33.4%) vs baseline: ~same

Memory: ✅ 32.185MB (SLO: <33.500MB -3.9%) vs baseline: +5.1%


✅ datadog_tracecontext_tracestate_not_propagated_on_trace_id_no_match

Time: ✅ 63.446µs (SLO: <80.000µs 📉 -20.7%) vs baseline: -0.7%

Memory: ✅ 32.204MB (SLO: <33.500MB -3.9%) vs baseline: +4.9%


✅ datadog_tracecontext_tracestate_propagated_on_trace_id_match

Time: ✅ 66.136µs (SLO: <80.000µs 📉 -17.3%) vs baseline: ~same

Memory: ✅ 32.126MB (SLO: <33.500MB -4.1%) vs baseline: +4.6%


✅ empty_headers

Time: ✅ 1.592µs (SLO: <10.000µs 📉 -84.1%) vs baseline: -0.5%

Memory: ✅ 32.165MB (SLO: <33.500MB -4.0%) vs baseline: +5.0%


✅ full_t_id_datadog_headers

Time: ✅ 22.768µs (SLO: <30.000µs 📉 -24.1%) vs baseline: -1.0%

Memory: ✅ 32.185MB (SLO: <33.500MB -3.9%) vs baseline: +4.9%


✅ invalid_priority_header

Time: ✅ 6.596µs (SLO: <10.000µs 📉 -34.0%) vs baseline: +1.6%

Memory: ✅ 32.126MB (SLO: <33.500MB -4.1%) vs baseline: +4.8%


✅ invalid_span_id_header

Time: ✅ 6.532µs (SLO: <10.000µs 📉 -34.7%) vs baseline: +0.2%

Memory: ✅ 32.126MB (SLO: <33.500MB -4.1%) vs baseline: +4.7%


✅ invalid_tags_header

Time: ✅ 6.507µs (SLO: <10.000µs 📉 -34.9%) vs baseline: -0.2%

Memory: ✅ 32.185MB (SLO: <33.500MB -3.9%) vs baseline: +5.0%


✅ invalid_trace_id_header

Time: ✅ 6.520µs (SLO: <10.000µs 📉 -34.8%) vs baseline: -0.2%

Memory: ✅ 32.165MB (SLO: <33.500MB -4.0%) vs baseline: +4.7%


✅ large_header_no_matches

Time: ✅ 27.609µs (SLO: <30.000µs -8.0%) vs baseline: +0.5%

Memory: ✅ 32.204MB (SLO: <33.500MB -3.9%) vs baseline: +5.2%


✅ large_valid_headers_all

Time: ✅ 28.591µs (SLO: <40.000µs 📉 -28.5%) vs baseline: -0.1%

Memory: ✅ 32.165MB (SLO: <33.500MB -4.0%) vs baseline: +4.9%


✅ medium_header_no_matches

Time: ✅ 9.763µs (SLO: <20.000µs 📉 -51.2%) vs baseline: -1.5%

Memory: ✅ 32.185MB (SLO: <33.500MB -3.9%) vs baseline: +4.9%


✅ medium_valid_headers_all

Time: ✅ 11.269µs (SLO: <20.000µs 📉 -43.7%) vs baseline: +0.4%

Memory: ✅ 32.185MB (SLO: <33.500MB -3.9%) vs baseline: +4.7%


✅ none_propagation_style

Time: ✅ 1.695µs (SLO: <10.000µs 📉 -83.0%) vs baseline: +0.8%

Memory: ✅ 32.145MB (SLO: <33.500MB -4.0%) vs baseline: +4.9%


✅ tracecontext_headers

Time: ✅ 34.407µs (SLO: <40.000µs 📉 -14.0%) vs baseline: ~same

Memory: ✅ 32.086MB (SLO: <33.500MB -4.2%) vs baseline: +4.7%


✅ valid_headers_all

Time: ✅ 6.546µs (SLO: <10.000µs 📉 -34.5%) vs baseline: +0.3%

Memory: ✅ 32.086MB (SLO: <33.500MB -4.2%) vs baseline: +4.5%


✅ valid_headers_basic

Time: ✅ 6.088µs (SLO: <10.000µs 📉 -39.1%) vs baseline: +0.5%

Memory: ✅ 32.145MB (SLO: <33.500MB -4.0%) vs baseline: +4.7%


✅ wsgi_empty_headers

Time: ✅ 1.593µs (SLO: <10.000µs 📉 -84.1%) vs baseline: -0.4%

Memory: ✅ 32.145MB (SLO: <33.500MB -4.0%) vs baseline: +4.6%


✅ wsgi_invalid_priority_header

Time: ✅ 6.579µs (SLO: <10.000µs 📉 -34.2%) vs baseline: +0.6%

Memory: ✅ 32.204MB (SLO: <33.500MB -3.9%) vs baseline: +5.1%


✅ wsgi_invalid_span_id_header

Time: ✅ 1.591µs (SLO: <10.000µs 📉 -84.1%) vs baseline: +0.9%

Memory: ✅ 32.126MB (SLO: <33.500MB -4.1%) vs baseline: +4.5%


✅ wsgi_invalid_tags_header

Time: ✅ 6.559µs (SLO: <10.000µs 📉 -34.4%) vs baseline: ~same

Memory: ✅ 32.165MB (SLO: <33.500MB -4.0%) vs baseline: +5.0%


✅ wsgi_invalid_trace_id_header

Time: ✅ 6.573µs (SLO: <10.000µs 📉 -34.3%) vs baseline: -0.4%

Memory: ✅ 32.244MB (SLO: <33.500MB -3.8%) vs baseline: +5.0%


✅ wsgi_large_header_no_matches

Time: ✅ 28.653µs (SLO: <40.000µs 📉 -28.4%) vs baseline: +0.2%

Memory: ✅ 32.204MB (SLO: <33.500MB -3.9%) vs baseline: +5.0%


✅ wsgi_large_valid_headers_all

Time: ✅ 29.858µs (SLO: <40.000µs 📉 -25.4%) vs baseline: +0.5%

Memory: ✅ 32.145MB (SLO: <33.500MB -4.0%) vs baseline: +4.9%


✅ wsgi_medium_header_no_matches

Time: ✅ 10.090µs (SLO: <20.000µs 📉 -49.6%) vs baseline: -0.1%

Memory: ✅ 32.204MB (SLO: <33.500MB -3.9%) vs baseline: +5.2%


✅ wsgi_medium_valid_headers_all

Time: ✅ 11.566µs (SLO: <20.000µs 📉 -42.2%) vs baseline: +1.0%

Memory: ✅ 32.204MB (SLO: <33.500MB -3.9%) vs baseline: +5.1%


✅ wsgi_valid_headers_all

Time: ✅ 6.572µs (SLO: <10.000µs 📉 -34.3%) vs baseline: +0.7%

Memory: ✅ 32.106MB (SLO: <33.500MB -4.2%) vs baseline: +4.5%


✅ wsgi_valid_headers_basic

Time: ✅ 6.104µs (SLO: <10.000µs 📉 -39.0%) vs baseline: +0.2%

Memory: ✅ 32.165MB (SLO: <33.500MB -4.0%) vs baseline: +4.8%


httppropagationinject - 16/16

✅ ids_only

Time: ✅ 21.704µs (SLO: <30.000µs 📉 -27.7%) vs baseline: ~same

Memory: ✅ 32.145MB (SLO: <33.500MB -4.0%) vs baseline: +4.7%


✅ with_all

Time: ✅ 28.794µs (SLO: <40.000µs 📉 -28.0%) vs baseline: ~same

Memory: ✅ 32.185MB (SLO: <33.500MB -3.9%) vs baseline: +4.7%


✅ with_dd_origin

Time: ✅ 25.444µs (SLO: <30.000µs 📉 -15.2%) vs baseline: +0.3%

Memory: ✅ 32.204MB (SLO: <33.500MB -3.9%) vs baseline: +5.1%


✅ with_priority_and_origin

Time: ✅ 24.782µs (SLO: <40.000µs 📉 -38.0%) vs baseline: ~same

Memory: ✅ 32.165MB (SLO: <33.500MB -4.0%) vs baseline: +4.8%


✅ with_sampling_priority

Time: ✅ 22.444µs (SLO: <30.000µs 📉 -25.2%) vs baseline: +3.9%

Memory: ✅ 32.185MB (SLO: <33.500MB -3.9%) vs baseline: +5.0%


✅ with_tags

Time: ✅ 28.297µs (SLO: <40.000µs 📉 -29.3%) vs baseline: +4.7%

Memory: ✅ 32.185MB (SLO: <33.500MB -3.9%) vs baseline: +5.1%


✅ with_tags_invalid

Time: ✅ 29.389µs (SLO: <40.000µs 📉 -26.5%) vs baseline: +2.9%

Memory: ✅ 32.185MB (SLO: <33.500MB -3.9%) vs baseline: +4.8%


✅ with_tags_max_size

Time: ✅ 27.531µs (SLO: <40.000µs 📉 -31.2%) vs baseline: +0.3%

Memory: ✅ 32.244MB (SLO: <33.500MB -3.8%) vs baseline: +5.0%


iast_aspects - 40/40

✅ re_expand_aspect

Time: ✅ 31.857µs (SLO: <40.000µs 📉 -20.4%) vs baseline: -0.7%

Memory: ✅ 37.729MB (SLO: <39.000MB -3.3%) vs baseline: +5.0%


✅ re_expand_noaspect

Time: ✅ 28.759µs (SLO: <40.000µs 📉 -28.1%) vs baseline: +0.2%

Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +5.1%


✅ re_findall_aspect

Time: ✅ 2.916µs (SLO: <10.000µs 📉 -70.8%) vs baseline: ~same

Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.8%


✅ re_findall_noaspect

Time: ✅ 1.415µs (SLO: <10.000µs 📉 -85.8%) vs baseline: -0.4%

Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.7%


✅ re_finditer_aspect

Time: ✅ 4.485µs (SLO: <10.000µs 📉 -55.2%) vs baseline: ~same

Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.7%


✅ re_finditer_noaspect

Time: ✅ 1.404µs (SLO: <10.000µs 📉 -86.0%) vs baseline: -0.7%

Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.9%


✅ re_fullmatch_aspect

Time: ✅ 2.704µs (SLO: <10.000µs 📉 -73.0%) vs baseline: +0.8%

Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.7%


✅ re_fullmatch_noaspect

Time: ✅ 1.302µs (SLO: <10.000µs 📉 -87.0%) vs baseline: +0.6%

Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.7%


✅ re_group_aspect

Time: ✅ 2.979µs (SLO: <10.000µs 📉 -70.2%) vs baseline: +0.8%

Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.9%


✅ re_group_noaspect

Time: ✅ 1.612µs (SLO: <10.000µs 📉 -83.9%) vs baseline: ~same

Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.8%


✅ re_groups_aspect

Time: ✅ 3.067µs (SLO: <10.000µs 📉 -69.3%) vs baseline: -0.6%

Memory: ✅ 37.729MB (SLO: <39.000MB -3.3%) vs baseline: +4.9%


✅ re_groups_noaspect

Time: ✅ 1.700µs (SLO: <10.000µs 📉 -83.0%) vs baseline: +1.1%

Memory: ✅ 37.631MB (SLO: <39.000MB -3.5%) vs baseline: +4.8%


✅ re_match_aspect

Time: ✅ 2.768µs (SLO: <10.000µs 📉 -72.3%) vs baseline: +1.9%

Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.9%


✅ re_match_noaspect

Time: ✅ 1.311µs (SLO: <10.000µs 📉 -86.9%) vs baseline: +0.7%

Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +4.9%


✅ re_search_aspect

Time: ✅ 2.590µs (SLO: <10.000µs 📉 -74.1%) vs baseline: +0.9%

Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +5.0%


✅ re_search_noaspect

Time: ✅ 1.204µs (SLO: <10.000µs 📉 -88.0%) vs baseline: +0.8%

Memory: ✅ 37.631MB (SLO: <39.000MB -3.5%) vs baseline: +4.8%


✅ re_sub_aspect

Time: ✅ 3.471µs (SLO: <10.000µs 📉 -65.3%) vs baseline: +1.1%

Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +5.0%


✅ re_sub_noaspect

Time: ✅ 1.539µs (SLO: <10.000µs 📉 -84.6%) vs baseline: ~same

Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.6%


✅ re_subn_aspect

Time: ✅ 3.644µs (SLO: <10.000µs 📉 -63.6%) vs baseline: -0.7%

Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.9%


✅ re_subn_noaspect

Time: ✅ 1.630µs (SLO: <10.000µs 📉 -83.7%) vs baseline: +0.9%

Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +5.0%


iastaspectsospath - 24/24

✅ ospathbasename_aspect

Time: ✅ 4.345µs (SLO: <10.000µs 📉 -56.6%) vs baseline: +0.4%

Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.8%


✅ ospathbasename_noaspect

Time: ✅ 1.087µs (SLO: <10.000µs 📉 -89.1%) vs baseline: -1.4%

Memory: ✅ 37.631MB (SLO: <39.000MB -3.5%) vs baseline: +4.7%


✅ ospathjoin_aspect

Time: ✅ 6.121µs (SLO: <10.000µs 📉 -38.8%) vs baseline: +0.2%

Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.8%


✅ ospathjoin_noaspect

Time: ✅ 2.312µs (SLO: <10.000µs 📉 -76.9%) vs baseline: -0.1%

Memory: ✅ 37.631MB (SLO: <39.000MB -3.5%) vs baseline: +4.9%


✅ ospathnormcase_aspect

Time: ✅ 3.469µs (SLO: <10.000µs 📉 -65.3%) vs baseline: -0.5%

Memory: ✅ 37.729MB (SLO: <39.000MB -3.3%) vs baseline: +5.0%


✅ ospathnormcase_noaspect

Time: ✅ 0.571µs (SLO: <10.000µs 📉 -94.3%) vs baseline: -0.9%

Memory: ✅ 37.729MB (SLO: <39.000MB -3.3%) vs baseline: +5.0%


✅ ospathsplit_aspect

Time: ✅ 4.872µs (SLO: <10.000µs 📉 -51.3%) vs baseline: +0.2%

Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.8%


✅ ospathsplit_noaspect

Time: ✅ 1.619µs (SLO: <10.000µs 📉 -83.8%) vs baseline: +0.5%

Memory: ✅ 37.611MB (SLO: <39.000MB -3.6%) vs baseline: +4.7%


✅ ospathsplitdrive_aspect

Time: ✅ 3.638µs (SLO: <10.000µs 📉 -63.6%) vs baseline: -0.6%

Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +4.9%


✅ ospathsplitdrive_noaspect

Time: ✅ 0.703µs (SLO: <10.000µs 📉 -93.0%) vs baseline: +0.7%

Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +5.0%


✅ ospathsplitext_aspect

Time: ✅ 4.546µs (SLO: <10.000µs 📉 -54.5%) vs baseline: -0.8%

Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.7%


✅ ospathsplitext_noaspect

Time: ✅ 1.378µs (SLO: <10.000µs 📉 -86.2%) vs baseline: -0.7%

Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.8%


iastaspectssplit - 12/12

✅ rsplit_aspect

Time: ✅ 1.438µs (SLO: <10.000µs 📉 -85.6%) vs baseline: +1.0%

Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +4.8%


✅ rsplit_noaspect

Time: ✅ 0.580µs (SLO: <10.000µs 📉 -94.2%) vs baseline: -0.1%

Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +5.0%


✅ split_aspect

Time: ✅ 1.429µs (SLO: <10.000µs 📉 -85.7%) vs baseline: +0.8%

Memory: ✅ 37.631MB (SLO: <39.000MB -3.5%) vs baseline: +4.6%


✅ split_noaspect

Time: ✅ 0.573µs (SLO: <10.000µs 📉 -94.3%) vs baseline: ~same

Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.7%


✅ splitlines_aspect

Time: ✅ 1.410µs (SLO: <10.000µs 📉 -85.9%) vs baseline: -0.1%

Memory: ✅ 37.729MB (SLO: <39.000MB -3.3%) vs baseline: +4.9%


✅ splitlines_noaspect

Time: ✅ 0.585µs (SLO: <10.000µs 📉 -94.1%) vs baseline: +0.1%

Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +5.1%


iastpropagation - 2/2

✅ no-propagation

Time: ✅ 49.153µs (SLO: <60.000µs 📉 -18.1%) vs baseline: -0.4%

Memory: ✅ 37.729MB (SLO: <39.000MB -3.3%) vs baseline: +4.8%


otelsdkspan - 24/24

✅ add-event

Time: ✅ 40.212ms (SLO: <42.000ms -4.3%) vs baseline: -0.3%

Memory: ✅ 34.524MB (SLO: <39.000MB 📉 -11.5%) vs baseline: +4.7%


✅ add-link

Time: ✅ 36.223ms (SLO: <38.550ms -6.0%) vs baseline: ~same

Memory: ✅ 34.426MB (SLO: <39.000MB 📉 -11.7%) vs baseline: +4.5%


✅ add-metrics

Time: ✅ 219.411ms (SLO: <232.000ms -5.4%) vs baseline: +0.2%

Memory: ✅ 34.505MB (SLO: <39.000MB 📉 -11.5%) vs baseline: +3.7%


✅ add-tags

Time: ✅ 211.480ms (SLO: <221.600ms -4.6%) vs baseline: -0.2%

Memory: ✅ 34.485MB (SLO: <39.000MB 📉 -11.6%) vs baseline: +4.7%


✅ get-context

Time: ✅ 29.321ms (SLO: <31.300ms -6.3%) vs baseline: +1.3%

Memory: ✅ 34.465MB (SLO: <39.000MB 📉 -11.6%) vs baseline: +4.8%


✅ is-recording

Time: ✅ 29.081ms (SLO: <31.000ms -6.2%) vs baseline: ~same

Memory: ✅ 34.524MB (SLO: <39.000MB 📉 -11.5%) vs baseline: +4.8%


✅ record-exception

Time: ✅ 63.272ms (SLO: <65.850ms -3.9%) vs baseline: +0.3%

Memory: ✅ 34.898MB (SLO: <39.000MB 📉 -10.5%) vs baseline: +4.9%


✅ set-status

Time: ✅ 31.967ms (SLO: <34.150ms -6.4%) vs baseline: +0.5%

Memory: ✅ 34.583MB (SLO: <39.000MB 📉 -11.3%) vs baseline: +4.8%


✅ start

Time: ✅ 28.819ms (SLO: <30.150ms -4.4%) vs baseline: +0.4%

Memory: ✅ 34.524MB (SLO: <39.000MB 📉 -11.5%) vs baseline: +4.7%


✅ start-finish

Time: ✅ 34.004ms (SLO: <35.350ms -3.8%) vs baseline: +0.3%

Memory: ✅ 34.564MB (SLO: <39.000MB 📉 -11.4%) vs baseline: +4.9%


✅ start-finish-telemetry

Time: ✅ 33.785ms (SLO: <35.450ms -4.7%) vs baseline: ~same

Memory: ✅ 34.544MB (SLO: <39.000MB 📉 -11.4%) vs baseline: +4.9%


✅ update-name

Time: ✅ 30.997ms (SLO: <33.400ms -7.2%) vs baseline: +0.5%

Memory: ✅ 34.544MB (SLO: <39.000MB 📉 -11.4%) vs baseline: +4.6%


packagespackageforrootmodulemapping - 4/4

✅ cache_off

Time: ✅ 341.495ms (SLO: <354.300ms -3.6%) vs baseline: -0.6%

Memory: ✅ 37.771MB (SLO: <40.000MB -5.6%) vs baseline: +4.7%


✅ cache_on

Time: ✅ 0.381µs (SLO: <10.000µs 📉 -96.2%) vs baseline: -2.1%

Memory: ✅ 37.631MB (SLO: <39.000MB -3.5%) vs baseline: +4.9%


packagesupdateimporteddependencies - 24/24

✅ import_many

Time: ✅ 154.894µs (SLO: <170.000µs -8.9%) vs baseline: -0.5%

Memory: ✅ 37.237MB (SLO: <38.500MB -3.3%) vs baseline: +3.8%


✅ import_many_cached

Time: ✅ 120.466µs (SLO: <130.000µs -7.3%) vs baseline: -0.6%

Memory: ✅ 36.889MB (SLO: <38.500MB -4.2%) vs baseline: +4.8%


✅ import_many_stdlib

Time: ✅ 1.640ms (SLO: <1.750ms -6.3%) vs baseline: +0.7%

Memory: ✅ 37.392MB (SLO: <38.500MB -2.9%) vs baseline: +5.2%


✅ import_many_stdlib_cached

Time: ✅ 0.981ms (SLO: <1.100ms 📉 -10.8%) vs baseline: -0.7%

Memory: ✅ 37.332MB (SLO: <38.500MB -3.0%) vs baseline: +5.7%


✅ import_many_unknown

Time: ✅ 827.243µs (SLO: <890.000µs -7.1%) vs baseline: -1.7%

Memory: ✅ 37.072MB (SLO: <38.500MB -3.7%) vs baseline: +4.8%


✅ import_many_unknown_cached

Time: ✅ 787.246µs (SLO: <870.000µs -9.5%) vs baseline: -0.8%

Memory: ✅ 37.039MB (SLO: <38.500MB -3.8%) vs baseline: +3.9%


✅ import_one

Time: ✅ 19.683µs (SLO: <30.000µs 📉 -34.4%) vs baseline: -0.8%

Memory: ✅ 37.136MB (SLO: <39.000MB -4.8%) vs baseline: +4.8%


✅ import_one_cache

Time: ✅ 6.321µs (SLO: <10.000µs 📉 -36.8%) vs baseline: +0.6%

Memory: ✅ 36.963MB (SLO: <38.500MB -4.0%) vs baseline: +5.3%


✅ import_one_stdlib

Time: ✅ 18.763µs (SLO: <20.000µs -6.2%) vs baseline: +0.6%

Memory: ✅ 36.883MB (SLO: <38.500MB -4.2%) vs baseline: +4.9%


✅ import_one_stdlib_cache

Time: ✅ 6.335µs (SLO: <10.000µs 📉 -36.6%) vs baseline: +0.4%

Memory: ✅ 36.780MB (SLO: <38.500MB -4.5%) vs baseline: +3.8%


✅ import_one_unknown

Time: ✅ 45.472µs (SLO: <50.000µs -9.1%) vs baseline: +0.2%

Memory: ✅ 36.955MB (SLO: <38.500MB -4.0%) vs baseline: +4.2%


✅ import_one_unknown_cache

Time: ✅ 6.293µs (SLO: <10.000µs 📉 -37.1%) vs baseline: ~same

Memory: ✅ 36.894MB (SLO: <38.500MB -4.2%) vs baseline: +4.2%


ratelimiter - 12/12

✅ defaults

Time: ✅ 2.355µs (SLO: <10.000µs 📉 -76.4%) vs baseline: ~same

Memory: ✅ 31.792MB (SLO: <34.000MB -6.5%) vs baseline: +5.0%


✅ high_rate_limit

Time: ✅ 2.436µs (SLO: <10.000µs 📉 -75.6%) vs baseline: +1.8%

Memory: ✅ 31.831MB (SLO: <34.000MB -6.4%) vs baseline: +4.9%


✅ long_window

Time: ✅ 2.364µs (SLO: <10.000µs 📉 -76.4%) vs baseline: +0.4%

Memory: ✅ 31.792MB (SLO: <34.000MB -6.5%) vs baseline: +5.0%


✅ low_rate_limit

Time: ✅ 2.354µs (SLO: <10.000µs 📉 -76.5%) vs baseline: +0.8%

Memory: ✅ 31.772MB (SLO: <34.000MB -6.6%) vs baseline: +5.0%


✅ no_rate_limit

Time: ✅ 0.836µs (SLO: <10.000µs 📉 -91.6%) vs baseline: +1.1%

Memory: ✅ 31.713MB (SLO: <34.000MB -6.7%) vs baseline: +4.6%


✅ short_window

Time: ✅ 2.509µs (SLO: <10.000µs 📉 -74.9%) vs baseline: +0.2%

Memory: ✅ 31.772MB (SLO: <34.000MB -6.6%) vs baseline: +4.9%


recursivecomputation - 8/8

✅ deep

Time: ✅ 310.122ms (SLO: <320.950ms -3.4%) vs baseline: +0.2%

Memory: ✅ 32.932MB (SLO: <34.500MB -4.5%) vs baseline: +4.9%


✅ deep-profiled

Time: ✅ 327.627ms (SLO: <359.150ms -8.8%) vs baseline: ~same

Memory: ✅ 37.450MB (SLO: <39.000MB -4.0%) vs baseline: +4.3%


✅ medium

Time: ✅ 7.068ms (SLO: <7.400ms -4.5%) vs baseline: ~same

Memory: ✅ 32.145MB (SLO: <34.000MB -5.5%) vs baseline: +4.9%


✅ shallow

Time: ✅ 0.954ms (SLO: <1.050ms -9.1%) vs baseline: -0.2%

Memory: ✅ 32.126MB (SLO: <34.000MB -5.5%) vs baseline: +4.9%


samplingrules - 8/8

✅ average_match

Time: ✅ 137.954µs (SLO: <290.000µs 📉 -52.4%) vs baseline: ~same

Memory: ✅ 32.165MB (SLO: <34.000MB -5.4%) vs baseline: +5.1%


✅ high_match

Time: ✅ 174.885µs (SLO: <480.000µs 📉 -63.6%) vs baseline: +0.4%

Memory: ✅ 32.126MB (SLO: <34.000MB -5.5%) vs baseline: +4.7%


✅ low_match

Time: ✅ 98.187µs (SLO: <120.000µs 📉 -18.2%) vs baseline: -1.3%

Memory: ✅ 600.965MB (SLO: <700.000MB 📉 -14.1%) vs baseline: +4.9%


✅ very_low_match

Time: ✅ 2.665ms (SLO: <8.500ms 📉 -68.6%) vs baseline: -0.4%

Memory: ✅ 68.225MB (SLO: <75.000MB -9.0%) vs baseline: +4.7%


sethttpmeta - 32/32

✅ all-disabled

Time: ✅ 10.530µs (SLO: <20.000µs 📉 -47.3%) vs baseline: +0.6%

Memory: ✅ 32.499MB (SLO: <34.000MB -4.4%) vs baseline: +4.5%


✅ all-enabled

Time: ✅ 40.060µs (SLO: <50.000µs 📉 -19.9%) vs baseline: -0.3%

Memory: ✅ 32.539MB (SLO: <34.000MB -4.3%) vs baseline: +4.9%


✅ collectipvariant_exists

Time: ✅ 40.816µs (SLO: <50.000µs 📉 -18.4%) vs baseline: +0.4%

Memory: ✅ 32.637MB (SLO: <34.000MB -4.0%) vs baseline: +5.1%


✅ no-collectipvariant

Time: ✅ 41.198µs (SLO: <50.000µs 📉 -17.6%) vs baseline: +2.6%

Memory: ✅ 32.617MB (SLO: <34.000MB -4.1%) vs baseline: +5.0%


✅ no-useragentvariant

Time: ✅ 38.804µs (SLO: <50.000µs 📉 -22.4%) vs baseline: ~same

Memory: ✅ 32.539MB (SLO: <34.000MB -4.3%) vs baseline: +4.9%


✅ obfuscation-no-query

Time: ✅ 40.544µs (SLO: <50.000µs 📉 -18.9%) vs baseline: ~same

Memory: ✅ 32.578MB (SLO: <34.000MB -4.2%) vs baseline: +4.8%


✅ obfuscation-regular-case-explicit-query

Time: ✅ 76.275µs (SLO: <90.000µs 📉 -15.2%) vs baseline: +0.8%

Memory: ✅ 32.971MB (SLO: <34.000MB -3.0%) vs baseline: +4.9%


✅ obfuscation-regular-case-implicit-query

Time: ✅ 76.636µs (SLO: <90.000µs 📉 -14.8%) vs baseline: +0.3%

Memory: ✅ 32.971MB (SLO: <34.000MB -3.0%) vs baseline: +4.8%


✅ obfuscation-send-querystring-disabled

Time: ✅ 154.405µs (SLO: <170.000µs -9.2%) vs baseline: ~same

Memory: ✅ 32.952MB (SLO: <34.500MB -4.5%) vs baseline: +4.9%


✅ obfuscation-worst-case-explicit-query

Time: ✅ 149.863µs (SLO: <160.000µs -6.3%) vs baseline: +0.8%

Memory: ✅ 33.010MB (SLO: <34.500MB -4.3%) vs baseline: +5.2%


✅ obfuscation-worst-case-implicit-query

Time: ✅ 155.031µs (SLO: <170.000µs -8.8%) vs baseline: -0.1%

Memory: ✅ 33.030MB (SLO: <34.500MB -4.3%) vs baseline: +5.0%


✅ useragentvariant_exists_1

Time: ✅ 39.687µs (SLO: <50.000µs 📉 -20.6%) vs baseline: +0.4%

Memory: ✅ 32.558MB (SLO: <34.000MB -4.2%) vs baseline: +4.7%


✅ useragentvariant_exists_2

Time: ✅ 40.715µs (SLO: <50.000µs 📉 -18.6%) vs baseline: +0.4%

Memory: ✅ 32.558MB (SLO: <34.000MB -4.2%) vs baseline: +4.9%


✅ useragentvariant_exists_3

Time: ✅ 40.065µs (SLO: <50.000µs 📉 -19.9%) vs baseline: -0.2%

Memory: ✅ 32.558MB (SLO: <34.000MB -4.2%) vs baseline: +4.8%


✅ useragentvariant_not_exists_1

Time: ✅ 39.650µs (SLO: <50.000µs 📉 -20.7%) vs baseline: +0.3%

Memory: ✅ 32.598MB (SLO: <34.000MB -4.1%) vs baseline: +5.0%


✅ useragentvariant_not_exists_2

Time: ✅ 39.529µs (SLO: <50.000µs 📉 -20.9%) vs baseline: +0.1%

Memory: ✅ 32.598MB (SLO: <34.000MB -4.1%) vs baseline: +4.8%


telemetryaddmetric - 30/30

✅ 1-count-metric-1-times

Time: ✅ 3.322µs (SLO: <20.000µs 📉 -83.4%) vs baseline: +6.1%

Memory: ✅ 32.145MB (SLO: <34.000MB -5.5%) vs baseline: +5.0%


✅ 1-count-metrics-100-times

Time: ✅ 214.468µs (SLO: <250.000µs 📉 -14.2%) vs baseline: -0.3%

Memory: ✅ 32.145MB (SLO: <34.000MB -5.5%) vs baseline: +4.7%


✅ 1-distribution-metric-1-times

Time: ✅ 2.948µs (SLO: <20.000µs 📉 -85.3%) vs baseline: +1.6%

Memory: ✅ 32.106MB (SLO: <34.000MB -5.6%) vs baseline: +4.9%


✅ 1-distribution-metrics-100-times

Time: ✅ 191.116µs (SLO: <220.000µs 📉 -13.1%) vs baseline: -0.2%

Memory: ✅ 32.126MB (SLO: <34.000MB -5.5%) vs baseline: +4.8%


✅ 1-gauge-metric-1-times

Time: ✅ 2.109µs (SLO: <20.000µs 📉 -89.5%) vs baseline: +0.7%

Memory: ✅ 32.027MB (SLO: <34.000MB -5.8%) vs baseline: +4.6%


✅ 1-gauge-metrics-100-times

Time: ✅ 125.274µs (SLO: <150.000µs 📉 -16.5%) vs baseline: -0.3%

Memory: ✅ 32.126MB (SLO: <34.000MB -5.5%) vs baseline: +4.8%


✅ 1-rate-metric-1-times

Time: ✅ 3.341µs (SLO: <20.000µs 📉 -83.3%) vs baseline: +7.1%

Memory: ✅ 32.126MB (SLO: <34.000MB -5.5%) vs baseline: +4.7%


✅ 1-rate-metrics-100-times

Time: ✅ 213.230µs (SLO: <250.000µs 📉 -14.7%) vs baseline: +0.7%

Memory: ✅ 32.126MB (SLO: <34.000MB -5.5%) vs baseline: +5.1%


✅ 100-count-metrics-100-times

Time: ✅ 21.598ms (SLO: <23.500ms -8.1%) vs baseline: +1.1%

Memory: ✅ 32.145MB (SLO: <34.000MB -5.5%) vs baseline: +5.0%


✅ 100-distribution-metrics-100-times

Time: ✅ 2.004ms (SLO: <2.250ms 📉 -10.9%) vs baseline: +0.1%

Memory: ✅ 32.204MB (SLO: <34.000MB -5.3%) vs baseline: +5.1%


✅ 100-gauge-metrics-100-times

Time: ✅ 1.304ms (SLO: <1.550ms 📉 -15.9%) vs baseline: +0.6%

Memory: ✅ 32.145MB (SLO: <34.000MB -5.5%) vs baseline: +5.1%


✅ 100-rate-metrics-100-times

Time: ✅ 2.238ms (SLO: <2.550ms 📉 -12.2%) vs baseline: +2.0%

Memory: ✅ 32.106MB (SLO: <34.000MB -5.6%) vs baseline: +4.5%


✅ flush-1-metric

Time: ✅ 4.296µs (SLO: <20.000µs 📉 -78.5%) vs baseline: +1.1%

Memory: ✅ 32.145MB (SLO: <34.000MB -5.5%) vs baseline: +4.9%


✅ flush-100-metrics

Time: ✅ 181.852µs (SLO: <250.000µs 📉 -27.3%) vs baseline: -0.2%

Memory: ✅ 32.126MB (SLO: <34.000MB -5.5%) vs baseline: +5.1%


✅ flush-1000-metrics

Time: ✅ 2.223ms (SLO: <2.500ms 📉 -11.1%) vs baseline: +0.1%

Memory: ✅ 32.932MB (SLO: <34.500MB -4.5%) vs baseline: +4.9%


tracer - 6/6

✅ large

Time: ✅ 30.176ms (SLO: <32.950ms -8.4%) vs baseline: -0.2%

Memory: ✅ 32.932MB (SLO: <34.500MB -4.5%) vs baseline: +3.9%


✅ medium

Time: ✅ 2.958ms (SLO: <3.200ms -7.6%) vs baseline: +0.5%

Memory: ✅ 31.713MB (SLO: <34.000MB -6.7%) vs baseline: +3.3%


✅ small

Time: ✅ 334.324µs (SLO: <370.000µs -9.6%) vs baseline: +0.4%

Memory: ✅ 32.165MB (SLO: <34.000MB -5.4%) vs baseline: +5.1%

ℹ️ Scenarios Missing SLO Configuration (9 scenarios)

The following scenarios exist in candidate data but have no SLO thresholds configured:

  • coreapiscenario-core_dispatch_listeners
  • coreapiscenario-core_dispatch_no_listeners
  • coreapiscenario-core_dispatch_with_results_listeners
  • coreapiscenario-core_dispatch_with_results_no_listeners
  • djangosimple-baseline
  • errortrackingdjangosimple-baseline
  • errortrackingflasksqli-baseline
  • flasksimple-baseline
  • flasksqli-baseline

Member

@brettlangdon brettlangdon left a comment

release note lgtm

@Yun-Kim Yun-Kim enabled auto-merge (squash) October 17, 2025 14:29
@Yun-Kim Yun-Kim merged commit f7b8ed5 into main Oct 17, 2025
502 checks passed
@Yun-Kim Yun-Kim deleted the yunkim/llmobs-evals-assessment branch October 17, 2025 15:29

4 participants