feat(llmobs): add reasoning for custom evals #14919
Bootstrap import analysis
Comparison of import times between this PR and base.
Summary
The average import time from this PR is: 243 ± 5 ms.
The average import time from base is: 247 ± 3 ms.
The import time difference between this PR and base is: -3.3 ± 0.2 ms.
Import time breakdown
The following import paths have shrunk:
Performance SLOsComparing candidate yunkim/llmobs-evals-assessment (dcaec5a) with baseline main (23fe9e1) 📈 Performance Regressions (1 suite)📈 iastaspects - 118/118✅ add_aspectTime: ✅ 0.403µs (SLO: <10.000µs 📉 -96.0%) vs baseline: -1.2% Memory: ✅ 37.631MB (SLO: <39.000MB -3.5%) vs baseline: +4.7% ✅ add_inplace_aspectTime: ✅ 0.405µs (SLO: <10.000µs 📉 -95.9%) vs baseline: -1.1% Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +5.0% ✅ add_inplace_noaspectTime: ✅ 0.314µs (SLO: <10.000µs 📉 -96.9%) vs baseline: ~same Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +5.0% ✅ add_noaspectTime: ✅ 0.278µs (SLO: <10.000µs 📉 -97.2%) vs baseline: +0.1% Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +5.0% ✅ bytearray_aspectTime: ✅ 1.325µs (SLO: <10.000µs 📉 -86.7%) vs baseline: -1.1% Memory: ✅ 37.631MB (SLO: <39.000MB -3.5%) vs baseline: +4.7% ✅ bytearray_extend_aspectTime: ✅ 1.455µs (SLO: <10.000µs 📉 -85.4%) vs baseline: +0.9% Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +4.7% ✅ bytearray_extend_noaspectTime: ✅ 0.620µs (SLO: <10.000µs 📉 -93.8%) vs baseline: +0.9% Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +4.9% ✅ bytearray_noaspectTime: ✅ 0.485µs (SLO: <10.000µs 📉 -95.1%) vs baseline: +0.4% Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +5.0% ✅ bytes_aspectTime: ✅ 1.303µs (SLO: <10.000µs 📉 -87.0%) vs baseline: +1.6% Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.7% ✅ bytes_noaspectTime: ✅ 0.494µs (SLO: <10.000µs 📉 -95.1%) vs baseline: -0.4% Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.9% ✅ bytesio_aspectTime: ✅ 1.362µs (SLO: <10.000µs 📉 -86.4%) vs baseline: -0.2% Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.7% ✅ bytesio_noaspectTime: ✅ 0.497µs (SLO: <10.000µs 📉 -95.0%) vs baseline: +0.5% Memory: ✅ 37.729MB (SLO: <39.000MB -3.3%) vs baseline: +5.1% ✅ capitalize_aspectTime: ✅ 0.729µs (SLO: <10.000µs 📉 -92.7%) vs baseline: -1.2% Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +5.0% ✅ 
capitalize_noaspectTime: ✅ 0.434µs (SLO: <10.000µs 📉 -95.7%) vs baseline: -0.9% Memory: ✅ 37.611MB (SLO: <39.000MB -3.6%) vs baseline: +4.8% ✅ casefold_aspectTime: ✅ 0.735µs (SLO: <10.000µs 📉 -92.6%) vs baseline: ~same Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.9% ✅ casefold_noaspectTime: ✅ 0.369µs (SLO: <10.000µs 📉 -96.3%) vs baseline: +0.2% Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +5.0% ✅ decode_aspectTime: ✅ 0.723µs (SLO: <10.000µs 📉 -92.8%) vs baseline: -1.3% Memory: ✅ 37.631MB (SLO: <39.000MB -3.5%) vs baseline: +4.7% ✅ decode_noaspectTime: ✅ 0.421µs (SLO: <10.000µs 📉 -95.8%) vs baseline: -1.2% Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.9% ✅ encode_aspectTime: ✅ 0.713µs (SLO: <10.000µs 📉 -92.9%) vs baseline: +0.9% Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.9% ✅ encode_noaspectTime: ✅ 0.405µs (SLO: <10.000µs 📉 -95.9%) vs baseline: -0.2% Memory: ✅ 37.749MB (SLO: <39.000MB -3.2%) vs baseline: +5.1% ✅ format_aspectTime: ✅ 3.459µs (SLO: <10.000µs 📉 -65.4%) vs baseline: +2.1% Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.7% ✅ format_map_aspectTime: ✅ 4.146µs (SLO: <10.000µs 📉 -58.5%) vs baseline: 📈 +13.9% Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.9% ✅ format_map_noaspectTime: ✅ 0.776µs (SLO: <10.000µs 📉 -92.2%) vs baseline: -0.2% Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +5.0% ✅ format_noaspectTime: ✅ 0.595µs (SLO: <10.000µs 📉 -94.1%) vs baseline: -0.4% Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.7% ✅ index_aspectTime: ✅ 0.353µs (SLO: <10.000µs 📉 -96.5%) vs baseline: -1.2% Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.9% ✅ index_noaspectTime: ✅ 0.278µs (SLO: <10.000µs 📉 -97.2%) vs baseline: -0.3% Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.9% ✅ join_aspectTime: ✅ 1.387µs (SLO: <10.000µs 📉 -86.1%) vs baseline: -1.1% Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +5.0% ✅ join_noaspectTime: ✅ 0.496µs (SLO: <10.000µs 📉 
-95.0%) vs baseline: +1.2% Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.7% ✅ ljust_aspectTime: ✅ 2.486µs (SLO: <20.000µs 📉 -87.6%) vs baseline: -0.7% Memory: ✅ 37.611MB (SLO: <39.000MB -3.6%) vs baseline: +4.6% ✅ ljust_noaspectTime: ✅ 0.405µs (SLO: <10.000µs 📉 -96.0%) vs baseline: ~same Memory: ✅ 37.729MB (SLO: <39.000MB -3.3%) vs baseline: +5.0% ✅ lower_aspectTime: ✅ 2.234µs (SLO: <10.000µs 📉 -77.7%) vs baseline: +0.5% Memory: ✅ 37.611MB (SLO: <39.000MB -3.6%) vs baseline: +4.7% ✅ lower_noaspectTime: ✅ 0.368µs (SLO: <10.000µs 📉 -96.3%) vs baseline: -1.2% Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.7% ✅ lstrip_aspectTime: ✅ 2.242µs (SLO: <20.000µs 📉 -88.8%) vs baseline: +0.2% Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.9% ✅ lstrip_noaspectTime: ✅ 0.378µs (SLO: <10.000µs 📉 -96.2%) vs baseline: -1.1% Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.9% ✅ modulo_aspectTime: ✅ 0.995µs (SLO: <10.000µs 📉 -90.0%) vs baseline: -1.3% Memory: ✅ 37.729MB (SLO: <39.000MB -3.3%) vs baseline: +4.9% ✅ modulo_aspect_for_bytearray_bytearrayTime: ✅ 1.546µs (SLO: <10.000µs 📉 -84.5%) vs baseline: +0.2% Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.9% ✅ modulo_aspect_for_bytesTime: ✅ 0.988µs (SLO: <10.000µs 📉 -90.1%) vs baseline: +0.8% Memory: ✅ 37.591MB (SLO: <39.000MB -3.6%) vs baseline: +4.6% ✅ modulo_aspect_for_bytes_bytearrayTime: ✅ 1.204µs (SLO: <10.000µs 📉 -88.0%) vs baseline: +0.1% Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +5.0% ✅ modulo_noaspectTime: ✅ 0.629µs (SLO: <10.000µs 📉 -93.7%) vs baseline: -0.1% Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.9% ✅ replace_aspectTime: ✅ 4.793µs (SLO: <10.000µs 📉 -52.1%) vs baseline: -0.1% Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.7% ✅ replace_noaspectTime: ✅ 0.462µs (SLO: <10.000µs 📉 -95.4%) vs baseline: -0.1% Memory: ✅ 37.572MB (SLO: <39.000MB -3.7%) vs baseline: +4.6% ✅ repr_aspectTime: ✅ 0.903µs (SLO: <10.000µs 📉 -91.0%) vs baseline: 
+0.1% Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.9% ✅ repr_noaspectTime: ✅ 0.416µs (SLO: <10.000µs 📉 -95.8%) vs baseline: +0.8% Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +5.0% ✅ rstrip_aspectTime: ✅ 1.907µs (SLO: <20.000µs 📉 -90.5%) vs baseline: -0.8% Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +5.0% ✅ rstrip_noaspectTime: ✅ 0.380µs (SLO: <10.000µs 📉 -96.2%) vs baseline: +0.3% Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.8% ✅ slice_aspectTime: ✅ 0.495µs (SLO: <10.000µs 📉 -95.1%) vs baseline: -0.2% Memory: ✅ 37.611MB (SLO: <39.000MB -3.6%) vs baseline: +4.7% ✅ slice_noaspectTime: ✅ 0.447µs (SLO: <10.000µs 📉 -95.5%) vs baseline: -0.6% Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.9% ✅ stringio_aspectTime: ✅ 1.529µs (SLO: <10.000µs 📉 -84.7%) vs baseline: ~same Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.9% ✅ stringio_noaspectTime: ✅ 0.718µs (SLO: <10.000µs 📉 -92.8%) vs baseline: -0.7% Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.9% ✅ strip_aspectTime: ✅ 2.205µs (SLO: <20.000µs 📉 -89.0%) vs baseline: -0.5% Memory: ✅ 37.631MB (SLO: <39.000MB -3.5%) vs baseline: +4.9% ✅ strip_noaspectTime: ✅ 0.387µs (SLO: <10.000µs 📉 -96.1%) vs baseline: +1.0% Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +4.8% ✅ swapcase_aspectTime: ✅ 2.526µs (SLO: <10.000µs 📉 -74.7%) vs baseline: +4.7% Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +4.7% ✅ swapcase_noaspectTime: ✅ 0.538µs (SLO: <10.000µs 📉 -94.6%) vs baseline: +0.8% Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.8% ✅ title_aspectTime: ✅ 2.441µs (SLO: <10.000µs 📉 -75.6%) vs baseline: +3.8% Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.7% ✅ title_noaspectTime: ✅ 0.505µs (SLO: <10.000µs 📉 -95.0%) vs baseline: +0.7% Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.8% ✅ translate_aspectTime: ✅ 3.321µs (SLO: <10.000µs 📉 -66.8%) vs baseline: +2.8% Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: 
+4.6% ✅ translate_noaspectTime: ✅ 1.041µs (SLO: <10.000µs 📉 -89.6%) vs baseline: -0.3% Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +5.0% ✅ upper_aspectTime: ✅ 2.236µs (SLO: <10.000µs 📉 -77.6%) vs baseline: -1.0% Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.9% ✅ upper_noaspectTime: ✅ 0.369µs (SLO: <10.000µs 📉 -96.3%) vs baseline: -0.7% Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.9% 🟡 Near SLO Breach (5 suites)🟡 djangosimple - 30/30✅ appsecTime: ✅ 20.475ms (SLO: <22.300ms -8.2%) vs baseline: ~same Memory: ✅ 65.447MB (SLO: <67.000MB -2.3%) vs baseline: +4.8% ✅ exception-replay-enabledTime: ✅ 1.348ms (SLO: <1.450ms -7.0%) vs baseline: -0.1% Memory: ✅ 64.640MB (SLO: <67.000MB -3.5%) vs baseline: +4.9% ✅ iastTime: ✅ 20.428ms (SLO: <22.250ms -8.2%) vs baseline: -0.3% Memory: ✅ 65.476MB (SLO: <67.000MB -2.3%) vs baseline: +5.0% ✅ profilerTime: ✅ 15.300ms (SLO: <16.550ms -7.6%) vs baseline: -0.3% Memory: ✅ 53.732MB (SLO: <54.500MB 🟡 -1.4%) vs baseline: +5.0% ✅ resource-renamingTime: ✅ 20.547ms (SLO: <21.750ms -5.5%) vs baseline: +0.1% Memory: ✅ 65.482MB (SLO: <67.000MB -2.3%) vs baseline: +4.9% ✅ span-code-originTime: ✅ 26.198ms (SLO: <28.200ms -7.1%) vs baseline: -0.2% Memory: ✅ 67.633MB (SLO: <69.500MB -2.7%) vs baseline: +4.8% ✅ tracerTime: ✅ 20.527ms (SLO: <21.750ms -5.6%) vs baseline: ~same Memory: ✅ 65.363MB (SLO: <67.000MB -2.4%) vs baseline: +4.8% ✅ tracer-and-profilerTime: ✅ 22.041ms (SLO: <23.500ms -6.2%) vs baseline: ~same Memory: ✅ 66.542MB (SLO: <67.500MB 🟡 -1.4%) vs baseline: +4.7% ✅ tracer-dont-create-db-spansTime: ✅ 19.311ms (SLO: <21.500ms 📉 -10.2%) vs baseline: -0.2% Memory: ✅ 65.460MB (SLO: <66.000MB 🟡 -0.8%) vs baseline: +4.9% ✅ tracer-minimalTime: ✅ 16.654ms (SLO: <17.500ms -4.8%) vs baseline: +0.1% Memory: ✅ 65.397MB (SLO: <66.000MB 🟡 -0.9%) vs baseline: +4.8% ✅ tracer-nativeTime: ✅ 20.462ms (SLO: <21.750ms -5.9%) vs baseline: -0.2% Memory: ✅ 71.360MB (SLO: <72.500MB 🟡 -1.6%) vs baseline: +4.8% ✅ 
tracer-no-cachesTime: ✅ 18.471ms (SLO: <19.650ms -6.0%) vs baseline: -0.4% Memory: ✅ 65.406MB (SLO: <67.000MB -2.4%) vs baseline: +4.8% ✅ tracer-no-databasesTime: ✅ 18.773ms (SLO: <20.100ms -6.6%) vs baseline: ~same Memory: ✅ 65.258MB (SLO: <67.000MB -2.6%) vs baseline: +4.7% ✅ tracer-no-middlewareTime: ✅ 20.155ms (SLO: <21.500ms -6.3%) vs baseline: -0.4% Memory: ✅ 65.431MB (SLO: <67.000MB -2.3%) vs baseline: +4.8% ✅ tracer-no-templatesTime: ✅ 20.293ms (SLO: <22.000ms -7.8%) vs baseline: -0.3% Memory: ✅ 65.457MB (SLO: <67.000MB -2.3%) vs baseline: +5.0% 🟡 errortrackingdjangosimple - 6/6✅ errortracking-enabled-allTime: ✅ 18.052ms (SLO: <19.850ms -9.1%) vs baseline: -0.2% Memory: ✅ 65.254MB (SLO: <66.500MB 🟡 -1.9%) vs baseline: +4.9% ✅ errortracking-enabled-userTime: ✅ 18.114ms (SLO: <19.400ms -6.6%) vs baseline: +0.5% Memory: ✅ 65.274MB (SLO: <66.500MB 🟡 -1.8%) vs baseline: +4.9% ✅ tracer-enabledTime: ✅ 18.245ms (SLO: <19.450ms -6.2%) vs baseline: +0.8% Memory: ✅ 65.235MB (SLO: <66.500MB 🟡 -1.9%) vs baseline: +4.9% 🟡 flasksimple - 18/18✅ appsec-getTime: ✅ 4.590ms (SLO: <4.750ms -3.4%) vs baseline: ~same Memory: ✅ 62.030MB (SLO: <65.000MB -4.6%) vs baseline: +5.3% ✅ appsec-postTime: ✅ 6.606ms (SLO: <6.750ms -2.1%) vs baseline: -0.2% Memory: ✅ 61.991MB (SLO: <65.000MB -4.6%) vs baseline: +4.5% ✅ appsec-telemetryTime: ✅ 4.584ms (SLO: <4.750ms -3.5%) vs baseline: -0.5% Memory: ✅ 61.991MB (SLO: <65.000MB -4.6%) vs baseline: +4.8% ✅ debuggerTime: ✅ 1.860ms (SLO: <2.000ms -7.0%) vs baseline: +0.4% Memory: ✅ 45.554MB (SLO: <47.000MB -3.1%) vs baseline: +5.1% ✅ iast-getTime: ✅ 1.865ms (SLO: <2.000ms -6.8%) vs baseline: ~same Memory: ✅ 42.389MB (SLO: <49.000MB 📉 -13.5%) vs baseline: +5.1% ✅ profilerTime: ✅ 1.914ms (SLO: <2.100ms -8.8%) vs baseline: -0.2% Memory: ✅ 46.475MB (SLO: <47.000MB 🟡 -1.1%) vs baseline: +4.7% ✅ resource-renamingTime: ✅ 3.369ms (SLO: <3.650ms -7.7%) vs baseline: -0.4% Memory: ✅ 52.180MB (SLO: <53.500MB -2.5%) vs baseline: +4.7% ✅ tracerTime: ✅ 3.364ms 
(SLO: <3.650ms -7.8%) vs baseline: ~same Memory: ✅ 52.278MB (SLO: <53.500MB -2.3%) vs baseline: +5.0% ✅ tracer-nativeTime: ✅ 3.370ms (SLO: <3.650ms -7.7%) vs baseline: +0.2% Memory: ✅ 58.326MB (SLO: <60.000MB -2.8%) vs baseline: +5.3% 🟡 otelspan - 22/22✅ add-eventTime: ✅ 41.982ms (SLO: <47.150ms 📉 -11.0%) vs baseline: +2.3% Memory: ✅ 44.211MB (SLO: <47.000MB -5.9%) vs baseline: +5.1% ✅ add-metricsTime: ✅ 315.746ms (SLO: <344.800ms -8.4%) vs baseline: -0.8% Memory: ✅ 617.214MB (SLO: <630.000MB -2.0%) vs baseline: +4.8% ✅ add-tagsTime: ✅ 292.387ms (SLO: <314.000ms -6.9%) vs baseline: +1.0% Memory: ✅ 618.885MB (SLO: <630.000MB 🟡 -1.8%) vs baseline: +4.8% ✅ get-contextTime: ✅ 80.819ms (SLO: <92.350ms 📉 -12.5%) vs baseline: -0.3% Memory: ✅ 39.737MB (SLO: <46.500MB 📉 -14.5%) vs baseline: +4.8% ✅ is-recordingTime: ✅ 39.061ms (SLO: <44.500ms 📉 -12.2%) vs baseline: +1.8% Memory: ✅ 43.639MB (SLO: <47.500MB -8.1%) vs baseline: +4.8% ✅ record-exceptionTime: ✅ 59.562ms (SLO: <67.650ms 📉 -12.0%) vs baseline: +1.8% Memory: ✅ 40.126MB (SLO: <47.000MB 📉 -14.6%) vs baseline: +4.9% ✅ set-statusTime: ✅ 44.382ms (SLO: <50.400ms 📉 -11.9%) vs baseline: +0.5% Memory: ✅ 43.578MB (SLO: <47.000MB -7.3%) vs baseline: +4.8% ✅ startTime: ✅ 37.949ms (SLO: <43.450ms 📉 -12.7%) vs baseline: +0.9% Memory: ✅ 43.591MB (SLO: <47.000MB -7.3%) vs baseline: +4.8% ✅ start-finishTime: ✅ 83.017ms (SLO: <88.000ms -5.7%) vs baseline: +0.2% Memory: ✅ 34.524MB (SLO: <46.500MB 📉 -25.8%) vs baseline: +4.7% ✅ start-finish-telemetryTime: ✅ 85.732ms (SLO: <89.000ms -3.7%) vs baseline: +1.9% Memory: ✅ 34.544MB (SLO: <46.500MB 📉 -25.7%) vs baseline: +5.0% ✅ update-nameTime: ✅ 39.373ms (SLO: <45.150ms 📉 -12.8%) vs baseline: +0.4% Memory: ✅ 43.889MB (SLO: <47.000MB -6.6%) vs baseline: +4.8% 🟡 span - 26/26✅ add-eventTime: ✅ 20.554ms (SLO: <22.500ms -8.6%) vs baseline: +0.1% Memory: ✅ 49.678MB (SLO: <53.000MB -6.3%) vs baseline: +4.9% ✅ add-metricsTime: ✅ 90.681ms (SLO: <93.500ms -3.0%) vs baseline: +0.7% Memory: ✅ 
689.975MB (SLO: <961.000MB 📉 -28.2%) vs baseline: +4.8% ✅ add-tagsTime: ✅ 149.119ms (SLO: <155.000ms -3.8%) vs baseline: +0.7% Memory: ✅ 690.822MB (SLO: <962.500MB 📉 -28.2%) vs baseline: +4.8% ✅ get-contextTime: ✅ 18.965ms (SLO: <20.500ms -7.5%) vs baseline: +1.5% Memory: ✅ 48.436MB (SLO: <53.000MB -8.6%) vs baseline: +5.0% ✅ is-recordingTime: ✅ 18.927ms (SLO: <20.500ms -7.7%) vs baseline: -0.2% Memory: ✅ 48.474MB (SLO: <53.000MB -8.5%) vs baseline: +4.9% ✅ record-exceptionTime: ✅ 37.919ms (SLO: <40.000ms -5.2%) vs baseline: +0.3% Memory: ✅ 42.453MB (SLO: <53.000MB 📉 -19.9%) vs baseline: +4.7% ✅ set-statusTime: ✅ 20.773ms (SLO: <22.000ms -5.6%) vs baseline: -0.2% Memory: ✅ 48.456MB (SLO: <53.000MB -8.6%) vs baseline: +4.9% ✅ startTime: ✅ 18.685ms (SLO: <20.500ms -8.9%) vs baseline: -0.2% Memory: ✅ 48.403MB (SLO: <53.000MB -8.7%) vs baseline: +4.7% ✅ start-finishTime: ✅ 51.572ms (SLO: <52.500ms 🟡 -1.8%) vs baseline: ~same Memory: ✅ 32.185MB (SLO: <34.000MB -5.3%) vs baseline: +4.8% ✅ start-finish-telemetryTime: ✅ 52.788ms (SLO: <54.500ms -3.1%) vs baseline: +0.3% Memory: ✅ 32.204MB (SLO: <34.000MB -5.3%) vs baseline: +5.0% ✅ start-finish-traceid128Time: ✅ 54.787ms (SLO: <57.000ms -3.9%) vs baseline: -0.2% Memory: ✅ 32.126MB (SLO: <34.000MB -5.5%) vs baseline: +4.9% ✅ start-traceid128Time: ✅ 19.013ms (SLO: <22.500ms 📉 -15.5%) vs baseline: -0.2% Memory: ✅ 48.390MB (SLO: <53.000MB -8.7%) vs baseline: +4.9% ✅ update-nameTime: ✅ 19.558ms (SLO: <22.000ms 📉 -11.1%) vs baseline: +1.7% Memory: ✅ 49.122MB (SLO: <53.000MB -7.3%) vs baseline: +4.8%
release note lgtm
Description
(public change) Adds `reasoning` as an argument to `submit_evaluation_for()` and `submit_evaluation()`. This argument carries an explanation of the evaluation result (e.g. why was the span marked as toxic?).
(internal change, not user-facing) Also changes how the `assessment` field is stored on the evaluation object: #14792 added it inside a nested `success_criteria` object, and it is now a top-level field on the evaluation object. This is neither a breaking change (it hasn't been officially released on our product backend) nor a user-facing one.
Testing
Risks
Additional Notes
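For readers who want a concrete picture of the payload change described above, here is a minimal sketch. It is not the ddtrace implementation: `build_evaluation_metric` and the dictionary keys other than `assessment`, `reasoning`, and `success_criteria` are hypothetical names chosen for illustration. It only shows the idea of carrying `assessment` and `reasoning` as top-level fields instead of nesting `assessment` under `success_criteria`.

```python
from typing import Any, Dict, Optional


def build_evaluation_metric(
    label: str,
    metric_type: str,
    value: Any,
    assessment: Optional[str] = None,
    reasoning: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble an evaluation payload as a plain dict.

    Assumption: the real SDK builds a richer object; only the placement of
    `assessment` and `reasoning` is meant to mirror the PR description.
    """
    if reasoning is not None and not isinstance(reasoning, str):
        raise TypeError("reasoning must be a string")

    metric: Dict[str, Any] = {
        "label": label,
        "metric_type": metric_type,
        "value": value,  # illustrative key name, not the SDK's
    }
    # Both optional fields sit at the top level of the evaluation,
    # rather than inside a nested `success_criteria` object.
    if assessment is not None:
        metric["assessment"] = assessment
    if reasoning is not None:
        metric["reasoning"] = reasoning
    return metric


example = build_evaluation_metric(
    label="toxicity",
    metric_type="categorical",
    value="toxic",
    assessment="fail",
    reasoning="The response contains a direct insult aimed at the user.",
)
```

Omitting the optional keys when they are `None` keeps payloads from existing callers unchanged, which matches the description's claim that the change is not breaking.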