feat(metrics): histogram_quantile from OTel explicit-bounds buckets (P6)#63128
Conversation
| min_value=0.0, | ||
| max_value=1.0, |
There was a problem hiding this comment.
Validation mismatch will cause runtime errors. The serializer allows quantile=0.0 and quantile=1.0 (inclusive bounds), but MetricQueryRunner validation at line 205 requires 0.0 < quantile < 1.0 (exclusive bounds). API requests with quantile=0.0 or quantile=1.0 will pass serializer validation but fail at runtime.
min_value=0.0, # This should be exclusive, not inclusive
max_value=1.0, # This should be exclusive, not inclusiveFix by adding exclusive validation or custom validator to match the runner's (0, 1) requirement.
| min_value=0.0, | |
| max_value=1.0, | |
| min_value=1e-10, | |
| max_value=1.0 - 1e-10, | |
Spotted by Graphite
Is this helpful? React 👍 or 👎 to let us know.
|
Size Change: 0 B Total Size: 72.4 MB ℹ️ View Unchanged
|
Prompt To Fix All With AIFix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
products/metrics/backend/tests/test_metric_query_runner.py:760
**Missing `TestCase` inheritance may silently skip these tests**
`TestHistogramQuantileInterpolation` does not inherit from `unittest.TestCase`. Django's test runner (`manage.py test`) only discovers `TestCase` subclasses, so the five `@parameterized.expand` cases covering the core interpolation logic may never run outside pytest. Every other test class in this file inherits from either `ClickhouseTestMixin, APIBaseTest` or `TestCase`. Adding `unittest.TestCase` as a base keeps discovery consistent across both runners at zero cost.
### Issue 2 of 2
products/metrics/backend/presentation/api.py:75-81
**Serializer accepts boundary values that the runner rejects**
The serializer declares `min_value=0.0, max_value=1.0` (inclusive), but `MetricQueryRunner` requires a strictly open interval `0.0 < q < 1.0`. Sending `quantile=0.0` or `quantile=1.0` passes the serializer yet reaches the runner, which raises a `ValueError` (caught as a `ParseError` → 400). The help text already documents `(0, 1)`, so aligning the validator with that contract avoids the misleading accepted-then-rejected behaviour.
```suggestion
quantile = serializers.FloatField(
required=False,
allow_null=True,
min_value=0.000001,
max_value=0.999999,
help_text="Quantile in (0, 1) for 'histogram_quantile' (e.g. 0.95). Ignored for other aggregations.",
)
```
Reviews (1): Last reviewed commit: "feat(metrics): histogram_quantile from O..." | Re-trigger Greptile |
| ) | ||
|
|
||
|
|
||
| class TestHistogramQuantileInterpolation: |
There was a problem hiding this comment.
Missing
TestCase inheritance may silently skip these tests
TestHistogramQuantileInterpolation does not inherit from unittest.TestCase. Django's test runner (manage.py test) only discovers TestCase subclasses, so the five @parameterized.expand cases covering the core interpolation logic may never run outside pytest. Every other test class in this file inherits from either ClickhouseTestMixin, APIBaseTest or TestCase. Adding unittest.TestCase as a base keeps discovery consistent across both runners at zero cost.
Prompt To Fix With AI
This is a comment left during a code review.
Path: products/metrics/backend/tests/test_metric_query_runner.py
Line: 760
Comment:
**Missing `TestCase` inheritance may silently skip these tests**
`TestHistogramQuantileInterpolation` does not inherit from `unittest.TestCase`. Django's test runner (`manage.py test`) only discovers `TestCase` subclasses, so the five `@parameterized.expand` cases covering the core interpolation logic may never run outside pytest. Every other test class in this file inherits from either `ClickhouseTestMixin, APIBaseTest` or `TestCase`. Adding `unittest.TestCase` as a base keeps discovery consistent across both runners at zero cost.
How can I resolve this? If you propose a fix, please make it concise.Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
| quantile = serializers.FloatField( | ||
| required=False, | ||
| allow_null=True, | ||
| min_value=0.0, | ||
| max_value=1.0, | ||
| help_text="Quantile in (0, 1) for 'histogram_quantile' (e.g. 0.95). Ignored for other aggregations.", | ||
| ) |
There was a problem hiding this comment.
Serializer accepts boundary values that the runner rejects
The serializer declares min_value=0.0, max_value=1.0 (inclusive), but MetricQueryRunner requires a strictly open interval 0.0 < q < 1.0. Sending quantile=0.0 or quantile=1.0 passes the serializer yet reaches the runner, which raises a ValueError (caught as a ParseError → 400). The help text already documents (0, 1), so aligning the validator with that contract avoids the misleading accepted-then-rejected behaviour.
| quantile = serializers.FloatField( | |
| required=False, | |
| allow_null=True, | |
| min_value=0.0, | |
| max_value=1.0, | |
| help_text="Quantile in (0, 1) for 'histogram_quantile' (e.g. 0.95). Ignored for other aggregations.", | |
| ) | |
| quantile = serializers.FloatField( | |
| required=False, | |
| allow_null=True, | |
| min_value=0.000001, | |
| max_value=0.999999, | |
| help_text="Quantile in (0, 1) for 'histogram_quantile' (e.g. 0.95). Ignored for other aggregations.", | |
| ) |
Prompt To Fix With AI
This is a comment left during a code review.
Path: products/metrics/backend/presentation/api.py
Line: 75-81
Comment:
**Serializer accepts boundary values that the runner rejects**
The serializer declares `min_value=0.0, max_value=1.0` (inclusive), but `MetricQueryRunner` requires a strictly open interval `0.0 < q < 1.0`. Sending `quantile=0.0` or `quantile=1.0` passes the serializer yet reaches the runner, which raises a `ValueError` (caught as a `ParseError` → 400). The help text already documents `(0, 1)`, so aligning the validator with that contract avoids the misleading accepted-then-rejected behaviour.
```suggestion
quantile = serializers.FloatField(
required=False,
allow_null=True,
min_value=0.000001,
max_value=0.999999,
help_text="Quantile in (0, 1) for 'histogram_quantile' (e.g. 0.95). Ignored for other aggregations.",
)
```
How can I resolve this? If you propose a fix, please make it concise.ef15c85 to
dbb7c8e
Compare
03043f0 to
82bcef4
Compare
dbb7c8e to
3de34c7
Compare
There was a problem hiding this comment.
Two substantive unresolved inline comments remain in the current diff. The serializer declares inclusive bounds (min_value=0.0, max_value=1.0) while the runner enforces an exclusive interval (0.0 < quantile < 1.0), creating a broken API contract where boundary values pass serializer validation but fail at runtime.
🕸️ Eager graphHow much code each root forces the browser to download and decode through static imports — the regression class total bundle size can't see.
✅ Largest files eagerly reachable from
|
| Size | File |
|---|---|
| 878.2 KiB | src/styles/global.scss |
| 609.0 KiB | public/hedgehog/burning-money-hog.png |
| 541.9 KiB | public/hedgehog/waving-hog.png |
| 448.2 KiB | public/hedgehog/stop-sign-hog.png |
| 362.0 KiB | public/hedgehog/phone-pair-hogs.png |
| 354.8 KiB | ../node_modules/.pnpm/@posthog+icons@0.36.6_react-dom@18.3.1_react@18.3.1__react@18.3.1/node_modules/@posthog/icons/dist/posthog-icons.es.js |
| 335.6 KiB | public/hedgehog/desk-hog.png |
| 323.2 KiB | public/hedgehog/3-bears-hogs.png |
| 297.4 KiB | src/taxonomy/core-filter-definitions-by-group.json |
| 285.8 KiB | src/lib/api.ts |
Largest files eagerly reachable from src/scenes/AuthenticatedShell.tsx
| Size | File |
|---|---|
| 878.2 KiB | src/styles/global.scss |
| 760.0 KiB | src/queries/validators.js |
| 609.0 KiB | public/hedgehog/burning-money-hog.png |
| 541.9 KiB | public/hedgehog/waving-hog.png |
| 448.2 KiB | public/hedgehog/stop-sign-hog.png |
| 398.7 KiB | ../node_modules/.pnpm/chart.js@4.5.1/node_modules/chart.js/dist/chart.js |
| 362.0 KiB | public/hedgehog/phone-pair-hogs.png |
| 354.8 KiB | ../node_modules/.pnpm/@posthog+icons@0.36.6_react-dom@18.3.1_react@18.3.1__react@18.3.1/node_modules/@posthog/icons/dist/posthog-icons.es.js |
| 335.6 KiB | public/hedgehog/desk-hog.png |
| 323.2 KiB | public/hedgehog/3-bears-hogs.png |
Posted automatically by check-eager-graph · sizes are input-source bytes from the esbuild metafile · part of #32479
ClickHouse sums per-le bucket-count distributions per time bucket
(reusing the per-series temporality/reset handling from rate/increase,
applied element-wise to the counts arrays); the Prometheus-style linear
interpolation happens in Python where it is exact and unit-tested.
Mixed bucket layouts in one selection are rejected with a 400 telling
the caller to narrow with filters (bounds-equality validation happens
in the query via groupUniqArray, not on a sampled row).
How to validate manually:
- hogli test products/metrics/backend/tests/ (92 tests)
- bin/send-dev-metrics.sh seeds an e2e histogram; then POST
/metrics/query with {"aggregation": "histogram_quantile",
"quantile": 0.95, "metricName": "e2e_dev_test_histogram"}
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
82bcef4 to
7c8dc60
Compare
3de34c7 to
5458194
Compare

Problem
Latency questions are quantile questions, and OTel histograms arrive as explicit-bounds bucket arrays that nothing could read yet (
histogram_bounds/histogram_countswere write-only columns).Changes
histogram_quantileaggregation (+quantilefield, e.g. 0.95):sumForEach), reusing the per-series temporality/reset handling from rate/increase applied element-wise to the counts arrays.groupUniqArrayacross all contributing rows —any()sampling would miss a mismatch inside a single time bucket (a bug the tests caught).How did you test this code?
I'm an agent. Automated: interpolation unit tests with hand-computed expectations, plus ClickHouse-backed tests for delta and cumulative histograms, bounds-mismatch rejection, and the API path. Live check:
bin/send-dev-metrics.shseeds an e2e histogram, then query it with{"aggregation": "histogram_quantile", "quantile": 0.95}.🤖 Agent context
Autonomy: Human-driven (agent-assisted) — directed by @DanielVisca
CH computes distributions, Python interpolates — chosen over a pure-SQL quantile (array gymnastics, hard to verify) and over ClickHouse's approximate quantile functions (wrong tool: the data is already bucketed).