feat(metrics): counter-aware rate and increase aggregations (P5)#63127
feat(metrics): counter-aware rate and increase aggregations (P5)#63127DanielVisca wants to merge 1 commit into
Conversation
Prompt To Fix All With AIFix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
products/metrics/backend/tests/test_metric_query_runner.py:693
The method name says `rate` but the call inside passes `"increase"` to `_run`. If this test is meant to cover `rate` + group_by, the assertion values (60.0 / 6.0) correspond to `increase` (no division by `step_seconds`), so the body and the name contradict each other. The name should match the aggregation actually exercised.
```suggestion
def test_increase_composes_with_group_by(self):
```
### Issue 2 of 2
products/metrics/backend/tests/test_metric_query_runner.py:715-752
`test_rate_via_facade_and_api` exercises two unrelated code paths — RATE through the facade and INCREASE through the HTTP endpoint — in a single test method with a shared seed. If the test fails it is not immediately clear which path broke, and the mixed aggregation types (`RATE` in the facade half, `"increase"` in the API half) make the intent harder to read. Per the project's preference for focused, parameterised tests, splitting this into two separate methods would make failures self-documenting.
Reviews (1): Last reviewed commit: "feat(metrics): counter-aware rate and in..." | Re-trigger Greptile |
|
Size Change: 0 B Total Size: 72.4 MB ℹ️ View Unchanged
|
| rows = self._run("increase") | ||
| self.assertEqual([row["value"] for row in rows], [15.0]) | ||
|
|
||
| def test_rate_composes_with_group_by(self): |
There was a problem hiding this comment.
The method name says
rate but the call inside passes "increase" to _run. If this test is meant to cover rate + group_by, the assertion values (60.0 / 6.0) correspond to increase (no division by step_seconds), so the body and the name contradict each other. The name should match the aggregation actually exercised.
| def test_rate_composes_with_group_by(self): | |
| def test_increase_composes_with_group_by(self): |
Prompt To Fix With AI
This is a comment left during a code review.
Path: products/metrics/backend/tests/test_metric_query_runner.py
Line: 693
Comment:
The method name says `rate` but the call inside passes `"increase"` to `_run`. If this test is meant to cover `rate` + group_by, the assertion values (60.0 / 6.0) correspond to `increase` (no division by `step_seconds`), so the body and the name contradict each other. The name should match the aggregation actually exercised.
```suggestion
def test_increase_composes_with_group_by(self):
```
How can I resolve this? If you propose a fix, please make it concise.| def test_rate_via_facade_and_api(self): | ||
| self._seed_counter( | ||
| [ | ||
| (self.anchor + dt.timedelta(seconds=0), 0.0), | ||
| (self.anchor + dt.timedelta(seconds=30), 30.0), | ||
| ] | ||
| ) | ||
| series = run_metric_query( | ||
| team=self.team, | ||
| request=MetricQueryRequest( | ||
| clauses=( | ||
| MetricQueryClause(name="a", metric_name="requests_total", aggregation=MetricAggregation.RATE), | ||
| ), | ||
| date_from=self.anchor - dt.timedelta(minutes=1), | ||
| date_to=self.anchor + dt.timedelta(minutes=2), | ||
| interval="minute", | ||
| ), | ||
| ) | ||
| self.assertEqual([p.value for p in series[0].points], [0.5]) | ||
|
|
||
| response = self.client.post( | ||
| f"/api/projects/{self.team.id}/metrics/query", | ||
| data={ | ||
| "query": { | ||
| "metricName": "requests_total", | ||
| "aggregation": "increase", | ||
| "interval": "minute", | ||
| "dateFrom": (self.anchor - dt.timedelta(minutes=1)).isoformat(), | ||
| "dateTo": (self.anchor + dt.timedelta(minutes=2)).isoformat(), | ||
| } | ||
| }, | ||
| content_type="application/json", | ||
| ) | ||
| self.assertEqual(response.status_code, status.HTTP_200_OK) | ||
| self.assertEqual( | ||
| [p["value"] for p in response.json()["results"][0]["points"]], | ||
| [30.0], | ||
| ) |
There was a problem hiding this comment.
test_rate_via_facade_and_api exercises two unrelated code paths — RATE through the facade and INCREASE through the HTTP endpoint — in a single test method with a shared seed. If the test fails it is not immediately clear which path broke, and the mixed aggregation types (RATE in the facade half, "increase" in the API half) make the intent harder to read. Per the project's preference for focused, parameterised tests, splitting this into two separate methods would make failures self-documenting.
Prompt To Fix With AI
This is a comment left during a code review.
Path: products/metrics/backend/tests/test_metric_query_runner.py
Line: 715-752
Comment:
`test_rate_via_facade_and_api` exercises two unrelated code paths — RATE through the facade and INCREASE through the HTTP endpoint — in a single test method with a shared seed. If the test fails it is not immediately clear which path broke, and the mixed aggregation types (`RATE` in the facade half, `"increase"` in the API half) make the intent harder to read. Per the project's preference for focused, parameterised tests, splitting this into two separate methods would make failures self-documenting.
How can I resolve this? If you propose a fix, please make it concise.Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
d70e0ae to
22d92bf
Compare
b83d23a to
03043f0
Compare
22d92bf to
7762afe
Compare
03043f0 to
82bcef4
Compare
There was a problem hiding this comment.
Additive feature from the owning team; logic is correct (Prometheus-style counter-reset handling, temporality awareness, proper SQL placeholders, group_by wiring passes through). The bot's inline notes about a misnamed test method (test_rate_composes_with_group_by calling "increase") are a naming clarity issue only — the assertion values are correct for the aggregation under test and no production behavior is affected.
🕸️ Eager graphHow much code each root forces the browser to download and decode through static imports — the regression class total bundle size can't see.
✅ Largest files eagerly reachable from
|
| Size | File |
|---|---|
| 878.2 KiB | src/styles/global.scss |
| 609.0 KiB | public/hedgehog/burning-money-hog.png |
| 541.9 KiB | public/hedgehog/waving-hog.png |
| 448.2 KiB | public/hedgehog/stop-sign-hog.png |
| 362.0 KiB | public/hedgehog/phone-pair-hogs.png |
| 354.8 KiB | ../node_modules/.pnpm/@posthog+icons@0.36.6_react-dom@18.3.1_react@18.3.1__react@18.3.1/node_modules/@posthog/icons/dist/posthog-icons.es.js |
| 335.6 KiB | public/hedgehog/desk-hog.png |
| 323.2 KiB | public/hedgehog/3-bears-hogs.png |
| 297.4 KiB | src/taxonomy/core-filter-definitions-by-group.json |
| 285.8 KiB | src/lib/api.ts |
Largest files eagerly reachable from src/scenes/AuthenticatedShell.tsx
| Size | File |
|---|---|
| 878.2 KiB | src/styles/global.scss |
| 760.0 KiB | src/queries/validators.js |
| 609.0 KiB | public/hedgehog/burning-money-hog.png |
| 541.9 KiB | public/hedgehog/waving-hog.png |
| 448.2 KiB | public/hedgehog/stop-sign-hog.png |
| 398.7 KiB | ../node_modules/.pnpm/chart.js@4.5.1/node_modules/chart.js/dist/chart.js |
| 362.0 KiB | public/hedgehog/phone-pair-hogs.png |
| 354.8 KiB | ../node_modules/.pnpm/@posthog+icons@0.36.6_react-dom@18.3.1_react@18.3.1__react@18.3.1/node_modules/@posthog/icons/dist/posthog-icons.es.js |
| 335.6 KiB | public/hedgehog/desk-hog.png |
| 323.2 KiB | public/hedgehog/3-bears-hogs.png |
Posted automatically by check-eager-graph · sizes are input-source bytes from the esbuild metafile · part of #32479
rate (per-second) and increase compute per-underlying-series deltas via
lagInFrame partitioned by (service_name, resource_fingerprint,
attributes), Prometheus-style:
- cumulative temporality: delta clamped for counter resets (a drop means
the counter restarted, so the post-reset absolute value is the
increase); the first sample of a series contributes 0
- delta temporality: each sample already is the increase
- naive global diffing is explicitly tested against (interleaved series)
How to validate manually:
- hogli test products/metrics/backend/tests/ (81 tests)
- with the local pipe running: POST /metrics/query with
{"metricName": "metrics_ingestion_bytes_received_total",
"aggregation": "rate", "interval": "minute_5"} returns a steady
positive per-second rate (the pipe ingests ~9KB/s of itself)
7762afe to
b3943ad
Compare
82bcef4 to
7c8dc60
Compare
New commits pushed (delta classified non_linear_history) — stamphog approval dismissed; re-review running automatically.
There was a problem hiding this comment.
No showstoppers. Additive feature from the owning team — new rate/increase aggregations with correct Prometheus-style counter-reset logic, parameterized SQL (no injection risk), consistent updates across backend, generated frontend types, and MCP schema. The unresolved bot comments flag test naming mismatches only (test_rate_composes_with_group_by calls "increase", test_rate_via_facade_and_api mixes two aggregations in one test), which are naming clarity issues that don't affect production behavior or correctness.

Problem
Counters are the most common server metric, and raw cumulative counter values are meaningless — you need
rate/increase. Doing this wrong (naive global diffing, ignoring counter resets or delta temporality) silently produces garbage, which is worse than not having it.Changes
rate(per-second) andincreaseaggregations. Per-underlying-series deltas vialagInFramepartitioned by(service_name, resource_fingerprint, attributes), Prometheus-style:aggregation_temporality = 'delta'): each sample already is the increase and counts as-is.rate = increase / bucket_seconds.How did you test this code?
I'm an agent. Automated tests pin exact values against real ClickHouse for: steady counters, counter resets, delta temporality, and interleaved multi-series data where naive global diffing would produce garbage (two pods at wildly different counter offsets → per-pod increases come out exact). Live check with the local pipe:
🤖 Agent context
Autonomy: Human-driven (agent-assisted) — directed by @DanielVisca
Counter-reset rule follows Prometheus (
value < prev⇒ contribution = value), implemented as amultiIfover a window function rather thanrunningDifference(deprecated, and not partition-safe).