docs(metrics): deployment layout for local/dev/prod + schema decision#63122
docs(metrics): deployment layout for local/dev/prod + schema decision#63122DanielVisca wants to merge 4 commits into
Conversation
…board inside PostHog Foundation of the metrics-dashboard stack. Captures the gap between what products/metrics ships today (single-metric viewer + HogQL SQL tab) and what the on-call logs.json Grafana dashboard needs, the 23-PR sequence to close it, and the compounding-mistake audit that shaped the contract decisions (single-grid interval, [MetricSeries] return shape from PR1, attribute_field seam for the schema rewrite, server-side formulas, counter-reset-aware rate, bounds-validated histogram_quantile). Out of scope (separate stacks): streams-table schema rewrite, services/prom-compat Track A, prom-client→OTel SDK swap, deploy annotation automation. Generated-By: PostHog Code Task-Id: 97261763-b5cd-4739-98c0-ba2da8637abe
Foundation PR for the metrics query-layer stack. Two seams that every later PR (filters, group-by, rate, histogram_quantile) leans on: 1. attribute_field(name, scope="resource|attribute|auto") in products/metrics/backend/metric_query_runner.py — returns the HogQL AST node that accesses a metric attribute by name. "auto" prefers resource_attributes and falls back to the attributes alias (which strips the 5-char __str type tag via left(k, -5)). This is the single seam for the schema-v2 (series+samples) rewrite — when storage changes, only this function changes. Uses arrayElement() rather than map subscript: HogQL prints subscript on a StringJSONDatabaseField as JSONExtractRaw, which is illegal on the physical Map columns. 2. seed_metric() in products/metrics/backend/tests/_seeder.py — inserts deterministic metric rows with the exact production shape, including the __str attribute-key type tags the Kafka MV writes (concat(k, '__str')) — verified against real OTLP-ingested rows. Also migrates test_metric_names_query_runner off the removed _insert_metric_row helper (the previous mypy CI failure). How to validate manually: - hogli test products/metrics/backend/tests/ - seeded rows' attribute keys must match real ingested rows: SELECT mapKeys(attributes_map_str) FROM posthog.metrics LIMIT 1 shows keys ending in __str; the attributes ALIAS strips them Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Companion to dashboard-mvp.md. Captures (1) the settled schema decision — alpha ships on the current metrics1, the Snuffle streams layout is the post-alpha storage track, with attribute_field() as the migration seam; (2) a status matrix of what exists in local/dev/ prod-us/prod-eu across cloud-infra, charts, and posthog; (3) the per-repo TODO and a prod go-live checklist; (4) the local validation path for fully-functional metrics with a real logs service piped through. Resolves the schema question by reading Snuffle: its posthog schema is our metrics1 verbatim (so prom-compat reads it as-is), and its default streams schema is the documented disk-optimization target. Generated-By: PostHog Code Task-Id: 97261763-b5cd-4739-98c0-ba2da8637abe
Stable wire shape for the metric query layer, locked before endpoint or runner work so filters/group-by/rate/histogram_quantile/formula all build against one contract. Fresh on master (the merged single-metric baseline); supersedes the stale May-27 metrics-prXX rewrite. enums.py: AttributeScope, FilterOp, MetricAggregation. contracts.py: MetricFilter, MetricGroupBy, MetricQueryClause, MetricQueryRequest (shared interval grid + optional formula), MetricPoint, MetricSeries. Response is ALWAYS list[MetricSeries]. 12 contract-validation tests. Generated-By: PostHog Code Task-Id: 97261763-b5cd-4739-98c0-ba2da8637abe
Prompt To Fix All With AIFix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
products/metrics/backend/facade/contracts.py:112
**Mutable `dict` breaks the "Immutable" contract guarantee**
The module docstring advertises these types as `Immutable (frozen=True)`, but `frozen=True` on a dataclass only prevents reassigning the attribute reference — it does not prevent mutating the dict's contents. Any caller can write `series.labels["injected"] = "x"` and silently corrupt a shared series object. `points` is fine because `tuple` is immutable; `labels` is not.
Use `types.MappingProxyType` at construction time, or change the type annotation to `Mapping[str, str]` and enforce immutability by convention with a note that producers must pass a `MappingProxyType`. The latter is lighter-weight and consistent with the rest of the module.
### Issue 2 of 2
products/metrics/backend/tests/test_contracts.py:28-38
**Non-parameterised tests violate project conventions**
Both `test_quantile_aggregation_requires_quantile` (two aggregations inline) and `test_quantile_must_be_in_open_unit_interval` (a `for` loop over bad values) should use `@pytest.mark.parametrize`. A for-loop swallows the failure message after the first bad value, so the test output won't tell you which specific boundary was hit. The project rule is "We always prefer parameterised tests."
Reviews (1): Last reviewed commit: "feat(metrics): query contracts and enums..." | Re-trigger Greptile |
| which clause produced it (the formula result uses `clause="formula"`). | ||
| """ | ||
|
|
||
| labels: dict[str, str] |
There was a problem hiding this comment.
Mutable
dict breaks the "Immutable" contract guarantee
The module docstring advertises these types as Immutable (frozen=True), but frozen=True on a dataclass only prevents reassigning the attribute reference — it does not prevent mutating the dict's contents. Any caller can write series.labels["injected"] = "x" and silently corrupt a shared series object. points is fine because tuple is immutable; labels is not.
Use types.MappingProxyType at construction time, or change the type annotation to Mapping[str, str] and enforce immutability by convention with a note that producers must pass a MappingProxyType. The latter is lighter-weight and consistent with the rest of the module.
Prompt To Fix With AI
This is a comment left during a code review.
Path: products/metrics/backend/facade/contracts.py
Line: 112
Comment:
**Mutable `dict` breaks the "Immutable" contract guarantee**
The module docstring advertises these types as `Immutable (frozen=True)`, but `frozen=True` on a dataclass only prevents reassigning the attribute reference — it does not prevent mutating the dict's contents. Any caller can write `series.labels["injected"] = "x"` and silently corrupt a shared series object. `points` is fine because `tuple` is immutable; `labels` is not.
Use `types.MappingProxyType` at construction time, or change the type annotation to `Mapping[str, str]` and enforce immutability by convention with a note that producers must pass a `MappingProxyType`. The latter is lighter-weight and consistent with the rest of the module.
How can I resolve this? If you propose a fix, please make it concise.| def test_quantile_aggregation_requires_quantile(self): | ||
| with pytest.raises(ValueError): | ||
| _clause(aggregation=MetricAggregation.QUANTILE) | ||
| with pytest.raises(ValueError): | ||
| _clause(aggregation=MetricAggregation.HISTOGRAM_QUANTILE) | ||
|
|
||
| def test_quantile_must_be_in_open_unit_interval(self): | ||
| for bad in (0.0, 1.0, -0.1, 1.5): | ||
| with pytest.raises(ValueError): | ||
| _clause(aggregation=MetricAggregation.QUANTILE, quantile=bad) | ||
| assert _clause(aggregation=MetricAggregation.QUANTILE, quantile=0.95).quantile == 0.95 |
There was a problem hiding this comment.
Non-parameterised tests violate project conventions
Both test_quantile_aggregation_requires_quantile (two aggregations inline) and test_quantile_must_be_in_open_unit_interval (a for loop over bad values) should use @pytest.mark.parametrize. A for-loop swallows the failure message after the first bad value, so the test output won't tell you which specific boundary was hit. The project rule is "We always prefer parameterised tests."
Prompt To Fix With AI
This is a comment left during a code review.
Path: products/metrics/backend/tests/test_contracts.py
Line: 28-38
Comment:
**Non-parameterised tests violate project conventions**
Both `test_quantile_aggregation_requires_quantile` (two aggregations inline) and `test_quantile_must_be_in_open_unit_interval` (a `for` loop over bad values) should use `@pytest.mark.parametrize`. A for-loop swallows the failure message after the first bad value, so the test output won't tell you which specific boundary was hit. The project rule is "We always prefer parameterised tests."
How can I resolve this? If you propose a fix, please make it concise.Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
There was a problem hiding this comment.
Purely additive changes: new internal docs, frozen dataclass contracts, expanded StrEnum definitions, and matching unit tests for a metrics product in active development. No existing code is modified, no API contracts or data models are altered, and the author is on the owning team. The two bot comments are P2 quality concerns (mutable dict inside a frozen dataclass; non-parameterized tests) — valid but not showstoppers for an additive contract-definition file with no production call sites yet.
ed7d5ab to
25d790d
Compare
ce78417 to
e5ca206
Compare
New commits pushed (delta classified non_linear_history) — stamphog approval dismissed; re-review running automatically.
25d790d to
dc0f47f
Compare
e5ca206 to
2788464
Compare
dc23e07 to
ce78417
Compare
|
Docs from this PR will be published at posthog.com
Preview will be ready in ~10 minutes. Click Preview link above to access docs at |
New commits pushed (delta classified non_linear_history) — stamphog approval dismissed; re-review running automatically.
There was a problem hiding this comment.
Gates denied due to size (917 lines across 9 files). The changes are purely additive — new internal docs, frozen dataclass contracts, expanded enums, and unit tests — with no modifications to existing production code. The two bot inline comments raise valid quality concerns (mutable dict in a frozen dataclass, non-parameterized tests) but are not showstoppers. Needs a human to confirm approval given the size gate failure.

Problem
The metrics query surface is about to grow (filters, group-by, counter functions, formulas). Locking the request/response contract down as frozen dataclasses before the behavior lands means every later PR extends one shape instead of inventing its own.
Changes
Adds
facade/contracts.py+facade/enums.py(previously empty stubs):MetricFilter,MetricGroupBy,MetricQueryClause,MetricQueryRequest,MetricPoint,MetricSeries, and theAttributeScope/FilterOp/MetricAggregationenums. Framework-free, frozen, validated in__post_init__(quantile bounds, unique clause names, window ordering). Design choices worth knowing:intervallives on the request, not per clause — every series in a response shares one bucket grid, which is what makes formulas likea / bline up.[MetricSeries]; a single ungrouped query returns one series with empty labels, so consumers never branch on single-vs-multi.AttributeScope.AUTOis the default everywhere — explicitresource/attributescope stays available but the planned schema-v2 merged-label layout makesautothe forward-compatible choice.Pure types + unit tests; no behavior change.
How did you test this code?
I'm an agent. Automated:
hogli test products/metrics/backend/tests/test_contracts.py(12 tests: validation rules, immutability).🤖 Agent context
Autonomy: Human-driven (agent-assisted) — directed by @DanielVisca
Contracts were drafted by Daniel; this revision validates them against the implemented stack above it (no shape changes were needed — only the docstring promise that the formula result uses
clause="formula", which PR #63129 honors).