
feat(web-analytics): staleness-first warm order + per-team progress log #63138

Merged
lricoy merged 4 commits into master from lricoy/web-analytics-warmer-staleness-progress
Jun 12, 2026

@lricoy lricoy commented Jun 11, 2026

Problem

The eager web-analytics warmer processes its team list in env-var order every run. If a run is cut short (it has a 90-min max_runtime cap), the front of the list always wins and never-warmed teams at the back can be starved. There was also no in-run progress signal — you couldn't tell from the logs how far through the 86-team cohort a run was.

Changes

  • Staleness-first ordering. Eligible teams are sorted by Max(PreaggregationJob.computed_at), least recently computed first, with teams that have never been precomputed (no job row) placed at the very front. A run that can't finish the whole list therefore still makes progress on the teams that need it most. The ordering comes from a single indexed aggregate query, not one query per team.
  • Per-team progress indicator. Each team's completion log now carries processed/total (via a thread-safe itertools.count), so a run is followable as "47/86" in Loki/Dagster.
  • Drive-by: the module docstring still said "28 days" — corrected to 31 (the window shipped in perf(web-analytics): warm eager-precompute teams with a thread pool #63044).
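The staleness-first ordering can be sketched in plain Python. This is only an illustration under stated assumptions: `order_by_staleness` and the `last_computed` mapping are hypothetical names, and the mapping stands in for the result of the aggregate query, which is not shown in this PR description.

```python
from datetime import datetime, timezone

# Hypothetical stand-in for the aggregate result: team_id -> latest computed_at.
# Teams with no job row are simply absent from the mapping.
_EPOCH_MIN = datetime.min.replace(tzinfo=timezone.utc)


def order_by_staleness(team_ids, last_computed):
    """Return team_ids with never-warmed teams first, then oldest computed_at first."""

    def sort_key(team_id):
        ts = last_computed.get(team_id)
        # False sorts before True, so teams without a timestamp lead the list.
        return (ts is not None, ts if ts is not None else _EPOCH_MIN)

    # sorted() is stable, so never-warmed teams keep their original relative order.
    return sorted(team_ids, key=sort_key)


last_computed = {
    1: datetime(2026, 6, 11, tzinfo=timezone.utc),
    3: datetime(2026, 6, 1, tzinfo=timezone.utc),
}
print(order_by_staleness([1, 2, 3, 4], last_computed))  # [2, 4, 3, 1]
```

Teams 2 and 4 have no timestamp and move to the front; team 3 was computed earlier than team 1, so it comes next.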

Each team's tile matrix still runs sequentially inside the (unchanged) 5-wide pool; only the order teams enter the pool changed.
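A rough sketch of how a shared `itertools.count` can feed a processed/total field from inside a thread pool. `warm_all` and `warm_team` are hypothetical names, the log format is invented for illustration, and the lock-free counter relies on CPython's GIL, as the PR description notes.

```python
import itertools
from concurrent.futures import ThreadPoolExecutor


def warm_all(team_ids, warm_team, max_workers=5):
    """Warm each team in a pool and return one progress line per team."""
    total = len(team_ids)
    # Shared counter; each next() call hands out a distinct number under the GIL.
    counter = itertools.count(1)

    def run(team_id):
        warm_team(team_id)  # hypothetical per-team work (the tile matrix in the PR)
        processed = next(counter)
        return f"team={team_id} progress={processed}/{total}"

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() submits teams in list order, so a pre-sorted list enters the pool in that order.
        return list(pool.map(run, team_ids))


lines = warm_all(list(range(10)), warm_team=lambda team_id: None)
```

With several workers the completion order can vary between runs, but every line carries a distinct processed value between 1 and total.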

How did you test this code?

I'm an agent (Claude Code). Ran the Dagster test suite with hogli test …/test_eager_web_analytics_precompute.py: 16 passed, including two new tests, one for staleness ordering (concurrency pinned to 1 so the pool drains in order; asserts never→old→recent) and one for the processed/total progress fields. No manual prod testing.
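Why pinning concurrency to 1 makes the ordering assertion deterministic can be shown with a small standalone sketch. It does not reproduce the PR's actual test; `drain_order` and the team labels are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def drain_order(team_ids, max_workers):
    """Record the order in which the pool actually starts each team."""
    started = []

    def run(team_id):
        started.append(team_id)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        list(pool.map(run, team_ids))
    return started


ordered = ["never_warmed", "old", "recent"]
# A single worker takes tasks off the queue first-in, first-out,
# so execution order matches submission order.
assert drain_order(ordered, max_workers=1) == ordered
```

With more than one worker the start order is no longer guaranteed, which is presumably why the test fixes the pool width before asserting.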

🤖 Agent context

Autonomy: Human-driven (agent-assisted)

Tool: Claude Code. Decisions: ordered by computed_at (last actually-warmed) rather than created_at (which can be a pending row); never-warmed sorts first; used itertools.count for a lock-free progress counter (its __next__ is atomic under the GIL); kept selection to one indexed aggregate rather than a per-team query. Follow-up to the just-merged #63044.

@lricoy lricoy self-assigned this Jun 11, 2026

greptile-apps Bot commented Jun 11, 2026

Reviews (1): Last reviewed commit: "feat(web-analytics): staleness-first war..." | Re-trigger Greptile

@lricoy lricoy marked this pull request as ready for review June 11, 2026 22:09
@assign-reviewers-posthog assign-reviewers-posthog Bot requested a review from a team June 11, 2026 22:10
@lricoy lricoy added the stamphog Request AI review from stamphog label Jun 11, 2026

@stamphog stamphog Bot left a comment

Gates denied this PR because it modifies a CI workflow file (.github/workflows/ci-dagster.yml), which hits the infra/CI deny-list and requires human review regardless of how small the change is.

@stamphog stamphog Bot removed the stamphog Request AI review from stamphog label Jun 11, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2f2bd3c1fe

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread products/web_analytics/dags/eager_web_analytics_precompute.py Outdated
@lricoy lricoy enabled auto-merge (squash) June 11, 2026 23:00
@lricoy lricoy added the stamphog Request AI review from stamphog label Jun 11, 2026

@stamphog stamphog Bot left a comment

Gates denied this PR because it modifies a CI workflow file (.github/workflows/ci-dagster.yml), which hits the infra/CI deny-list and requires human review. Additionally, the PR is classified as T2-never due to touching two areas with a feat commit.

@stamphog stamphog Bot removed the stamphog Request AI review from stamphog label Jun 11, 2026
@lricoy lricoy merged commit 4089731 into master Jun 12, 2026
326 of 338 checks passed
@lricoy lricoy deleted the lricoy/web-analytics-warmer-staleness-progress branch June 12, 2026 00:07

deployment-status-posthog Bot commented Jun 12, 2026

Deploy status

| Environment | Status | Deployed At | Workflow |
|---|---|---|---|
| dev | ✅ Deployed | 2026-06-12 00:51 UTC | Run |
| prod-us | ✅ Deployed | 2026-06-12 01:07 UTC | Run |
| prod-eu | ✅ Deployed | 2026-06-12 01:07 UTC | Run |

2 participants