feat(web-analytics): staleness-first warm order + per-team progress log #63138
Merged
Reviewed commit: 2f2bd3c1fe
gantoine
approved these changes
Jun 11, 2026
Problem
The eager web-analytics warmer processes its team list in env-var order every run. If a run is cut short (it has a 90-min max_runtime cap), the front of the list always wins and never-warmed teams at the back can be starved. There was also no in-run progress signal: you couldn't tell from the logs how far through the 86-team cohort a run was.

Changes
- Teams are now ordered by Max(PreaggregationJob.computed_at): least-recently-computed first, with teams that have never been precomputed (no job row) at the very front. So a run that can't finish the whole list still makes progress on the teams that need it most. It's a single indexed aggregate query, not per-team.
- The per-team progress log carries processed/total (via a thread-safe itertools.count), so a run is followable as "47/86" in Loki/Dagster.

Each team's tile matrix still runs sequentially inside the (unchanged) 5-wide pool; only the order in which teams enter the pool changed.
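
The staleness-first ordering can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the PR's code: the PR derives the timestamps from a single aggregate over PreaggregationJob.computed_at, whereas the last_computed mapping and the order_by_staleness helper below are hypothetical stand-ins for that query result.

```python
from datetime import datetime
from typing import Optional


def order_by_staleness(
    team_ids: list[int],
    last_computed: dict[int, Optional[datetime]],
) -> list[int]:
    """Never-warmed teams first, then least-recently-computed first.

    `last_computed` is a hypothetical mapping of team id to the latest
    computed_at; a missing key or None means the team was never warmed.
    """

    def sort_key(team_id: int) -> tuple[int, float]:
        ts = last_computed.get(team_id)
        if ts is None:
            # Group 0 sorts ahead of every team that has a timestamp.
            return (0, 0.0)
        return (1, ts.timestamp())

    # sorted() is stable, so ties keep their original (env-var) order.
    return sorted(team_ids, key=sort_key)
```

Because the sort is stable, teams that have never been warmed keep their relative env-var order among themselves, which seems a reasonable tie-break for a sketch like this.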
How did you test this code?
I'm an agent (Claude Code). Ran the dagster test suite, hogli test …/test_eager_web_analytics_precompute.py: 16 passed, including two new tests: staleness ordering (concurrency pinned to 1 so the pool drains in order; asserts never→old→recent), and the processed/total progress fields. No manual prod testing.

🤖 Agent context
Autonomy: Human-driven (agent-assisted)
Tool: Claude Code. Decisions: ordered by computed_at (last actually-warmed) rather than created_at (which can be a pending row); never-warmed sorts first; used itertools.count for a lock-free progress counter (its __next__ is atomic under the GIL in CPython); kept selection to one indexed aggregate rather than a per-team query. Follow-up to the just-merged #63044.
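
The lock-free progress counter described above might look roughly like the sketch below. It assumes CPython with the GIL, where next() on an itertools.count is a single atomic step; warm_all and warm_one are hypothetical names, and the per-team work is a placeholder rather than the real tile-matrix warm.

```python
import itertools
from concurrent.futures import ThreadPoolExecutor


def warm_all(team_ids: list[int], max_workers: int = 5) -> list[dict]:
    """Run a placeholder warm per team and record processed/total progress."""
    total = len(team_ids)
    # Shared counter starting at 1; no explicit lock is taken around next().
    counter = itertools.count(1)

    def warm_one(team_id: int) -> dict:
        # ... the real per-team warm would happen here (placeholder) ...
        processed = next(counter)
        # These fields correspond to the "47/86" style progress signal.
        return {"team_id": team_id, "processed": processed, "total": total}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, not completion order.
        return list(pool.map(warm_one, team_ids))
```

Each worker draws a distinct processed value, so the emitted values cover 1 through total exactly once even though teams finish in a nondeterministic order.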