feat(data-warehouse): add facade and route consumers through it#66213

Open
Gilbert09 wants to merge 3 commits into master from tom/dwh-phase2-facade

@Gilbert09

Member

Problem

Phase 2 isolates data_warehouse behind a facade. The structural prereqs landed in #66128; this PR adds the facade itself and routes external consumers through it. Critically, it resolves the warehouse_sources ↔ data_warehouse cycle: warehouse_sources (isolated in Phase 1) imports data_warehouse internals (logic.data_load, s3, the direct-SQL constants, webhook consumer) — those now go through data_warehouse's facade.

Changes

  • Facade submodules (backend/facade/), split by import weight so setup-path consumers stay light:
    • sources.py — direct-SQL option constants (consumed by the setup-path warehouse_sources table model).
    • models.py: ExternalDataSourceRevenueAnalyticsConfig.
    • hogql.py: get_warehouse_sync_warnings (the HogQL database builder, setup-adjacent).
    • tasks.py — the beat-scheduled digest catch-up task.
    • dags.py — the managed_viewset_sync Dagster asset module (the root dags/ channel the scan can't see).
    • api.py — the heavy operational surface (temporal schedule/workflow management, S3 helpers, schema reconciliation, job-status updates, webhook ingestion). Off the django.setup() path; its consumers are the temporal workers / import pipeline.
  • Consumer sweep — 43 files (warehouse_sources pipeline, posthog/temporal, data_modeling, endpoints, signals, hogql, dags, admin) repointed off data_warehouse internals onto the facade. Identity-preserving.
  • Boundary: data_warehouse added to the canonical [[interfaces]] block (tach check --interfaces now enforces facade-only access) + a small legacy-leak for the test helpers core still imports.
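
As a rough illustration of what "identity-preserving" means for the consumer sweep, the sketch below builds two throwaway modules and checks that a name reached through the facade is the very same object as the internal one. Everything here is hypothetical (`demo_internal`, `demo_facade`, `SyncJob`, `non_identical_exports`); it is not code from this PR, only a minimal model of a re-export facade.

```python
import types

# Hypothetical stand-in for a data_warehouse internal module.
internal = types.ModuleType("demo_internal")


class SyncJob:
    """Stand-in for an internal class that consumers used to import directly."""


internal.SyncJob = SyncJob

# Hypothetical facade: it re-exports the object itself, with no wrapper or copy.
# In a real package this would be a plain `from ... import SyncJob` line.
facade = types.ModuleType("demo_facade")
facade.SyncJob = internal.SyncJob
facade.__all__ = ["SyncJob"]


def non_identical_exports(facade_module, internal_module):
    """Return the facade names that are not the very same object internally."""
    missing = object()
    return [
        name
        for name in facade_module.__all__
        if getattr(facade_module, name) is not getattr(internal_module, name, missing)
    ]


# Old-path and new-path imports resolve to one object, so isinstance checks agree.
job = internal.SyncJob()
print(non_identical_exports(facade, internal), isinstance(job, facade.SyncJob))
```

Because the facade hands out the same objects, repointing an import from the internal path to the facade path should not change runtime behavior, which is presumably why the sweep can be validated mostly with static checks.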

Deferred (future Phase 2 PRs)

  • Presentation wave — the 16 views keep their import-linter allowances (auto-added by the structural move) and still reach internals; thinning them to use the facade is the next PR.
  • Contracts — the external surface here is operational wiring (re-exports), so facade/api.py is re-exports, not model→contract functions (product:lint notes this as a non-blocking warning in lenient mode). Contract-data functions + contracts.py come with the presentation wave.
  • The test-helper legacy-leak shrinks when those fixtures relocate; backend:contract-check is the final flip once all leaks are gone.

How did you test this code?

Automated, run locally:

  • tach check --dependencies --interfaces → All modules validated (data_warehouse enforced as an interface; the Dagster channel rerouted through facade.dags).
  • lint-imports → presentation contract KEPT.
  • hogli product:lint data_warehouse → exit 0 (expected lenient-mode warnings: legacy-leak + re-export facade).
  • Startup import-budget → pass (the light/heavy facade split keeps table.py/database.py off the heavy temporal/s3 path).
  • ruff + py_compile clean on all 52 changed files.
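
The startup import-budget idea can be approximated with a small check like the one below. This is only a sketch under assumptions: it is not the project's actual budget check, and `leaked_imports` is a hypothetical helper. It imports one module in a fresh interpreter and reports which "heavy" modules ended up in `sys.modules`. Standard-library modules stand in for the light and heavy facade submodules so the example runs anywhere.

```python
import json
import subprocess
import sys

# Code run in a child interpreter so the parent's already-imported modules
# cannot mask (or fake) a leak.
_CHILD = """
import importlib, json, sys
target, forbidden = sys.argv[1], sys.argv[2:]
importlib.import_module(target)
print(json.dumps([name for name in forbidden if name in sys.modules]))
"""


def leaked_imports(target, forbidden):
    """Import `target` in a fresh process; return the forbidden modules it loaded."""
    result = subprocess.run(
        [sys.executable, "-c", _CHILD, target, *forbidden],
        capture_output=True,
        text=True,
        check=True,
    )
    # The child prints the JSON list last; tolerate any earlier startup output.
    return json.loads(result.stdout.strip().splitlines()[-1])


if __name__ == "__main__":
    # `colorsys` plays the light module, `fractions` one that pulls in `decimal`.
    print(leaked_imports("colorsys", ["decimal"]))
    print(leaked_imports("fractions", ["decimal"]))
```

In a setting like this PR, the target would be a setup-path module and the forbidden list would name the temporal and S3 modules the light facade submodules are meant to avoid.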

Behavioral product tests run in CI (the sweep is identity-preserving).

Docs update

No user-facing docs changes.

Light submodules (facade.sources constants, facade.models, facade.hogql sync-status, facade.tasks beat task, facade.dags Dagster assets) + heavy facade.api (temporal schedule/s3/reconcile/webhook ops, off the setup path). Adds data_warehouse to the canonical [[interfaces]] block + a legacy-leak for the test helpers. Resolves the ws<->dw cycle by routing it through the facade.
@Gilbert09 requested review from a team as code owners on June 25, 2026, 21:11
@Gilbert09 self-assigned this on Jun 25, 2026
@assign-reviewers-posthog (bot) requested review from a team on June 25, 2026, 21:11
@assign-reviewers-posthog


👀 Auto-assigned reviewers

These soft owners were skipped because they only have minor changes here. Nothing blocks merge, so self-assign if you'd like a look:

  • @PostHog/team-self-driving (products/signals/**)
  • @PostHog/team-data-tools (posthog/hogql/**)

Soft owners come from CODEOWNERS-soft and each product's product.yaml. Generated files and lockfiles are ignored when deciding ownership.

@greptile-apps

greptile-apps Bot commented Jun 25, 2026

Contributor

Reviews (1): Last reviewed commit: "feat(data-warehouse): add facade and enf..."

facade.api eagerly imported logic.data_load -> warehouse_sources -> facade.api (circular import via the ws<->dw cycle). Make it a PEP 562 lazy aggregator so names resolve on first access, breaking the import-time cycle (and keeping it off the setup path). Move data_warehouse to a dedicated [[interfaces]] block (the canonical alternation requires facade/contracts.py, which the operational-wiring facade doesn't have yet).
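
A PEP 562 lazy aggregator of the kind this commit describes might look roughly like the sketch below. It is an assumption-laden illustration, not the PR's `facade/api.py`: the export table, the `demo_facade_api` module name, and `make_lazy_facade` are hypothetical, and standard-library functions stand in for the heavy internals. In a real module file, `__getattr__` would simply be defined at module level; here a module object is built by hand so the example is self-contained.

```python
import importlib
import sys
import types

# Hypothetical export table: public facade name -> (module path, attribute).
# Nothing on the right-hand side is imported until the name is first accessed.
_LAZY_EXPORTS = {
    "dumps": ("json", "dumps"),
    "sqrt": ("math", "sqrt"),
}


def make_lazy_facade(name, exports):
    """Build a module whose attributes resolve on first access (PEP 562)."""
    facade = types.ModuleType(name)

    def __getattr__(attr):
        # Called only when normal module attribute lookup fails.
        try:
            module_path, target = exports[attr]
        except KeyError:
            raise AttributeError(
                f"module {name!r} has no attribute {attr!r}"
            ) from None
        value = getattr(importlib.import_module(module_path), target)
        setattr(facade, attr, value)  # cache so later lookups skip this hook
        return value

    facade.__getattr__ = __getattr__
    facade.__all__ = sorted(exports)
    sys.modules[name] = facade  # make `from demo_facade_api import ...` work
    return facade


facade_api = make_lazy_facade("demo_facade_api", _LAZY_EXPORTS)
```

Since importing the facade no longer imports its targets, a module that the targets themselves import (the warehouse_sources leg of the cycle) can import the facade without re-entering a half-initialized module. The trade-off is that an incorrect entry in the table fails at first access rather than at import time.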
@github-actions

github-actions Bot commented Jun 25, 2026

Contributor

🎭 Playwright report · View test results →

⚠️ 1 flaky test:

  • View persons list, navigate to detail, and browse tabs (chromium)

These issues are not necessarily caused by your changes.
Annoyed by this comment? Help fix flakies and failures and it'll disappear!

@tests-posthog

tests-posthog Bot commented Jun 25, 2026

Contributor

Query snapshots: Backend query snapshots updated

Changes: 1 snapshots (1 modified, 0 added, 0 deleted)

What this means:

  • Query snapshots have been automatically updated to match current output
  • These changes reflect modifications to database queries or schema

Next steps:

  • Review the query changes to ensure they're intentional
  • If unexpected, investigate what caused the query to change

Review snapshot changes →
