Skip to content

Commit 5378f47

Browse files
committed
Wire GitHub push webhook to dashboard_sync enqueue
Close the live-update path for GitHub-backed dashboard templates: a new POST /docverse/webhooks/github endpoint verifies the HMAC, parses the push payload, and enqueues one dashboard_sync job per binding whose root_path was actually touched by the push. Truncated payloads fall back to the GitHub compare API; mis-signed deliveries get 401; deployments without GitHub App secrets configured see a 404. Key decisions: - Route HMAC validation through gidgethub.sansio.Event.from_http and dispatch through gidgethub.routing.Router rather than safir.github (which only exposes Pydantic webhook payload models — no router or HMAC helper, despite the issue body's hopeful "or the equivalent registration API" wording). The gidgethub router is held at module scope: registering a fresh router per request would defeat its registration model. Keeps the handler aligned with Times Square / Semaphore and avoids hand-rolling HMAC, satisfying user story 23. - Feature-disabled (no GitHub App config) returns 404, not 503. Caught at handler entry by routing both ``get_github_webhook_secret`` and ``create_push_event_processor`` through ``GitHubAppNotConfiguredError``. The 503 used by the org-admin binding endpoints (#235) would surface as a permanent GitHub redelivery loop on a misconfigured deployment; 404 hides the URL until secrets are wired and lets GitHub stop retrying. - Added ``Factory.get_github_webhook_secret`` (public accessor) so the webhook handler does not reach into ``_github_webhook_secret`` directly. Mirrors ``create_github_app_client``: validates that all three secrets are set and raises the same ``GitHubAppNotConfiguredError`` so the handler's single except-clause covers both paths. - ``PushEventProcessor.process`` returns ``[]`` rather than raising on no-op cases (no bindings match repo/ref; no binding root_paths intersect changed paths; truncated payload with no before/after). The webhook handler's surrounding ``session.begin()`` + commit still runs on the empty path because logging the no-op outcome is cheap and the empty enqueue list is the natural success result — the Slack-integration loop in ``main.py`` makes the same trade. - ``_root_path_matches`` normalises ``root_path`` exactly the way ``GitHubTreeFetcher._normalize_root`` does: ``"/"`` strips to ``""`` and matches any non-empty changed-path set, while ``"templates/blue"`` matches paths that start with ``"templates/blue/"`` or equal the bare directory entry. Keeps the webhook filter and the syncer's filter in lockstep so a binding never gets a sync enqueued for a path the syncer would then have ignored as out-of-scope. - ``DashboardGitHubTemplateBindingStore.list_by_repo_ref`` is keyed by name (``github_owner``/``github_repo``/``github_ref``), not by numeric ID. Numeric-ID lookup is task #240's job; this slice matches by the same composite name index (``idx_dashboard_github_template_bindings_repo_ref``) that the PRD identifies as the push-lookup index for v1, with the rename- robustness ID-preferential lookup deferred per the PRD's "GitHub identity stability" section. - The webhook handler's gidgethub router is registered with a single ``"push"`` callback. ``ping`` and any other event types arrive, signature-verify, and return 200 with no enqueue — gidgethub's ``dispatch`` is a fan-out to a possibly empty callback set, so unrelated events naturally no-op. This satisfies user story 23 ("does not hand-roll HMAC or event parsing") and keeps GitHub's redelivery machinery quiet for events we deliberately ignore. - Test fixture ``github_app_enabled`` toggles the three GitHub App secrets on the autouse ``context_dependency`` for the lifetime of one test and restores them on teardown. The default-disabled state preserved between tests is what makes the 404-feature-disabled test trivial: it just hits the URL with no fixture and asserts on the absence of GitHub config. Next-iteration notes: - Slice #240/#242 (rename-robustness) will add an ID-preferential binding lookup alongside ``list_by_repo_ref``. The push processor can then consult ``list_by_repo_id_ref`` first and fall back to the name-based path for un-synced bindings. The handler's plumbing does not need to change — only the processor's binding-lookup step. - The handler currently registers only ``push`` events. Adding ``installation`` (created/deleted/suspend/unsuspend) and ``repository.renamed`` / ``repository.transferred`` / ``organization.renamed`` event handlers is task #242's scope; they plug into the same ``_event_router.register(...)`` decorator, so no further wiring change is needed. The ``create_push_event_ processor`` factory method will likely sprout sibling ``create_installation_event_processor`` / ``create_rename_event_processor`` methods. - Squarebot/Kafka event delivery (PRD §"Out of Scope") still goes through this same ``PushEventProcessor`` if it ever lands — the only thing the Squarebot path would change is the request entry point (a Kafka consumer instead of a FastAPI endpoint). The processor itself takes a raw payload dict, so swapping the input layer is a tracer-bullet exercise. - Payload field extraction is duck-typed off the raw push dict (``payload["repository"]["owner"]["login"]`` etc.) rather than going through ``safir.github.GitHubPushEventModel``. The model only carries ``repository`` / ``installation`` / ``ref`` / ``before`` / ``after`` — it doesn't include the ``commits`` array that ``extract_changed_paths_from_push`` reads — so once we have to drop into the dict for ``commits`` and ``size`` anyway, parsing the rest off the dict is the simpler shape. A future refactor could parse a richer model and pass it through; the dict path is the minimum needed today. Closes #238
1 parent 8da2ae1 commit 5378f47

10 files changed

Lines changed: 1260 additions & 0 deletions

File tree

src/docverse/factory.py

Lines changed: 47 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -21,6 +21,7 @@
2121
DashboardSyncEnqueuer,
2222
DashboardTemplateBindingService,
2323
DashboardTemplateSyncer,
24+
PushEventProcessor,
2425
TemplateResolver,
2526
)
2627
from .services.edition import EditionService
@@ -416,6 +417,52 @@ def create_dashboard_template_syncer(self) -> DashboardTemplateSyncer:
416417
logger=self._logger,
417418
)
418419

420+
def get_github_webhook_secret(self) -> str:
421+
"""Return the configured GitHub webhook HMAC secret.
422+
423+
Callers (the webhook handler) need the secret to verify the
424+
``x-hub-signature-256`` header on each delivery. Routes
425+
through this accessor instead of reading the private
426+
attribute directly so the GitHub App's "all-or-nothing"
427+
invariant is enforced in one place.
428+
429+
Raises
430+
------
431+
GitHubAppNotConfiguredError
432+
If any of the three GitHub App secrets is unset.
433+
"""
434+
if (
435+
self._github_app_id is None
436+
or self._github_app_private_key is None
437+
or self._github_webhook_secret is None
438+
):
439+
msg = "GitHub App is not configured"
440+
raise GitHubAppNotConfiguredError(msg)
441+
return self._github_webhook_secret.get_secret_value()
442+
443+
def create_push_event_processor(self) -> PushEventProcessor:
444+
"""Create a :class:`PushEventProcessor` for the webhook handler.
445+
446+
Raises
447+
------
448+
GitHubAppNotConfiguredError
449+
If the GitHub App feature is not configured. The webhook
450+
handler translates this into a 404 response so the feature
451+
stays cleanly disabled when secrets are unset.
452+
RuntimeError
453+
If the shared HTTP client is not configured.
454+
"""
455+
if self._http_client is None:
456+
msg = "HTTP client is required to build a PushEventProcessor"
457+
raise RuntimeError(msg)
458+
return PushEventProcessor(
459+
binding_store=self.create_dashboard_github_template_binding_store(),
460+
enqueuer=self.create_dashboard_sync_enqueuer(),
461+
app_client=self.create_github_app_client(),
462+
http_client=self._http_client,
463+
logger=self._logger,
464+
)
465+
419466
def create_dashboard_publisher(self) -> DashboardPublisher:
420467
"""Create a DashboardPublisher for one render.
421468
Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,7 @@
1+
"""HTTP handlers for GitHub webhooks."""
2+
3+
from __future__ import annotations
4+
5+
from .github import router as webhook_router
6+
7+
__all__ = ["webhook_router"]
Lines changed: 129 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,129 @@
1+
"""GitHub webhook endpoint dispatching push events to the sync queue."""
2+
3+
from __future__ import annotations
4+
5+
from typing import Annotated
6+
7+
import gidgethub
8+
from fastapi import APIRouter, Depends, HTTPException, Request, status
9+
from gidgethub import sansio
10+
from gidgethub.routing import Router as GidgethubRouter
11+
12+
from docverse.dependencies.context import RequestContext, context_dependency
13+
from docverse.services.dashboard_templates import PushEventProcessor
14+
from docverse.storage.github import GitHubAppNotConfiguredError
15+
16+
__all__ = ["router"]
17+
18+
router = APIRouter(include_in_schema=False)
19+
"""FastAPI router for GitHub webhook endpoints.
20+
21+
Mounted under ``config.path_prefix`` from ``main.py`` so the public
22+
URL is ``POST {path_prefix}/webhooks/github``. Excluded from the
23+
OpenAPI schema because the API surface is GitHub's webhook contract,
24+
not the Docverse REST API.
25+
"""
26+
27+
28+
_event_router = GidgethubRouter()
29+
"""Module-level gidgethub router for event-type dispatch.
30+
31+
Using a fresh router per request would defeat gidgethub's intended
32+
registration model and force every callback to re-bind on every
33+
delivery. Holding it at module scope mirrors the pattern in
34+
``safir.github`` and Times Square / Semaphore: register handlers
35+
once, dispatch many.
36+
"""
37+
38+
39+
@_event_router.register("push")
40+
async def _handle_push(
41+
event: sansio.Event,
42+
*,
43+
processor: PushEventProcessor,
44+
context: RequestContext,
45+
) -> None:
46+
"""Translate a push event into ``dashboard_sync`` enqueues.
47+
48+
The processor owns transaction-free DB writes through the
49+
enqueuer; the handler wraps both the binding lookup and the
50+
enqueue in a single ``session.begin()`` so a failure aborts the
51+
whole webhook delivery cleanly.
52+
"""
53+
async with context.session.begin():
54+
jobs = await processor.process(event.data)
55+
await context.session.commit()
56+
context.logger.info(
57+
"Processed push webhook",
58+
delivery_id=event.delivery_id,
59+
enqueued=len(jobs),
60+
)
61+
62+
63+
@router.post(
64+
"/webhooks/github",
65+
status_code=status.HTTP_200_OK,
66+
summary="GitHub App webhook receiver",
67+
name="github_webhook",
68+
)
69+
async def post_github_webhook(
70+
request: Request,
71+
context: Annotated[RequestContext, Depends(context_dependency)],
72+
) -> dict[str, str]:
73+
"""Receive a GitHub webhook delivery and dispatch by event type.
74+
75+
Returns ``404`` when the GitHub App feature is not configured
76+
(any of ``github_app_id`` / ``github_app_private_key`` /
77+
``github_webhook_secret`` unset). This keeps the URL effectively
78+
invisible to a misconfigured deployment without surfacing a 5xx
79+
that would page operators on every GitHub redelivery attempt.
80+
81+
Returns ``401`` when the request is unsigned or the HMAC does
82+
not match the configured webhook secret. ``415`` is returned by
83+
``gidgethub`` directly when the content-type is wrong.
84+
85+
Returns ``200`` for all signed deliveries — including event
86+
types this app does not subscribe to — so GitHub's redelivery
87+
machinery does not retry deliveries we have intentionally
88+
chosen not to act on.
89+
"""
90+
try:
91+
secret = context.factory.get_github_webhook_secret()
92+
processor = context.factory.create_push_event_processor()
93+
except GitHubAppNotConfiguredError as exc:
94+
raise HTTPException(
95+
status_code=status.HTTP_404_NOT_FOUND,
96+
detail="GitHub App is not configured",
97+
) from exc
98+
99+
body = await request.body()
100+
try:
101+
event = sansio.Event.from_http(
102+
_lowercase_headers(request.headers), body, secret=secret
103+
)
104+
except gidgethub.ValidationFailure as exc:
105+
raise HTTPException(
106+
status_code=status.HTTP_401_UNAUTHORIZED,
107+
detail="Invalid webhook signature",
108+
) from exc
109+
110+
context.rebind_logger(
111+
github_event=event.event, github_delivery_id=event.delivery_id
112+
)
113+
114+
await _event_router.dispatch(event, processor=processor, context=context)
115+
116+
return {"status": "ok"}
117+
118+
119+
def _lowercase_headers(headers: object) -> dict[str, str]:
120+
"""Return a dict of request headers with lower-cased keys.
121+
122+
``gidgethub.sansio.Event.from_http`` expects a mapping that
123+
supports lower-cased keys; FastAPI's ``Request.headers`` already
124+
is case-insensitive but ``Event.from_http`` indexes via
125+
``headers["x-github-event"]`` directly, so a plain dict with
126+
pre-lower-cased keys is the safest contract.
127+
"""
128+
items = headers.items() if hasattr(headers, "items") else []
129+
return {str(k).lower(): str(v) for k, v in items}

src/docverse/main.py

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -26,6 +26,7 @@
2626
from .handlers.internal import internal_router
2727
from .handlers.orgs import orgs_router
2828
from .handlers.queue import queue_router
29+
from .handlers.webhooks import webhook_router
2930
from .services.credential_encryptor import CredentialEncryptor
3031
from .storage.user_info_store import GafaelfawrUserInfoStore
3132

@@ -138,6 +139,7 @@ async def lifespan(app: FastAPI) -> AsyncIterator[None]: # noqa: ARG001
138139
app.include_router(admin_router, prefix=config.path_prefix)
139140
app.include_router(orgs_router, prefix=config.path_prefix)
140141
app.include_router(queue_router, prefix=config.path_prefix)
142+
app.include_router(webhook_router, prefix=config.path_prefix)
141143
app.add_middleware(XForwardedMiddleware)
142144

143145
if config.slack_webhook:

src/docverse/services/dashboard_templates/__init__.py

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,7 @@
88
)
99
from .enqueue import DashboardSyncEnqueuer
1010
from .fanout import DashboardRebuildFanout
11+
from .push_processor import PushEventProcessor
1112
from .resolver import (
1213
ResolvedTemplate,
1314
ResolvedTemplateOrigin,
@@ -22,6 +23,7 @@
2223
"DashboardTemplateBindingService",
2324
"DashboardTemplateSyncError",
2425
"DashboardTemplateSyncer",
26+
"PushEventProcessor",
2527
"ResolvedTemplate",
2628
"ResolvedTemplateOrigin",
2729
"TemplateResolver",

0 commit comments

Comments
 (0)