DM-54689: Wire stable GitHub IDs through sync + push#274

Open
jonathansick wants to merge 10 commits into main from tickets/DM-54689-stable-github-ids

Conversation

@jonathansick
Member

Summary

  • Capture GitHub's stable repo / owner / installation IDs on first successful sync and write them onto the binding row + content row, so internal matching survives a GitHub repo or org rename without operator intervention.
  • Rework PushEventProcessor to look up bindings primarily by (github_repo_id, github_ref) (using the idx_dashboard_github_template_bindings_repo_id_ref index from DM-54689: Add nullable GitHub numeric ID columns to dashboard-template foundation #240) and union with a name-based fallback restricted to un-synced bindings (github_repo_id IS NULL); dedup by binding id before the root-path filter.
  • Surface the captured github_owner_id / github_repo_id / github_installation_id on the dashboard-template binding GET response for operator debugging — public API stays name-keyed.

Validation steps

  • Create an org-default binding via PUT /docverse/orgs/{org}/dashboard-template; confirm the returned body has github_owner_id, github_repo_id, and github_installation_id set to null, and last_sync_status="pending".
  • Trigger a sync (force-sync endpoint or initial enqueue from PUT) and confirm the binding GET response now shows the three numeric IDs populated, and the dashboard_github_templates content row carries github_owner_id / github_repo_id.
  • Re-sync the same binding (force-sync) and confirm none of the three numeric IDs revert to null.
  • Send a signed push webhook whose repository.id matches a synced binding but whose repository.name is different (rename simulation); confirm a dashboard_sync job is enqueued.
  • Send a signed push webhook for an un-synced binding (matching by (owner, repo, ref) strings only); confirm a dashboard_sync job is enqueued.
  • Send a signed push webhook whose (owner, repo, ref) matches a synced binding but whose repository.id differs; confirm no dashboard_sync job is enqueued (stale-name guard).

References

Close the live-update path for GitHub-backed dashboard templates: a
new POST /docverse/webhooks/github endpoint verifies the HMAC, parses
the push payload, and enqueues one dashboard_sync job per binding
whose root_path was actually touched by the push. Truncated payloads
fall back to the GitHub compare API; mis-signed deliveries get 401;
deployments without GitHub App secrets configured see a 404.

Key decisions:
- Route HMAC validation through gidgethub.sansio.Event.from_http and
  dispatch through gidgethub.routing.Router rather than safir.github
  (which only exposes Pydantic webhook payload models — no router or
  HMAC helper, despite the issue body's hopeful "or the equivalent
  registration API" wording). The gidgethub router is held at module
  scope: registering a fresh router per request would defeat its
  registration model. Keeps the handler aligned with Times Square /
  Semaphore and avoids hand-rolling HMAC, satisfying user story 23.
- Feature-disabled (no GitHub App config) returns 404, not 503.
  Caught at handler entry by routing both
  ``get_github_webhook_secret`` and ``create_push_event_processor``
  through ``GitHubAppNotConfiguredError``. The 503 used by the
  org-admin binding endpoints (#235) would surface as a permanent
  GitHub redelivery loop on a misconfigured deployment; 404 hides
  the URL until secrets are wired and lets GitHub stop retrying.
- Added ``Factory.get_github_webhook_secret`` (public accessor) so
  the webhook handler does not reach into ``_github_webhook_secret``
  directly. Mirrors ``create_github_app_client``: validates that all
  three secrets are set and raises the same
  ``GitHubAppNotConfiguredError`` so the handler's single
  except-clause covers both paths.
- ``PushEventProcessor.process`` returns ``[]`` rather than raising
  on no-op cases (no bindings match repo/ref; no binding root_paths
  intersect changed paths; truncated payload with no before/after).
  The webhook handler's surrounding ``session.begin()`` + commit
  still runs on the empty path because logging the no-op outcome is
  cheap and the empty enqueue list is the natural success result —
  the Slack-integration loop in ``main.py`` makes the same trade.
- ``_root_path_matches`` normalises ``root_path`` exactly the way
  ``GitHubTreeFetcher._normalize_root`` does: ``"/"`` strips to ``""``
  and matches any non-empty changed-path set, while ``"templates/blue"``
  matches paths that start with ``"templates/blue/"`` or equal the
  bare directory entry. Keeps the webhook filter and the syncer's
  filter in lockstep so a binding never gets a sync enqueued for a
  path the syncer would then have ignored as out-of-scope.
- ``DashboardGitHubTemplateBindingStore.list_by_repo_ref`` is keyed
  by name (``github_owner``/``github_repo``/``github_ref``), not by
  numeric ID. Numeric-ID lookup is task #240's job; this slice
  matches by the same composite name index
  (``idx_dashboard_github_template_bindings_repo_ref``) that the PRD
  identifies as the push-lookup index for v1, with the rename-
  robustness ID-preferential lookup deferred per the PRD's "GitHub
  identity stability" section.
- The webhook handler's gidgethub router is registered with a single
  ``"push"`` callback. ``ping`` and any other event types arrive,
  signature-verify, and return 200 with no enqueue — gidgethub's
  ``dispatch`` is a fan-out to a possibly empty callback set, so
  unrelated events naturally no-op. This satisfies user story 23
  ("does not hand-roll HMAC or event parsing") and keeps GitHub's
  redelivery machinery quiet for events we deliberately ignore.
- Test fixture ``github_app_enabled`` toggles the three GitHub App
  secrets on the autouse ``context_dependency`` for the lifetime of
  one test and restores them on teardown. The default-disabled state
  preserved between tests is what makes the 404-feature-disabled
  test trivial: it just hits the URL with no fixture and asserts on
  the absence of GitHub config.
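
The ``_root_path_matches`` normalisation described above can be sketched as a standalone function (illustrative; the real method lives on the processor):

```python
def root_path_matches(root_path: str, changed_paths: list[str]) -> bool:
    """Illustrative version of the filter: "/" normalises to "" and
    matches any non-empty changed-path set; a nested root matches paths
    under "<root>/" or the bare directory entry itself."""
    root = root_path.strip("/")
    if not changed_paths:
        return False
    if root == "":
        return True
    return any(p == root or p.startswith(root + "/") for p in changed_paths)
```

Note the ``root + "/"`` prefix: it is what keeps ``templates/blue`` from matching ``templates/bluegreen/...``.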

Next-iteration notes:
- Slice #240/#242 (rename-robustness) will add an ID-preferential
  binding lookup alongside ``list_by_repo_ref``. The push processor
  can then consult ``list_by_repo_id_ref`` first and fall back to
  the name-based path for un-synced bindings. The handler's plumbing
  does not need to change — only the processor's binding-lookup step.
- The handler currently registers only ``push`` events. Adding
  ``installation`` (created/deleted/suspend/unsuspend) and
  ``repository.renamed`` / ``repository.transferred`` /
  ``organization.renamed`` event handlers is task #242's scope; they
  plug into the same ``_event_router.register(...)`` decorator, so
  no further wiring change is needed. The
  ``create_push_event_processor`` factory method will likely sprout
  sibling ``create_installation_event_processor`` /
  ``create_rename_event_processor`` methods.
- Squarebot/Kafka event delivery (PRD §"Out of Scope") still goes
  through this same ``PushEventProcessor`` if it ever lands — the
  only thing the Squarebot path would change is the request entry
  point (a Kafka consumer instead of a FastAPI endpoint). The
  processor itself takes a raw payload dict, so swapping the input
  layer is a tracer-bullet exercise.
- Payload field extraction is duck-typed off the raw push dict
  (``payload["repository"]["owner"]["login"]`` etc.) rather than
  going through ``safir.github.GitHubPushEventModel``. The model
  only carries ``repository`` / ``installation`` / ``ref`` /
  ``before`` / ``after`` — it doesn't include the ``commits`` array
  that ``extract_changed_paths_from_push`` reads — so once we have
  to drop into the dict for ``commits`` and ``size`` anyway, parsing
  the rest off the dict is the simpler shape. A future refactor
  could parse a richer model and pass it through; the dict path is
  the minimum needed today.
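
A sketch of the duck-typed extraction plus the truncation signal (the function name and the 20-commit cap mirror descriptions elsewhere in this PR; this is not the exact implementation):

```python
def extract_changed_paths(payload: dict) -> tuple[set[str], bool]:
    """Union added/modified/removed across the payload's commits and
    flag truncation: a push payload caps its commits array, so either
    size > len(commits) or an at-cap array means the full diff must
    come from the compare API instead (the cap of 20 mirrors the
    values used in this PR's tests)."""
    commits = payload.get("commits", [])
    size = payload.get("size", len(commits))
    truncated = size > len(commits) or len(commits) >= 20
    paths: set[str] = set()
    for commit in commits:
        for key in ("added", "modified", "removed"):
            paths.update(commit.get(key, []))
    return paths, truncated
```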

Closes #238

Three review-driven polishes on the GitHub webhook entry point: type
``_lowercase_headers`` against ``Mapping[str, str]`` (FastAPI's
``Request.headers`` already satisfies it, so the ``hasattr`` guard
was dead code), collapse ``Factory.get_github_webhook_secret`` and
``create_push_event_processor`` into a single
``create_webhook_dispatch`` accessor so the handler's
``except GitHubAppNotConfiguredError`` wraps one call site, and drop
the redundant ``delivery_id`` field from the ``Processed push
webhook`` log line — ``github_delivery_id`` is already bound on the
context logger and the duplicate key fragmented log greppability.

Key decisions:
- ``create_webhook_dispatch`` returns the bare ``(secret, processor)``
  tuple rather than a small dataclass / namedtuple. The handler is
  the only caller and immediately destructures the pair, so giving
  the bundle a name would be ceremony for one call site. If a
  second caller materializes (e.g. a Squarebot-mediated webhook
  path), promoting to a typed container is the cheap follow-up.
- The new accessor performs its own three-secret presence check up
  front rather than letting ``create_github_app_client`` raise from
  inside the processor build. This keeps the single
  ``GitHubAppNotConfiguredError`` raise the handler observes
  cheap (no transitive imports / object construction before we
  decide the feature is off) and matches the pattern the displaced
  ``get_github_webhook_secret`` already used.
- Both replaced methods (``get_github_webhook_secret`` and
  ``create_push_event_processor``) had no other call sites, so they
  are deleted outright rather than left as deprecated shims. CLAUDE.md
  guidance is to skip backwards-compatibility hacks when the codebase
  has the only callers in scope.

Next-iteration notes:
- None. The rename-robustness handlers (#240/#242) will add sibling
  ``create_installation_event_processor`` /
  ``create_rename_event_processor`` factory methods that follow the
  same single-accessor shape — they don't need to thread through
  ``create_webhook_dispatch``.

Closes #266

Two test-infrastructure improvements removing direct private-attr
access from the webhook test suite (and the broader test suite that
copied the same shape). ``ContextDependency.set_github_secrets`` lets
the ``github_app_enabled`` fixture (and any future caller) toggle the
three GitHub-App secret slots without poking the singleton's private
attributes, and a new ``tests/support/arq_testing.py`` module
centralizes ``MockArqQueue._job_metadata`` reads behind
``count_jobs_by_name`` / ``get_jobs_by_name`` / ``queue_names``
helpers.
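
The helper shapes can be sketched over a plain mapping (the real helpers take the ``MockArqQueue`` and centralize its ``_job_metadata`` reads in one place; ``JobMetadata`` here is a minimal stand-in):

```python
from dataclasses import dataclass


@dataclass
class JobMetadata:
    """Minimal stand-in for the arq job-metadata record."""

    name: str


def get_jobs_by_name(
    job_metadata: dict[str, dict[str, JobMetadata]], name: str
) -> list[JobMetadata]:
    """Flatten every queue's jobs and keep those matching the task name."""
    return [j for queue in job_metadata.values() for j in queue.values() if j.name == name]


def count_jobs_by_name(job_metadata: dict[str, dict[str, JobMetadata]], name: str) -> int:
    return len(get_jobs_by_name(job_metadata, name))


def queue_names(job_metadata: dict[str, dict[str, JobMetadata]]) -> set[str]:
    return set(job_metadata)
```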

Key decisions:
- ``set_github_secrets(*, app_id, private_key, webhook_secret)`` is
  the supported test-fixture toggle, not ``initialize``. The existing
  async ``initialize`` covers the broader process-lifetime
  configuration (eight optional kwargs, including
  ``user_info_store`` / ``credential_encryptor`` / ``discovery``) and
  has the "only honor non-None values" semantics that the autouse
  ``app`` fixture needs at startup. A test fixture toggling the
  feature on for one test and off again on teardown wants the
  opposite contract: ``None`` should clear the slot. The new method
  is sync, scoped to the three GitHub-App fields, and treats every
  argument as authoritative.
- The migration scope went past the three call sites the issue
  body listed (``webhooks_github_test``, ``queue_backend_test``,
  ``build_processing_test``) and swept up the other six tests that
  had copy-pasted the same ``[j for queue in mock._job_metadata.values()
  for j in queue.values() if j.name == NAME]`` shape (``publish_edition_test``,
  ``edition_patch_override_test``, ``edition_rollback_test``,
  ``admin_dashboard_template_test``, ``dashboard_template_test``,
  ``dashboard_enqueue_triggers_test``). Without that the acceptance
  criterion's ``grep -R "_job_metadata" tests/`` would still hit the
  copies and the centralization would be only nominally complete. The
  remaining grep matches in ``tests/storage/queue_backend_test.py``
  are calls to the *public* ``ArqQueueBackend.get_job_metadata``
  method, not private-attr reads — they are unrelated to the
  centralization the issue is asking for.
- Added a third helper ``queue_names(arq_queue) -> set[str]`` so
  ``build_processing_test``'s ``"arq:queue" not in
  mock_arq._job_metadata`` invariant (jobs landed under the
  configured queue name, not arq's default) can be expressed without
  reaching for the private attribute. The issue body listed only
  ``count_jobs_by_name`` and ``get_jobs_by_name``; the third helper
  is the cheapest way to satisfy the acceptance grep without
  inventing a workaround at one call site.
- Added a fourth helper ``register_queue(arq_queue, name)`` purely so
  ``arq_testing_test.py`` can exercise the multi-queue branch of
  ``get_jobs_by_name`` / ``count_jobs_by_name`` without needing to
  touch ``_job_metadata`` itself. ``MockArqQueue`` only auto-creates
  the slot for its ``default_queue_name``; enqueueing into any other
  queue raises ``KeyError`` until the slot is added. This keeps the
  helper's own test on the public-helper side of the centralization
  line.
- The ``github_app_enabled`` fixture still reads
  ``context_dependency._github_app_id`` (etc.) when capturing the
  prior state for restoration on teardown. The acceptance criterion
  is scoped to "private-attr *assignment*"; the issue's
  out-of-scope follow-up note covers the broader read-of-private-
  attr cleanup. Adding read accessors today would be API-creep
  without a paying caller.

Next-iteration notes:
- ``tests/conftest.py:119-125`` still pokes
  ``_superadmin_usernames`` and ``_user_info_store`` directly on the
  singleton. The issue body called this out explicitly as a parallel
  follow-up. The pattern needed is the same — a public
  ``set_admin_overrides(...)`` (or two narrow setters) on
  ``ContextDependency`` and a fixture migration. Out of scope here.
- The ``register_queue`` helper exists today only because the
  helper's own test needs to enqueue into a non-default queue. If
  a future production-side test materializes that legitimately
  needs more than one queue (rather than just inspecting an
  off-default-queue assertion), promote ``register_queue`` to a
  documented part of the helper API; otherwise it can stay as the
  small one-liner it is.
- Six adjacent test files now import from
  ``tests.support.arq_testing``. If ``safir.arq`` ever exposes a
  public iteration API, the four ``# noqa: SLF001`` lines inside
  ``arq_testing.py`` are the single place to swap the implementation
  — no caller-side change needed.

Closes #269

Both create_github_app_client and create_webhook_dispatch hand-rolled
the same three-way "is None" check on the GitHub App secrets and
raised an identical GitHubAppNotConfiguredError. Collapse the two
sites onto a single private _require_github_app_config helper that
returns the resolved (app_id, private_key, webhook_secret) tuple
typed as non-None, so a future configuration-shape change cannot
drift between the two callers.

Key decisions:
- The helper returns a bare tuple rather than a dataclass / NamedTuple.
  Both call sites destructure once and ignore the slots they don't
  need (``_, _, webhook_secret`` and ``app_id, private_key, _``). A
  named container would be ceremony for two call sites whose use of
  the unused slots is a single underscore. If a third caller
  materializes (e.g. installation/rename event processors per task
  #242) and starts paying attention to a third field, promoting to a
  named bundle is the cheap follow-up.
- The helper is private (``_require_*``) because it leaks the bare
  ``SecretStr`` instances rather than their decoded values. The two
  callers each handle ``.get_secret_value()`` themselves at the seam
  where the secret crosses into the GitHub-App / HMAC-verifier
  boundary, which keeps the helper safe to call from any future
  factory method without deciding the secret's lifetime.
- The HTTP-client check stays inline in each caller. Only
  ``create_webhook_dispatch`` and ``create_github_app_client`` need
  the shared client, and each raises a method-specific
  ``RuntimeError`` message; folding that into the helper would force
  every future caller to also need an HTTP client (e.g. an
  HMAC-only verifier doesn't), so the responsibility stays scoped.

Next-iteration notes:
- Slice #242's rename / installation event processors will need the
  same three secrets; they can call ``_require_github_app_config``
  directly. Promoting the helper to public (drop the underscore) is
  not necessary unless a non-Factory caller wants the same
  precondition check.

Closes #267

Two style fixes in push_processor.py from PR #264 review: narrow
_root_path_matches's changed_paths parameter from Iterable[str] to
Sequence[str] so the empty-check can use bool(changed_paths), which is
sound on a sequence but always True on a generator that any() would
have consumed; and collapse process()'s payload.get("repository") or
{} / repo.get("owner") or {} chain to the cleaner two-arg
.get(..., {}) form.
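
The latent bug the narrowing forecloses, demonstrated with stdlib types:

```python
def paths_matter(changed_paths) -> bool:
    """The empty-check in question: sound on a Sequence (len-backed
    truthiness) but unconditionally True on a generator."""
    return bool(changed_paths)


# Sound on a sequence:
assert paths_matter([]) is False
assert paths_matter(["a.txt"]) is True

# The bug the old Iterable[str] signature advertised support for:
assert paths_matter(p for p in []) is True  # empty generator is still truthy

# any() gives the right answer but consumes the generator as a side effect:
gen = (p for p in ["a.txt"])
assert any(True for _ in gen)
assert list(gen) == []  # exhausted
```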

Key decisions:
- Sequence (not list) is the narrowest type that supports bool() and
  still accepts both call sites — both pass the materialized list[str]
  returned by _resolve_changed_paths, so no caller change is needed.
  Using list[str] would have over-constrained the contract for no
  caller benefit.
- The any(True for _ in ...) pattern is correct on the current
  list[str] callers (a list is iterable and re-iterable), but the
  Iterable[str] type signature advertised support for generators,
  where any(True for _ in g) consumes the generator and bool(g) on
  the same generator is unconditionally True. Narrowing the type +
  switching to bool() forecloses that latent bug rather than
  documenting around it.

Next-iteration notes:
- None. The two-arg .get(..., {}) form is now the convention in this
  module; future processors added under dashboard_templates/ should
  follow the same shape.

Closes #268

The tests/** ruff per-file-ignores already cover SLF001, so the
inline # noqa: SLF001 markers next to MockArqQueue._job_metadata
accesses were unused. Removing them lets ruff format reflow two
short call sites onto one line.

Replace ``_split_full_name`` and ``_repo_from_full_name`` (each
returning one half of a parsed ``owner/repo`` string) with a single
``_split_full_name_tuple`` returning ``tuple[str, str] | None``. The
two ``process()`` call sites destructure the tuple once with a
``(None, None)`` fallback so the existing ``or``-chain semantics
(prefer ``repository.owner.login``/``repository.name``, fall back to
the parsed full_name) are preserved without parsing the same
``full_name`` twice.

Key decisions:
- Destructure once at the top of the block into ``fallback_owner`` /
  ``fallback_repo`` rather than calling the helper inline twice. The
  inline form would have re-parsed the same ``"full_name"`` string
  for each half — the duplication that motivated the original two
  helpers and still motivates them under one name. The
  ``or (None, None)`` fallback keeps the call sites' ``or``-chain
  truthy-test on a plain ``str | None`` rather than an
  ``Optional[tuple]`` membership check.
- The helper signature stays ``full_name: object`` (not ``str | None``)
  to match the duck-typed payload-dict access pattern that the rest
  of ``process()`` uses on ``payload.get(...)`` /
  ``repo.get("full_name")``. Tightening to ``str | None`` would push
  ``isinstance``/cast noise into the caller for no caller benefit.
- The collapsed helper still validates ``"/"`` membership via
  ``isinstance(full_name, str) and "/" in full_name`` before
  splitting, so a missing or malformed ``full_name`` returns ``None``
  exactly as the prior pair did. ``str.split("/", 1)`` on a string
  containing ``"/"`` is guaranteed to yield two items, so the
  ``owner, repo = ...`` unpack is total.

Next-iteration notes:
- None. The module-level helper count drops from two to one and the
  call site shrinks by one line; no follow-up scaffolding is left.

Closes #270

PR #264 has unit-level coverage for both the truncation signal in
extract_changed_paths_from_push and the compare-API fallback in the
push processor, but neither path is exercised end-to-end through the
HMAC-validated webhook handler. Add two handler tests: a truncated
push (size=30, one in-payload commit limited to docs/) seeds a compare
response that does land inside the binding's root_path and asserts
exactly one dashboard_sync enqueue, and an empty-commits push
(size=0, commits=[]) asserts zero enqueues plus zero compare-API
requests via mock_github.router.calls.

Key decisions:
- Extended the existing _push_payload helper with explicit ``commits``
  and ``size`` overrides rather than constructing two ad-hoc payload
  dicts. The in-tree style (PushEventProcessor's own service tests)
  already exposes both knobs on a similar helper, and the helper
  caller still defaults to the one-commit / size=1 happy path that
  every existing test relies on. Mutating the existing keyword surface
  preserves the four older callers without churn.
- The empty-commits assertion checks ``mock_github.router.calls``
  instead of registering and asserting on a respx route handle. The
  ``seed_compare`` helper does not return its Route, and the issue's
  acceptance criterion is "compare route saw zero requests" — a
  call-list filter on ``"/compare/"`` in the URL path satisfies that
  with no signature change to ``seed_compare``. The assertion is
  scoped to compare-shaped URLs so it tolerates the always-mocked
  Repertoire-discovery and installation-token traffic that shares the
  same router.
- Truncated-push payload uses ``size=30`` (>20) with a single
  in-payload commit. Either ``size > len(commits)`` or
  ``len(commits) >= 20`` would trigger the fallback; ``size=30`` is
  the more readable signal and matches the shape used by the existing
  push_processor service tests so the two layers stay legible
  side-by-side. The in-payload commit deliberately reports only
  ``docs/index.md`` (with ``root_path="/"`` the distinction does not
  matter today, but under a future root-path restriction this
  asymmetry guarantees the compare call is what actually surfaces
  the template path).
- The empty-commits test uses ``root_path="/"`` (not a nested root)
  so the assertion isolates the empty-changed-path branch of
  ``_root_path_matches`` (``bool(changed_paths)`` is the gate) rather
  than the prefix branch. A nested root would also pass but would
  conflate two reasons for not enqueueing.

Next-iteration notes:
- None. The handler-level coverage now matches the service-level
  coverage for both truncation paths; rename-handler / installation-
  event tests landing under #240/#242 will follow the same pattern
  (``github_app_enabled`` fixture + ``count_jobs_by_name`` +
  ``mock_github.router.calls`` filter) without needing further
  helper plumbing.

Closes #271

Provides PushEventProcessor (#238 / PR #264) so this slice can
extend it with ID-preferential binding lookup. PR #264 is open
against main; once it merges first, this merge commit will
collapse to a no-op.
Capture GitHub's stable repo / owner / installation IDs on first
successful sync, write them to bindings + content rows, and have the
push event processor prefer ID-keyed binding lookup with a name-based
fallback for un-synced rows. Internal matching now survives a GitHub
repo or org rename without operator intervention; the public API
remains name-keyed and exposes the captured IDs in the binding GET
response for debugging.

Key decisions:
- ``InstallationAuth`` grew a required ``installation_id: int`` field
  rather than a separate ``get_installation_id`` round-trip from the
  syncer. The app client already resolves the installation id inside
  ``get_installation_auth`` and was discarding it; threading it onto
  the auth record is a no-cost change for production callers and lets
  the syncer write ``github_installation_id`` to the binding without
  doubling the GitHub round-trip count. The dataclass field is
  required (no default) so production code cannot accidentally
  construct an auth record without it; the test fixture's
  ``installation_auth`` helper sources it from the mock's seeded
  ``(owner, repo) -> id`` map (or ``default_installation_id``) so
  existing direct-construction tests keep their ergonomic shape.
- The tree fetcher gained a ``_fetch_repo`` step that calls
  ``GET /repos/{owner}/{repo}`` rather than overloading the existing
  ``commits/{ref}`` call, because the commits response does not
  include the repository's numeric ID and adding the field to its
  schema is GitHub's prerogative. Repo + owner IDs land on
  ``FetchedTree`` as required ``int`` fields rather than
  ``int | None`` because the syncer is the only consumer in this
  slice and it always wants both — leaving them ``Optional`` would
  have leaked ``None`` into the store-layer guard for no benefit.
  The ``raise_for_status`` chain catches a missing repo
  deterministically, so a real GitHub anomaly (e.g. installation lost
  mid-sync) surfaces as a recorded sync failure rather than a silent
  no-id capture.
- ``DashboardGitHubTemplateBindingStore`` exposes two new methods
  rather than re-purposing ``list_by_repo_ref``:
  ``list_by_repo_id_and_ref`` for the ID-keyed primary lookup
  (backed by ``idx_dashboard_github_template_bindings_repo_id_ref``
  from #240) and ``list_unsynced_by_repo_ref`` whose
  ``github_repo_id IS NULL`` filter encodes the rename-safety
  invariant directly. Without that filter, a binding whose name
  coincidentally matches a *different* repo now sitting at the old
  string would re-sync against the wrong upstream; pushing the
  filter into the store query (vs. dedup-after-the-fact) keeps the
  invariant load-bearing in code rather than only in tests.
  ``list_by_repo_ref`` stays in the API surface even though the
  push processor no longer calls it, because the binding store is
  a public storage primitive and external callers (admin tooling,
  future debug consoles) may legitimately want the unfiltered
  name-only view.
- ``PushEventProcessor._lookup_bindings`` unions the two store calls
  with a ``seen: set[int]`` dedup guard. The two queries are
  disjoint by construction (the unsynced filter excludes IDs the
  primary lookup might find), so the dedup is defense-in-depth — but
  the ``test_process_dedupes_id_and_name_matches`` test pins the
  guard so a future loosening of either query (e.g. dropping the
  ``IS NULL`` filter "for symmetry") cannot regress to double-
  enqueueing the same ``dashboard_sync`` job. ``_coerce_int`` rejects
  ``bool`` explicitly because ``isinstance(True, int)`` is ``True``
  in Python and a malformed payload that sent ``"id": true`` would
  otherwise leak ``1`` through as a real repo id.
- The store-layer "only assign if non-None" guard on
  ``update_sync_state`` and ``upsert`` (already pinned by tests in
  #240) is what protects populated IDs from being overwritten on a
  failed re-sync. The syncer now always passes populated IDs on the
  success path, so the guard's job is to fence the *failure* path
  (where ``_record_failure`` calls ``update_sync_state`` without ID
  kwargs). The new
  ``test_sync_does_not_overwrite_populated_ids_on_resync`` test pins
  the contract end-to-end so a future syncer refactor that started
  passing ``None`` explicitly cannot zero a binding's IDs.
- ``GitHubMock.seed_tree`` now auto-seeds the
  ``GET /repos/{owner}/{repo}`` endpoint with default ``(repo_id=1,
  owner_id=1)`` so existing tree-fetcher tests that do not care
  about ID capture keep working with a single ``seed_tree(...)``
  call. Tests that round-trip captured IDs into the database pass
  ``repo_id=`` / ``owner_id=`` to ``seed_tree`` and the auto-call
  uses those instead. ``seed_repo`` is exposed separately for tests
  that bypass ``seed_tree`` (the syncer's GitHub-error fixture
  needs the repo endpoint mocked but the ref endpoint to return
  404, so it calls ``seed_repo`` then mocks ``commits/{ref}``
  directly).

Next-iteration notes:
- Slice #242 ("rename + installation webhooks") will consume the
  ``github_owner_id`` / ``github_repo_id`` columns this slice now
  populates. When a ``repository.renamed`` event arrives, the
  handler can match the binding by stable ID (the rewrite target)
  and update ``github_owner`` / ``github_repo`` strings without
  re-syncing content. This slice intentionally does not register
  any rename-event handlers — name-keyed pushes still match
  un-synced bindings and ID-keyed pushes still match synced ones,
  so the system is functional without rename handlers; they're a
  staleness-fix, not a correctness requirement.
- The blocker merge of ``tickets/DM-54689-github-webhook`` (PR
  #264) into this branch is a no-op once #264 lands on main first.
  GitHub auto-merge will flatten the merge commit. If #264 is
  rebased instead of merged cleanly, this branch needs a manual
  re-merge.
- ``InstallationAuth.installation_id`` is now required, but the
  GitHub App secrets remain optional on ``Factory`` /
  ``WorkerFactoryBuilder``. The optionality is a feature-flag for
  the dashboard-template-from-github feature; once the feature
  graduates from optional, the three GitHub App fields can be
  tightened to non-Optional and ``startup`` raised earlier — the
  ``InstallationAuth`` change has no bearing on that decision.
- The ``DashboardTemplateBinding`` client model now exposes the
  three numeric ID fields as informational. If a future client
  wants to assert on rename-robustness from the outside, it can
  observe ``github_repo_id`` flipping from ``null`` to populated
  after the first successful sync — no new endpoint needed.

Closes #241
Development

Successfully merging this pull request may close these issues:

DM-54689: Capture GitHub numeric IDs on sync + match by ID in push processor