chore: sync from agents-private#3224
* fix(open-knowledge): unblock bridge oversized PRs and mirror PR creation

  Three distinct sync-pipeline failures landed in the past day on the OK side. All block real CI runs.

  1. Public PR bridge fails on oversized diffs. GitHub's diff endpoint hard-caps at 20,000 lines. inkeep/open-knowledge#377 tripped this with `GET /repos/.../pulls/377 failed (406): diff exceeded the maximum number of lines (20000)`. Same code path would break the agents and agents-optional-local-dev bridges for any sufficiently long-running branch.

     Fix: detect the size error in `isDiffTooLargeError` and fall back to `git fetch` + `git diff` inside a throwaway bare repo. 3-dot diff matches the API's `.diff` semantics; blob SHAs remain content-identical to agents-private (Copybara 1:1 mirroring), so `git apply --3way` resolves them locally with no apply-path change.

  2. Pre-cutover branches re-introduce internal-only paths. Old `inkeep/open-knowledge` branches predate the cutover and carry `specs/`, `reports/`, `.codex/`, etc. that the public mirror no longer exports. Bridging them back applied those paths under `public/open-knowledge/` against the source-of-truth copies on agents-private.

     Fix: bridge reads `BRIDGE_EXCLUDED_PATHS` (JSON array of public-repo path prefixes) from the workflow env and drops matching diff sections in `filterDiffByPath` before applying. Open Knowledge workflow sets the canonical pre-cutover list. Other bridges default to no filtering (backward-compatible).

  3. OK Copybara sync branch can't be PR'd: no common history with main. Copybara's OK migrate uses `--init-history`, which seeds `copybara/sync` from a detached root. GitHub refuses `gh pr create` with `no history in common with main`. Surfaced immediately after #334 fixed the comment-stripping pipeline.

     Fix: a "Reseat open knowledge sync branch on main" step runs after Copybara migrate.
     It clones inkeep/open-knowledge shallowly, reads the tree at copybara/sync, replays it as a single commit on top of main (preserving Copybara's commit message and GitOrigin-RevId footer), and force-pushes. Skipped when the tree already matches main (deletes the branch, since there is nothing to PR).

  Bridge scripts kept code-shape aligned across the three siblings. Quote style and agents-optional-local-dev's reconcileMonorepoPatches divergence preserved per the existing convention.

  Runbook entries added for all three failure modes.

* fix(bridge): pre-fetch public PR refs to fix --3way missing-blob errors

  Diligence on real CI failures (#411, #396, #374) showed the bridge's dominant failure mode wasn't oversized diffs but a different one:

      error: repository lacks the necessary blob to perform 3-way merge.
      error: patch failed: public/open-knowledge/THIRD_PARTY_NOTICES.md:3321
      error: public/open-knowledge/bun.lock: patch does not apply

  Root cause: `git apply --index --3way` reads the patch's `index` lines (blob SHAs from the public repo's PR-base side) and looks them up in agents-private's object store. When public-mirror-sync is stalled, public main has blobs that haven't yet been mirrored to agents-private. The patch references those blobs; the local store doesn't have them; --3way fails.

  This is downstream of any mirror-sync failure: the bridge becomes broken whenever sync stalls. Mirror sync had been stalled for hours before the fix in this PR landed, and several PRs piled up on this exact error.

  Fix: syncPublicPr now adds a temporary `bridge-public-<num>` remote pointing at the public repo and fetches `+refs/pull/<num>/head` and `+refs/heads/<base>` into agents-private's clone before applying the patch. `--3way` then resolves the patch's base blobs locally regardless of mirror staleness. The fetch is torn down in a `finally` so subsequent runs (or retries) start clean.
  The same fetched refs also serve as the source for the local-git-diff fallback (replaces the temp-bare-repo approach in the previous commit; simpler and shares blobs with the apply step).

  Validated against fixture tests in /tmp/ok-diligence/bridge-test*.sh:

  - v2 reproduces the missing-blob error without the fetch (matches #411 logs verbatim) and confirms it's gone with the fetch.
  - v3 confirms in-sync new-file scenarios still apply cleanly with no regression, and the local-git-diff fallback against the fetched refs produces the expected 3-dot diff.

  Applied identically to all three sibling bridge scripts (OK overlay, agents, agents-optional-local-dev).

  Runbook entry added under "Open Knowledge subtree failures".

* ci(open-knowledge): add agents-private PR validation workflow

  Every other subtree has a *-validation.yml on agents-private. OK was missed during the monorepo migration, so OK-only PRs merged without lint, typecheck, unit/integration/conversion/fidelity, or Playwright signal until Copybara mirrored to inkeep/open-knowledge post-merge.

  Mirrors public/open-knowledge/.github/workflows/ci.yml 1:1: lint job + 5-task test matrix + Playwright on ubuntu-64gb, path-scoped to public/open-knowledge/**. Public-repo ci.yml keeps running unchanged on push-to-main and bridged PRs (additive parity, not a move).

  Runbook entry added under "Open Knowledge subtree failures".

* fix(bridge): address claude+pullfrog review on PR #335

  Four findings, all addressed:

  1. (Minor) `gh api` branch check in the reseat step swallowed auth/network errors as "branch not found", silently skipping the reseat when a real failure (401/403/5xx) deserves a loud red workflow. Now captures stdout+stderr separately and explicitly distinguishes HTTP 404 (expected, exit 0) from anything else (`::error::` and exit 1).

  2. (Consider) Token leak via `run()` error fallback.
     The bridge's error wrapper appends `args.join(' ')` when stderr+stdout are both empty; one of the args is the public-repo URL with the x-access-token credential. Added `sanitizeErrorMessage` that redacts `https://x-access-token:.+@` to `https://x-access-token: ***@` in every error path (stderr, stdout, fallback). Especially important for the agents-optional-local-dev variant, which posts `error.message` verbatim into a public-facing GitHub PR comment on patch-apply failure.

  3. (Consider) Cleanup trap for the reseat step's mktemp dir. Added `trap 'rm -rf -- "$WORK_DIR"' EXIT INT TERM`. Cosmetic on GitHub-hosted runners (filesystem destroyed post-job) but correct discipline for self-hosted parity.

  4. (Pending observations from pullfrog)

     a. `isDiffTooLargeError` regex was too broad: bare `too_large` could match unrelated 422s (PR body length validation, etc.). Tightened to `diff exceeded the maximum number of lines | diff is too large | diff_too_large` only. Validated that `too_long` (PR body) and bare `too_large` no longer match.

     b. `--depth=2000` could be insufficient for very long-running branches whose merge-base lies deeper. Replaced with a 2-step ladder (10000 then 50000) so "no merge base" errors become loud retries with deeper history before giving up.

  All three sibling bridge scripts kept code-shape aligned. Single-quote vs double-quote style preserved per the existing convention.

* fix(bridge): address claude review on PR #335 (round 2)

  Four findings from the latest review, all addressed:

  1. (Minor) `execFileSync` default `maxBuffer` of 1 MB would truncate the local-git-diff fallback, the very path designed for >20,000 line PRs, which routinely produce 1.6+ MB of diff output. Bumped `fetchPullRequestDiffViaLocalGit` to `maxBuffer: 50 * 1024 * 1024`. Real bug: without this, the fallback would throw `ERR_CHILD_PROCESS_STDIO_MAXBUFFER` for almost every PR that reaches it.

  2.
     (Minor) `public-open-knowledge-validation.yml` was added on this branch but missing from CI.md's "PR validation (required checks)" table and the "Private-only workflows" table. Added rows to both. CI.md is the canonical workflow map, and the omission would have left engineers (and agents) thinking OK had no agents-private CI coverage.

  3. (Consider) Cascading failure: if the public-PR-refs fetch step failed with only a warning and the API then also rejected the PR as too large, the local-git-diff fallback would try to diff against refs that were never fetched and produce an opaque "unknown revision" error with no breadcrumbs back to the original fetch failure. Now syncPublicPr tracks `refsFetched`, threads it into `fetchPullRequestDiff`, and the fallback path throws a clear error pointing at the earlier warning when it can't proceed.

  4. (While You're Here) The OK and agents bridge copies emitted a generic "Patch application failed. The diff could not be applied cleanly." comment on apply failures, while agents-optional-local-dev included `error.message` in a code block. Aligned all three to the more useful form. Safe because `run()` sanitizes the x-access-token URL out of error messages, so the public-facing comment can never leak the credential.

  The five "Pending" items from the review (gh-api branch check, token in run() error, temp-dir cleanup, too_large regex breadth, depth=2000) were already addressed in commit b6d48f1bd; the bot was reviewing the prior commit. They should clear on the next bot pass.

  All three sibling bridge scripts kept code-shape aligned. Quote style preserved per existing convention.

  GitOrigin-RevId: 934ff381fe7ecb8245acaeb05ec80f8e5f72e787
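
The two string-level fixes from the first review round (the tightened `isDiffTooLargeError` match and the `sanitizeErrorMessage` redaction) can be sketched roughly as below. This is a minimal illustration, not the code from the bridge scripts. The function names come from the commit message, but the signatures, the exact regular expressions, and the `***` replacement without a space are assumptions. In particular, the credential pattern here uses `[^@\s]+` instead of the `.+` quoted above so that it stops at the first `@`.

```typescript
// Hypothetical sketch of two helpers named in the commit message.
// Signatures and regex details are assumptions, not the bridge's source.

// Matches only the three diff-size phrases listed in the commit message,
// so a bare "too_large" or "too_long" from an unrelated 422 is ignored.
const DIFF_TOO_LARGE =
  /diff exceeded the maximum number of lines|diff is too large|diff_too_large/;

function isDiffTooLargeError(message: string): boolean {
  return DIFF_TOO_LARGE.test(message);
}

// Redacts the token in any https://x-access-token:<token>@ URL.
// Assumption: [^@\s]+ is used so the match ends at the first "@".
function sanitizeErrorMessage(message: string): string {
  return message.replace(
    /https:\/\/x-access-token:[^@\s]+@/g,
    "https://x-access-token:***@",
  );
}
```

Under these assumptions, a caller would pass every error string (stderr, stdout, or the `args.join(' ')` fallback) through `sanitizeErrorMessage` before logging it or posting it in a PR comment.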
Automated approval from agents-private public-mirror-sync (run: https://github.com/inkeep/agents-private/actions/runs/25233856378). Source of truth is the monorepo; direct edits on inkeep/agents are overwritten on next sync.
Automated sync from agents-private via Copybara mirror.