Skip to content

Harden the package-publish pipeline against stuck/corrupted PRs#11423

Open
borisschlosser wants to merge 2 commits into
masterfrom
boris/harden-publish-pipeline
Open

Harden the package-publish pipeline against stuck/corrupted PRs#11423
borisschlosser wants to merge 2 commits into
masterfrom
boris/harden-publish-pipeline

Conversation

@borisschlosser

Copy link
Copy Markdown
Contributor

Follow-up to #11410. After megaport@1.10.1 stacked 24 failing publish PRs (and ~25 auto-filed failure issues) over two weeks from a single malformed Hugo shortcode delimiter, these three independent changes harden the pipeline so it can't happen the same way again. Each is small and self-contained.

P1 — One PR per provider instead of one per run

new-provider-version-pr embedded ${{ github.run_id }} in the branch name, so every publish run opened a fresh branch + PR. A persistently failing publish therefore piled up indefinitely (24 megaport PRs).

Switch to a stable per-provider branch (<provider>/publish-metadata). peter-evans/create-pull-request then updates the existing PR each run, so there's at most one open publish PR per provider and a fixed/newer render auto-replaces a broken one.

P2 — Fail fast on malformed delimiters, everywhere

A single {{ < / {{ % (stray space) aborts the entire Hugo site build with a cryptic shortcode error — and the existing front-matter linter skips the auto-generated provider _index.md files (those carrying the # WARNING: this file was fetched from header), which are exactly the ones that get corrupted.

New scripts/lint/check-shortcode-delimiters.js scans every content markdown file for malformed opening delimiters and fails with a clear file:line message. Wired into make lint-markdown, so the existing Lint Markdown CI job runs it on every PR. The matcher mirrors resourcedocsgen's sanitizeShortcodeDelimiters and is unit-tested with node:test (6 cases, incl. a Go-template {{ .Value }} negative case).

P3 — Avoid stale CDN edges on fetch

The original corruption reached us via a stale CloudFront edge serving a pre-re-render object even after the origin was fixed. readRemoteFile now sends Cache-Control: no-cache / Pragma: no-cache so intermediary caches revalidate. (CloudFront may ignore client no-cache depending on config, so this is a low-cost mitigation on top of the #11410 guard, not a guarantee.)

Tests

  • cmd: TestReadRemoteFileSendsNoCacheHeaders (httptest asserts the request headers) + existing guard tests — full resourcedocsgen suite green; go vet/gofmt clean.
  • scripts/lint: node --test check-shortcode-delimiters.test.js → 6/6; make lint-shortcode-delimiters green on the current tree (819 files, 0 malformed) and exits 1 on an injected malformed file.

🤖 Generated with Claude Code

Follow-up to #11410 (which sanitizes malformed Hugo shortcode delimiters on
fetch). Three independent hardening changes:

P1: One PR per provider instead of one per run.
  new-provider-version-pr used a per-run branch name embedding ${github.run_id},
  so every publish run opened a brand-new branch and PR. Failing publishes piled
  up (e.g. 24 stacked megaport PRs over two weeks). Use a stable per-provider
  branch so peter-evans/create-pull-request UPDATES the existing PR; a fixed or
  failed render is auto-replaced by the next run.

P2: Fail fast on malformed delimiters, everywhere.
  scripts/lint/check-shortcode-delimiters.js scans every content markdown file
  (including the auto-generated provider _index.md pages, which the front-matter
  linter skips) for "{{ <" / "{{ %" and fails with a clear file:line error instead
  of a cryptic whole-site Hugo build abort. Wired into `make lint-markdown` so the
  existing "Lint Markdown" CI job runs it. Unit-tested with node:test.

P3: Avoid stale CDN edges on fetch.
  readRemoteFile now sends Cache-Control/Pragma: no-cache so intermediary caches
  revalidate. The original corruption reached us via a stale CloudFront edge that
  served a pre-re-render object; this reduces the chance of fetching stale docs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@claude

claude Bot commented Jun 17, 2026

Copy link
Copy Markdown

Registry Review — PR #11423

Thanks for the thorough write-up and tests. This is a well-scoped, defense-in-depth follow-up to #11410, and all three changes are independently sound. Nothing blocking — a few minor observations below.

What looks good

  • P2 lint script (scripts/lint/check-shortcode-delimiters.js) is focused and well-commented. The malformed-delimiter regex correctly excludes Go-template prose like {{ .Value }}, and the negative test case guards against regressions. Wiring it as a prerequisite of lint-markdown (Makefile:15) means it runs in the existing CI job for free — good reuse.
  • P3 no-cache headers (tools/resourcedocsgen/cmd/metadata.go:428-432) are correctly placed before AddGitHubAuthHeaders, and the httptest-based assertion in cache_test.go is the right way to verify request-side headers. The comment honestly notes CloudFront may ignore client no-cache, framing it as a low-cost mitigation rather than a guarantee.
  • P1 stable branch (action.yml:33) is the right fix for the "one PR per run" pileup, and relying on peter-evans/create-pull-request's update-in-place behavior is idiomatic.

Observations (non-blocking)

1. Automerge is only (re)enabled on created, not updated (action.yml:34). With the old run-id branch, every run produced pull-request-operation == created, so automerge was re-enabled each time. With the stable branch, only the first run for a provider hits the created path; subsequent re-renders produce updated and skip the Set Automerge step. This appears to work out because automerge, once enabled on a PR, persists across branch force-pushes — so a fixed render pushed to an already-open PR will still auto-merge when checks pass. Worth confirming that assumption holds for your repo merge settings, since the "fixed render auto-replaces a broken one" guarantee depends on it. (No change needed if confirmed.)

2. collectMarkdownFiles will throw if the content dir is absent. check-shortcode-delimiters.js calls fs.readdirSync(contentDir, ...) with no guard. If themes/default/content is ever missing (e.g. an unusual CI checkout), the script exits non-zero with a raw stack trace rather than a clear message. In practice the dir always exists, so this is purely cosmetic robustness — an early if (!fs.existsSync(dir)) return out; would make the failure mode clearer.

3. BUILD-AND-DEPLOY.md currency. This PR adds a new make lint-shortcode-delimiters target and changes the lint-markdown dependency chain. Per the review criteria, Makefile changes should prompt a check of BUILD-AND-DEPLOY.md — consider whether the lint section there should mention the new target (likely a one-line addition, if at all).

4. Trivial: test naming. In check-shortcode-delimiters.test.js, the case named "flags a malformed closing delimiter" actually feeds a malformed opening delimiter on a closing tag. The assertion is correct; only the label is slightly off.


Overall this is clean, well-tested, and ready once the automerge persistence assumption (1) is confirmed. Mention me (@claude) if you'd like another pass or want me to draft any of the above changes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions

Copy link
Copy Markdown
Contributor

Your site preview for commit 99916e8 is ready! 🎉

http://registry--origin-pr-11423-99916e83.s3-website.us-west-2.amazonaws.com/registry.

@borisschlosser borisschlosser requested a review from a team June 17, 2026 11:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant