Fail the build on persistent HF rate limiting #12

Draft
gary149 wants to merge 2 commits into master from harden-prerender-fail-loud

gary149 commented May 30, 2026

Problem

The site is fully prerendered (adapter-static) and deploys to GitHub Pages. Model pages fetch quant/author data from the HF API at build time. Until now, every HF failure path ended in a green build with empty data:

  • main models query 429/5xx -> res.ok false -> repos: []
  • author lookup 429 -> fetchAuthorInfo returns null -> repos filtered out
  • any thrown error -> swallowed by try/catch -> repos: []

So if the CI HF_TOKEN ever expires, or HF throttles, the deploy silently overwrites the live site with blank pages, with a green check and no alert. This is exactly the original incident (unauthenticated calls hit the rate limit and pages built empty).
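The failure paths listed above share one shape: every error collapses into the same value as a legitimate empty result. A minimal sketch of that pattern follows. It is a hypothetical reconstruction, not the repository's code; `loadSwallowing`, `HttpResponse`, and `FetchFn` are names invented for illustration.

```typescript
// Hypothetical reconstruction of the old "swallow everything" pattern.
// None of these names come from the PR.
interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}
type FetchFn = (url: string) => Promise<HttpResponse>;

async function loadSwallowing(url: string, fetcher: FetchFn): Promise<{ repos: unknown[] }> {
  try {
    const res = await fetcher(url);
    // A 429/5xx response is reported exactly like "no results".
    if (!res.ok) return { repos: [] };
    return { repos: (await res.json()) as unknown[] };
  } catch {
    // A thrown error (network failure, bad JSON) is also reported as "no results".
    return { repos: [] };
  }
}
```

With this shape, the prerenderer cannot tell a throttled request from a model that genuinely has no quants, so the build stays green in both cases.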

Change

Add hfFetch, an authenticated HF API fetch that retries on 429/5xx (3 attempts, 1s/2s/3s backoff) and throws if throttling persists. Route the models query and both author-overview lookups through it, and remove the swallowing try/catch in load. A persistent throttle now throws out of load, which fails the prerender and vite build.
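For readers who have not opened the diff, here is a rough sketch of what a retrying, authenticated fetch of this kind can look like. Only the name `hfFetch`, the 429/5xx retry rule, the 3 attempts, and the 1s/2s/3s delays come from the description above; the option names, the injectable `fetchImpl` and `sleep`, and the error message are assumptions. The sketch waits only between attempts, so with 3 attempts the third delay is used only if `attempts` is raised; the actual implementation may differ.

```typescript
// Hypothetical sketch, not the PR's implementation. Everything except the
// name hfFetch and the retry policy described in the PR is an assumption.
interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}
type FetchLike = (
  url: string,
  init?: { headers?: Record<string, string> },
) => Promise<HttpResponse>;

interface HfFetchOptions {
  token?: string; // assumed to be read from HF_TOKEN by the caller
  attempts?: number; // total attempts, not retries
  backoffMs?: number[]; // delay after the 1st, 2nd, ... failed attempt
  fetchImpl?: FetchLike; // injectable so tests need no network
  sleep?: (ms: number) => Promise<void>;
}

async function hfFetch(url: string, opts: HfFetchOptions = {}): Promise<HttpResponse> {
  const {
    token,
    attempts = 3,
    backoffMs = [1000, 2000, 3000],
    fetchImpl = fetch,
    sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)),
  } = opts;
  const headers: Record<string, string> = token ? { Authorization: `Bearer ${token}` } : {};

  let lastStatus = 0;
  for (let attempt = 0; attempt < attempts; attempt++) {
    const res = await fetchImpl(url, { headers });
    const retryable = res.status === 429 || res.status >= 500;
    // Success and non-retryable statuses (e.g. 404) go back to the caller as-is.
    if (!retryable) return res;
    lastStatus = res.status;
    if (attempt < attempts - 1) {
      const delay = backoffMs[Math.min(attempt, backoffMs.length - 1)] ?? 0;
      await sleep(delay);
    }
  }
  // Throttling persisted across every attempt: fail loudly instead of returning.
  throw new Error(`HF API still returning ${lastStatus} after ${attempts} attempts: ${url}`);
}
```

Injecting `fetchImpl` and `sleep` is a design choice made for this sketch so the retry policy can be exercised without real requests or real waiting.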

Because the deploy workflow runs npm run build before the Upload/Deploy steps (no if: always()), a failed build skips deploy entirely, so GitHub Pages keeps serving the last successful deployment. A genuine 200 with no results still builds fine, so there is no false alarm on a model that legitimately has no quants.
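The distinction between a failed request and a genuine empty result can be sketched as a loader with no try/catch. This is an illustration under assumptions, not the PR's `load`: `loadStrict`, `HttpResponse`, and `FetchFn` are hypothetical names, and treating every non-OK status as fatal is an assumption of the sketch.

```typescript
// Hypothetical sketch of a "fail loud" loader; names are illustrative only.
interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}
type FetchFn = (url: string) => Promise<HttpResponse>;

async function loadStrict(url: string, fetcher: FetchFn): Promise<{ repos: unknown[] }> {
  // No try/catch: anything the fetcher throws propagates and fails the prerender.
  const res = await fetcher(url);
  if (!res.ok) {
    // Assumption: any non-OK status is treated as a build failure.
    throw new Error(`HF API request failed with ${res.status}: ${url}`);
  }
  const body = await res.json();
  if (!Array.isArray(body)) {
    throw new Error(`Unexpected HF API response shape for ${url}`);
  }
  // A 200 with an empty array is a valid answer and builds an empty page.
  return { repos: body };
}
```

An empty array from a 200 response still resolves normally here, which matches the "no false alarm" case described above.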

Behavior

| Build condition | Before | After |
| --- | --- | --- |
| Token healthy | Deploys correct pages | Deploys correct pages (retry also absorbs transient bursts) |
| Sustained 429/5xx, or missing token | Green build, empty pages deployed | Build fails, deploy skipped, prod unchanged |
| Model genuinely has 0 quants | Passes, empty page | Passes, empty page (unchanged) |

Verification

  • Authenticated build (HF_TOKEN=... npm run build): exits 0, all 6 pages populated (Qwen3.6-27B 6, Qwen3.6-35B-A3B 6, gemma-4-26B-A4B 5, Gemma-4-E4B 5, gpt-oss-20b 6, Step-3.7-Flash 2).
  • Forced-throttle test: temporarily made hfFetch throw; npm run build exited 1 with Error: 500 /models/... and did not write the site, confirming a throttle fails the build instead of shipping blanks. Reverted afterwards.
  • npm run check, prettier --check, and npm run test:unit all pass.

Depends on the HF_TOKEN repo secret (already configured). No workflow change needed.

gary149 added 2 commits May 30, 2026 15:22
Add hfFetch: an authenticated HF API fetch that retries on 429/5xx and
throws if throttling persists. Route the models query and both author
overview lookups through it, and drop the swallowing try/catch in load
so a throttled prerender fails the build instead of silently baking
empty pages. A failed build skips the deploy step, so prod keeps the
last successful site. Genuine 200-with-no-results still builds fine.

julien-c left a comment

imo setting a HF_TOKEN like you've done should already solve it
