Fail the build on persistent HF rate limiting by gary149 · Pull Request #12 · ggml-org/llama.pages

gary149 · 2026-05-30T13:24:07Z

Problem

The site is fully prerendered (adapter-static) and deploys to GitHub Pages. Model pages fetch quant/author data from the HF API at build time. Until now, every HF failure path ended in a green build with empty data:

main models query 429/5xx -> res.ok false -> repos: []
author lookup 429 -> fetchAuthorInfo returns null -> repos filtered out
any thrown error -> swallowed by try/catch -> repos: []

So if the CI HF_TOKEN ever expires or HF throttles, the deploy silently overwrites the live site with blank pages, showing a green check and raising no alert. This is the same failure mode as the original incident, where unauthenticated calls hit the rate limit and pages built empty.
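To make the failure paths above concrete, here is a rough sketch of the swallowing pattern. It is not the repository's actual load code: loadReposSwallowing, HttpResponse and FetchLike are hypothetical names, and the fetch function is injected so the sketch runs without a network.

```typescript
// Minimal response shape so the sketch does not depend on DOM or Node typings.
interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

type FetchLike = (
  url: string,
  init?: { headers: Record<string, string> }
) => Promise<HttpResponse>;

// Hypothetical stand-in for the old behavior: every failure path ends in repos: [].
async function loadReposSwallowing(
  url: string,
  fetchImpl: FetchLike
): Promise<{ repos: unknown[] }> {
  try {
    const res = await fetchImpl(url);
    // A 429 or 5xx is indistinguishable from "this model has no quants".
    if (!res.ok) return { repos: [] };
    const data = await res.json();
    return { repos: Array.isArray(data) ? data : [] };
  } catch {
    // Thrown errors (network failure, bad JSON) are swallowed as well.
    return { repos: [] };
  }
}
```

In this sketch a throttled response, a thrown error, and a genuine empty result all return the same value, so the prerender has nothing to fail on.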

Change

Add hfFetch, an authenticated HF API fetch that retries on 429/5xx (3 attempts, 1s/2s/3s backoff) and throws if throttling persists. Route the models query and both author-overview lookups through it, and remove the swallowing try/catch in load. A persistent throttle now throws out of load, which fails the prerender and vite build.
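A minimal sketch of what such a helper could look like follows. It is an illustration under assumptions, not the code in this PR: the options object, the injected fetchImpl and sleep, and the reading of "3 attempts" as three requests in total (waiting 1s, then 2s, between them) are all guesses. Responses with other statuses are handed back to the caller untouched, which is also an assumption.

```typescript
// Minimal response shape so the sketch does not depend on DOM or Node typings.
interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

type FetchLike = (
  url: string,
  init?: { headers: Record<string, string> }
) => Promise<HttpResponse>;

// Hypothetical options; the real helper presumably reads HF_TOKEN itself.
interface HfFetchOptions {
  token?: string; // assumed to come from the HF_TOKEN secret
  fetchImpl: FetchLike; // injected here; the build would use the global fetch
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function hfFetch(url: string, opts: HfFetchOptions): Promise<HttpResponse> {
  const sleep = opts.sleep ?? defaultSleep;
  const maxAttempts = 3;
  const headers: Record<string, string> = {};
  if (opts.token) headers["Authorization"] = `Bearer ${opts.token}`;

  let lastStatus = 0;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const res = await opts.fetchImpl(url, { headers });
    const throttled = res.status === 429 || res.status >= 500;
    if (!throttled) return res; // success or a non-retryable status
    lastStatus = res.status;
    // Linear backoff between attempts: 1s after the first, 2s after the second.
    if (attempt < maxAttempts) await sleep(attempt * 1000);
  }
  // Persistent throttling: throw so the caller (and the prerender) fails.
  throw new Error(`HF API still returning ${lastStatus} after ${maxAttempts} attempts: ${url}`);
}
```

Injecting fetchImpl and sleep is a choice made for this sketch so the retry path can be exercised without real requests or real delays.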

Because the deploy workflow runs npm run build before the Upload/Deploy steps (no if: always()), a failed build skips deploy entirely, so GitHub Pages keeps serving the last successful deployment. A genuine 200 with no results still builds fine, so there is no false alarm on a model that legitimately has no quants.
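The distinction between a throttle and a genuine empty result can be sketched as below. This is a hypothetical shape for load without the swallowing try/catch, not the PR's code: loadRepos and ThrowingFetch are made-up names, the injected function is assumed to behave like hfFetch (throwing on persistent 429/5xx), and throwing on any other non-ok status is an extra assumption of this sketch.

```typescript
// Minimal response shape so the sketch does not depend on DOM or Node typings.
interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

// Assumed to throw on persistent throttling, like hfFetch.
type ThrowingFetch = (url: string) => Promise<HttpResponse>;

// Hypothetical load body: no try/catch, so any error reaches the prerender.
async function loadRepos(
  url: string,
  fetchOrThrow: ThrowingFetch
): Promise<{ repos: unknown[] }> {
  const res = await fetchOrThrow(url); // a persistent throttle throws here
  if (!res.ok) {
    // Assumption: other failures should also fail loudly rather than look empty.
    throw new Error(`HF API responded ${res.status} for ${url}`);
  }
  const data = await res.json();
  // A genuine 200 with no results is valid data, not an error.
  return { repos: Array.isArray(data) ? data : [] };
}
```

Only the error paths throw in this sketch, so a model that legitimately has no quants still produces an empty page and a passing build.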

Behavior

build condition	before	after
token healthy	deploys correct pages	deploys correct pages (retry also absorbs transient bursts)
sustained 429/5xx, or missing token	green build, empty pages deployed	build fails, deploy skipped, prod unchanged
model genuinely has 0 quants	passes, empty page	passes, empty page (unchanged)

Verification

Authenticated build (HF_TOKEN=... npm run build): exits 0, all 6 pages populated (Qwen3.6-27B 6, Qwen3.6-35B-A3B 6, gemma-4-26B-A4B 5, Gemma-4-E4B 5, gpt-oss-20b 6, Step-3.7-Flash 2).
Forced-throttle test: temporarily made hfFetch throw; npm run build exited 1 with Error: 500 /models/... and did not write the site, confirming a throttle fails the build instead of shipping blanks. Reverted afterwards.
npm run check, prettier --check, and npm run test:unit all pass.

Depends on the HF_TOKEN repo secret (already configured). No workflow change needed.

Add hfFetch: an authenticated HF API fetch that retries on 429/5xx and throws if throttling persists. Route the models query and both author overview lookups through it, and drop the swallowing try/catch in load so a throttled prerender fails the build instead of silently baking empty pages. A failed build skips the deploy step, so prod keeps the last successful site. Genuine 200-with-no-results still builds fine.

julien-c

imo setting a HF_TOKEN like you've done should already solve it

gary149 added 2 commits May 30, 2026 15:22

Shorten hfFetch comment

ed9464b

julien-c reviewed May 30, 2026

gary149 wants to merge 2 commits into master from harden-prerender-fail-loud
