🧭 Quick Return to Map
You are in a sub-page of Cloud_Serverless.
To reorient, go back here:
- Cloud_Serverless — scalable functions and event-driven pipelines
- WFGY Global Fix Map — main Emergency Room, 300+ structured fixes
- WFGY Problem Map 1.0 — 16 reproducible failure modes
Think of this page as a desk within a ward.
If you need the full triage and all prescriptions, return to the Emergency Room lobby.
A practical runbook to rotate API keys, signing secrets, and JWKS without breaking RAG pipelines or agent flows. Implements dual-accept windows, kid tagging, cache fences, and rollback so rotations are invisible to users and stable under load.
- Incoming 401/403 spikes during deploys or scale-out.
- Webhook verification fails after a provider rotates its signing secret.
- Mixed keys across regions or functions after a partial rollout.
- Long-lived workers holding stale secrets in memory.
- SOC or policy requires periodic rotation with audit.
- Boot ordering and safe deploy windows: Bootstrap Ordering
- Avoid state loops during changeover: Deployment Deadlock
- First-call crashes after release: Pre-Deploy Collapse
- Lock tool schemas and reduce header bloat: Prompt Injection
- Trace and prove what snippet was used: Retrieval Traceability · Contract payloads: Data Contracts
- Live probes and rollback drills: Live Monitoring for RAG · Debug Playbook
- Zero user-visible errors during the rotation window. No p95 increase.
- 0 auth failures in synthetic probes across all regions and paths.
- Old secret fully revoked within the planned TTL and verified by logs.
- For RAG: no citation loss and ΔS(question, retrieved) ≤ 0.45 on the probe set after rotation.
Always rotate with two keys in play: the current key used for signing and a next key accepted for verification. Tag every credential with a key id so validation does not guess.
Key rules
- Always include a stable kid on every token or signature.
- Verify against
active_kids = {current, next}. Sign withcurrentonly. - Flip in one step by promoting
next → currentwhen error rate is flat. - Revoke the old key after caches drain and long jobs finish.
Minimal KV schema
{
"secrets_epoch": "2025-08-27T12:00:00Z",
"active": { "kid": "k_2025_08a", "secret_ref": "sm://wfgyp/prod/k_2025_08a" },
"next": { "kid": "k_2025_09a", "secret_ref": "sm://wfgyp/prod/k_2025_09a" },
"accept_kids": ["k_2025_08a", "k_2025_09a"],
"revoke_after_s": 259200,
"notes": "roll weekly; accept_kids must be length 2 max"
}HTTP header examples
Authorization: Bearer <access_token_with_kid>
X-Key-Id: k_2025_08a
For JOSE tokens, put kid in the header. For HMAC signatures, include X-Key-Id.
T-24h
- Create
nextin your secret manager. - Update
accept_kids = {current, next}. - Push config to all regions and functions. Do not sign with next yet.
T-0
- Start signing with
current. Keep accepting both. - Warm all regions. Run probes for 10 minutes.
- Flip signer to
nextwhen probes are flat.
T+1h to T+72h
- Keep accepting both while streams drain and long jobs finish.
- Watch auth errors by
kid. - After TTL, remove
currentfromaccept_kidsand revoke it in the secret manager.
Rollback
- If errors rise after the flip, restore signer to the previous
current. Leave accept set unchanged.
- Serverless functions may cache secrets in memory. Read secrets on each invocation or attach an
If-Modified-Sincecheck with a short TTL. - Edge and CDN caches can pin config. Add
secrets_epochto cache keys or use a config version header. - For JWKS, publish both public keys and set short cache-control. Clients must refresh on
kidmiss.
Open: Edge Cache Invalidation
- Accept two signatures. Verify first with
kidif provided. If not, attempt both only inside the rotation window. - Return a 2xx with a deprecation header to prompt senders to upgrade keys.
- Log the
kidused for every valid request and alert when legacykiddrops below a threshold.
- Inject provider API keys from a secret manager at request time. Never bake into build artifacts.
- Keep a per-provider
accept_kidsand test with small traffic shadow before promoting. - If a provider rotates without
kid, schedule a short freeze window and fall back to a secondary provider if available.
Related ops: Bootstrap Ordering · Pre-Deploy Collapse
- Counters of 401/403 by
route,region, andkid. - Ratio of requests verified by current vs next.
- Time to first successful verification with
nextin each region. - Number of long-lived executions running past the flip time.
- JWKS cache hit and miss per
kid.
Fail the build when any of the following is true:
accept_kidsdoes not contain exactly two values.next.secret_refis missing or unreadable.revoke_after_sis not set or exceeds the policy maximum.- Routes that verify webhooks are not wired to read
X-Key-Idor JOSEkid.
On cold start:
1. Fetch LIMITS.json and SECRETS.json.
2. Pin {accept_kids, active.kid} to memory with TTL = 60s.
3. Expose verify(req):
- extract kid from header or token header.
- if kid in accept_kids → choose secret by kid and verify.
- else refresh secrets and retry once, then fail.
On each request:
- If now - secrets_cache_ts > 60s → refresh in the background.
- Emit metric: {route, region, kid_used, verified=true|false}.-
Only one key accepted during rollout Old workers keep signing while new verifiers reject. Add dual-accept and reduce cache TTL. Open: Deployment Deadlock
-
Webhook fails because sender changed secret before receivers Allow two secrets, prefer
kidwhen present. Add retry with jitter and short backoff. Open: Bootstrap Ordering -
Large headers during rotation Multiple auth headers overflow 8–16 KB. Collapse to one
X-Key-Idand one signature. Open: Prompt Injection -
Silent partial revocation Some regions still accept old key due to pinned caches. Force refresh and invalidate edge caches by
secrets_epoch. Open: Edge Cache Invalidation
- Blue-green probe calls succeed with both keys before the flip.
- After flip, 95 percent of traffic verifies with
nextwithin 10 minutes. - After revoke, zero successful verifications with old
kid. - RAG probe answers remain unchanged and pass ΔS ≤ 0.45.
| Tool | Link | 3-Step Setup |
|---|---|---|
| WFGY 1.0 PDF | Engine Paper | 1️⃣ Download · 2️⃣ Upload to your LLM · 3️⃣ Ask “Answer using WFGY + <your question>” |
| TXT OS (plain-text OS) | TXTOS.txt | 1️⃣ Download · 2️⃣ Paste into any LLM chat · 3️⃣ Type “hello world” — OS boots instantly |
| Layer | Page | What it’s for |
|---|---|---|
| ⭐ Proof | WFGY Recognition Map | External citations, integrations, and ecosystem proof |
| ⚙️ Engine | WFGY 1.0 | Original PDF tension engine and early logic sketch (legacy reference) |
| ⚙️ Engine | WFGY 2.0 | Production tension kernel for RAG and agent systems |
| ⚙️ Engine | WFGY 3.0 | TXT based Singularity tension engine (131 S class set) |
| 🗺️ Map | Problem Map 1.0 | Flagship 16 problem RAG failure taxonomy and fix map |
| 🗺️ Map | Problem Map 2.0 | Global Debug Card for RAG and agent pipeline diagnosis |
| 🗺️ Map | Problem Map 3.0 | Global AI troubleshooting atlas and failure pattern map |
| 🧰 App | TXT OS | .txt semantic OS with fast bootstrap |
| 🧰 App | Blah Blah Blah | Abstract and paradox Q&A built on TXT OS |
| 🧰 App | Blur Blur Blur | Text to image generation with semantic control |
| 🏡 Onboarding | Starter Village | Guided entry point for new users |
If this repository helped, starring it improves discovery so more builders can find the docs and tools.
Next page to write: ProblemMap/GlobalFixMap/Cloud_Serverless/multi_region_routing.md