Fable 5 classifier and billing cookbook #695

Merged: Briiick merged 2 commits into main from anthropic/alexander/fable-5-classifier-billing-cookbook on Jun 9, 2026

Briiick (Contributor) commented Jun 9, 2026

Pull Request

Description

Type of Change

  • New cookbook
  • Bug fix (fixes an issue in existing cookbook)
  • Documentation update
  • Code quality improvement (refactoring, optimization)
  • Dependency update
  • Other (please describe):

Cookbook Checklist (if applicable)

  • Cookbook has a clear, descriptive title
  • Includes a problem statement or use case description
  • Code is well-commented and easy to follow
  • Includes expected outputs or results
  • Added entry to registry.yaml (run /add-registry <notebook-path> or add manually)

Testing

  • I have tested this cookbook/change locally
  • All cells execute without errors

Additional Context


Note: Pull requests that do not follow these guidelines or lack sufficient detail may be closed. This helps us maintain high-quality, useful cookbooks for the community. If you're unsure about any requirements, please open an issue to discuss before submitting a PR.

Briiick merged commit 107294d into main on Jun 9, 2026

github-actions Bot commented Jun 9, 2026

Notebook Changes

This PR modifies the following notebooks:

📓 fable_5_fallback_billing/guide.ipynb

View diff
nbdiff /dev/null fable_5_fallback_billing/guide.ipynb (3139ce8ba1993c86f612b60f25da2fb182b410fd)
--- /dev/null  2026-06-09 17:01:47.220332
+++ fable_5_fallback_billing/guide.ipynb (3139ce8ba1993c86f612b60f25da2fb182b410fd)  (no timestamp)
## added /cells:
+  markdown cell:
+    source:
+      # Classifier Fallback & Billing
+      
+      Claude Fable 5's advanced capabilities in areas like cybersecurity, biology, and chemistry create real risk of misuse: the same skills that make it useful could help bad actors build cyberattacks or dangerous weapons. For that reason, Claude Fable 5 ships with safeguards that limit its performance in these specific areas, and automated safety checks run on every request. These checks block requests in three areas:
+      
+      - **Offensive cybersecurity techniques** — building exploits, malware, or attack tooling
+      - **Biology and life sciences** — lab methods or molecular mechanisms
+      - **Extraction of the model's [summarized thinking](https://platform.claude.com/docs/en/build-with-claude/extended-thinking#summarized-thinking)**
+      
+      These safeguards are deliberately conservative. They are tuned first for robustness, which means benign technical work sometimes triggers them. We are releasing Fable 5 with fallback to Opus 4.8 on every topic related to biology and cybersecurity, as a way of bringing you Fable's Mythos-level capability faster in all other areas. We will continue to reduce false-positive rates for Fable 5 after launch.
+      
+      **API customers should configure fallback from Claude Fable 5 to Opus 4.8** — either with the [built-in server-side fallback](https://platform.claude.com/docs/en/build-with-claude/handling-stop-reasons#server-side-fallback) feature (available on the native Claude API and Claude Platform on AWS) or with [client-side fallback](https://platform.claude.com/docs/en/build-with-claude/handling-stop-reasons#client-side-fallback) logic built on the Anthropic SDK helpers.
+      
+      We've also made billing changes so that customers don't incur token costs in most cases of Fable 5 fallback. Action is needed to adopt these changes **when you are not using the server-side fallback feature** — see [below](#4-billing-changes).
+      
+      ## What this guide covers
+      
+      1. [What a classifier block looks like](#1-what-a-classifier-block-looks-like)
+      2. [Server-side fallback (recommended)](#2-server-side-fallback-recommended)
+      3. [Streaming](#3-streaming)
+      4. [Billing changes](#4-billing-changes)
+      5. [Client-side fallback with the SDK](#5-client-side-fallback-with-the-sdk)
+      6. [Common anti-patterns](#6-common-anti-patterns)
+  code cell:
+    source:
+      %%capture
+      %pip install -U "anthropic>=0.108.0"
+  code cell:
+    source:
+      import os
+      
+      from dotenv import load_dotenv
+      
+      load_dotenv()
+      
+      PRIMARY_MODEL = "claude-fable-5"
+      FALLBACK_MODEL = "claude-opus-4-8"
+      SERVER_SIDE_FALLBACK_BETA = "server-side-fallback-2026-06-01"
+      FALLBACK_CREDIT_BETA = "fallback-credit-2026-06-01"
+      
+      # Anthropic() reads ANTHROPIC_API_KEY from the environment. Add it to a .env
+      # file (loaded above) or export it in your shell before running the live examples.
+      if not os.environ.get("ANTHROPIC_API_KEY"):
+          print(
+              "ANTHROPIC_API_KEY is not set - add it to .env or export it."
+          )
+  markdown cell:
+    source:
+      ## 1. What a classifier block looks like
+      
+      A *classifier block* is what the API returns when a request appears to violate our safeguards. The API returns `200` with `stop_reason: "refusal"` and a `stop_details` object describing the category:
+      
+      ```json
+      {
+        "stop_reason": "refusal",
+        "stop_details": {
+          "type": "refusal",
+          "category": "cyber",
+          "explanation": "This request triggered restrictions on violative cyber content and was blocked under Anthropic's Usage Policy..."
+        },
+        "content": [...]
+      }
+      ```
+      
+      Branch your logic on **`stop_reason`, not on `content` or `stop_details`**. `stop_details` is informational and can be `null`; treat a `null` value as a generic refusal that is not tied to any of the categories below.
+      
+      When present, `category` is one of `"cyber"`, `"bio"`, or `"reasoning_extraction"`. This can help you refine your fallback choice:
+      
+      | category | fires on |
+      | --- | --- |
+      | `cyber` | offensive cybersecurity content (exploits, malware, attack tooling) |
+      | `bio` | biology / life-sciences content (lab methods, molecular mechanisms) |
+      | `reasoning_extraction` | requests that attempt to extract the model's [summarized thinking](https://platform.claude.com/docs/en/build-with-claude/extended-thinking#summarized-thinking) |
+      
+      Classifier blocks are distinct from **model refusals** (the model itself declining for other policy reasons). Both surface as `stop_reason: "refusal"`, but `stop_details.category` tells you which classifier blocked you.
+      
+      > **Note:** When you are **not** using the server-side fallback feature, `stop_details` also includes a `fallback_credit_token`, which you use to bill your fallback model request as a cache read — see [Billing changes](#4-billing-changes).
+  markdown cell:
+    source:
+      ## 2. Server-side fallback (recommended)
+      
+      The Messages API can run the fallback for you. Pass `fallbacks` with `[{"model": "claude-opus-4-8"}]` and the `server-side-fallback-2026-06-01` beta header. If Fable's classifiers block the turn, the API automatically retries it with Opus 4.8 — annotated so you can tell what happened.
+      
+      The automatic fallback feature is currently supported on the **Claude API** and **Claude Platform on AWS**. Today it only supports falling back from Fable 5 to Opus 4.8; we expect to expand this.
+      
+      ```bash
+      curl https://api.anthropic.com/v1/messages \
+        -H "x-api-key: $ANTHROPIC_API_KEY" \
+        -H "anthropic-version: 2023-06-01" \
+        -H "anthropic-beta: server-side-fallback-2026-06-01" \
+        -H "content-type: application/json" \
+        -d '{
+          "model": "claude-fable-5",
+          "max_tokens": 1024,
+          "fallbacks": [
+            { "model": "claude-opus-4-8" }
+          ],
+          "messages": [
+            { "role": "user", "content": "Hello, world" }
+          ]
+        }'
+      ```
+      
+      ### When the fallback can't run
+      
+      When you've configured `fallbacks` but the API can't reach the fallback model — its rate limit is exhausted or it's overloaded — the turn still comes back as a refusal, and `stop_details.recommended_model` names the canonical model id to retry directly:
+      
+      ```json
+      {
+        "stop_reason": "refusal",
+        "stop_details": {
+          "type": "refusal",
+          "category": "cyber",
+          "recommended_model": "claude-opus-4-8"
+        }
+      }
+      ```
+      
+      `recommended_model` is populated **only** in this case (fallbacks configured *and* the fallback couldn't execute). On a plain block with no fallbacks configured, it isn't present — which is why it's absent from the basic example in [section 1](#1-what-a-classifier-block-looks-like).
+  code cell:
+    source:
+      # Fable applies extra safety filters. With a fallback chain configured, the API
+      # retries blocked turns on the next model server-side. A stop_reason of "refusal"
+      # means the whole chain refused.
+      
+      from anthropic import Anthropic
+      
+      client = Anthropic()
+      
+      
+      def chat_turn(messages, max_tokens=1024):
+          """One API call; the server handles the fallback."""
+          return client.beta.messages.create(
+              model=PRIMARY_MODEL,
+              max_tokens=max_tokens,
+              messages=messages,
+              betas=[SERVER_SIDE_FALLBACK_BETA],
+              fallbacks=[{"model": FALLBACK_MODEL}],
+          )
+  markdown cell:
+    source:
+      ### Detecting fallback (non-streaming)
+      
+      A fallback response carries a `{"type": "fallback"}` content block at each switch point, and `usage.iterations` records per-model usage. Note that a **sticky-served** turn — one routed directly to the fallback model because an earlier turn in the conversation fell back — carries *no* fallback block, because the request was routed directly and there is no boundary in `content` to mark. `usage.iterations` is the reliable way to tell whether a fallback model served the turn.
+  code cell:
+    source:
+      def fallback_hops(response):
+          """(from_model, to_model) for each hop that ran and blocked this turn."""
+          hops = []
+          for b in response.content:
+              if getattr(b, "type", None) == "fallback":
+                  d = b.model_dump() if hasattr(b, "model_dump") else dict(b)
+                  hops.append((d["from"]["model"], d["to"]["model"]))
+          return hops
+      
+      
+      def served_by_fallback(response):
+          """True whenever a fallback model served the response, INCLUDING a
+          sticky-served turn (which carries no fallback block). usage.iterations is
+          the best way to check whether a turn was served by a fallback model."""
+          iters = getattr(response.usage, "iterations", None) or []
+          return any(
+              (i.get("type") if isinstance(i, dict) else getattr(i, "type", None))
+              == "fallback_message"
+              for i in iters
+          )
+      
+      
+      response = chat_turn([{"role": "user", "content": "Hello, world"}])
+      hops = fallback_hops(response)
+      for from_model, to_model in hops:
+          print(f"[{from_model} blocked \u2014 continued on {to_model}]")
+      if not hops and served_by_fallback(response):
+          print(f"[sticky: served directly by {response.model}]")
+  markdown cell:
+    source:
+      ### Detecting fallback while streaming
+      
+      Watch for a `content_block_start` event whose block is `{"type": "fallback"}` — that marks an in-stream switch point. But for the definitive per-turn answer, check `usage.iterations` on the **final** message, exactly as you would for a non-streaming response. That check is reliable in every case, including a stream that was served directly by the fallback model, so use it as your source of truth rather than depending on the in-stream events alone.
+  code cell:
+    source:
+      with client.beta.messages.stream(
+          model=PRIMARY_MODEL,
+          max_tokens=1024,
+          messages=[{"role": "user", "content": "Hello, world"}],
+          betas=[SERVER_SIDE_FALLBACK_BETA],
+          fallbacks=[{"model": FALLBACK_MODEL}],
+      ) as stream:
+          for event in stream:
+              if (
+                  getattr(event, "type", None) == "content_block_start"
+                  and getattr(event.content_block, "type", None) == "fallback"
+              ):
+                  fb = event.content_block
+                  fb = fb.model_dump() if hasattr(fb, "model_dump") else dict(fb)
+                  print(f"[switching: {fb['from']['model']} -> {fb['to']['model']}]")
+          final = stream.get_final_message()
+      
+      # Definitive per-turn check, same as non-streaming: usage.iterations also
+      # catches a stream served directly by the fallback model (no in-stream event).
+      if served_by_fallback(final):
+          print(f"[fallback model served this stream: {final.model}]")
+  markdown cell:
+    source:
+      ### The fallback response shape
+      
+      A fallback response contains `message.model` (the model that eventually answered), a `{"type": "fallback"}` content block marking each switch point, and per-attempt usage in `usage.iterations`:
+      
+      ```json
+      {
+        "id": "msg_01Ab...",
+        "type": "message",
+        "role": "assistant",
+        "model": "claude-opus-4-8",
+        "content": [
+          { "type": "fallback", "from": { "model": "claude-fable-5" }, "to": { "model": "claude-opus-4-8" } },
+          { "type": "text", "text": "..." }
+        ],
+        "stop_reason": "end_turn",
+        "stop_details": null,
+        "usage": {
+          "input_tokens": 412, "output_tokens": 264,
+          "cache_read_input_tokens": 0, "cache_creation_input_tokens": 0,
+          "iterations": [
+            { "type": "message", "model": "claude-fable-5", "input_tokens": 408, "output_tokens": 0,
+              "cache_read_input_tokens": 0, "cache_creation_input_tokens": 0 },
+            { "type": "fallback_message", "model": "claude-opus-4-8", "input_tokens": 412, "output_tokens": 264,
+              "cache_read_input_tokens": 0, "cache_creation_input_tokens": 0 }
+          ]
+        }
+      }
+      ```
+      
+      **Per-attempt overrides.** Each fallback entry may override `max_tokens`, `thinking`, `output_config`, and `speed` for that attempt only (`output_config` and `speed` additionally require the same beta headers as the corresponding top-level fields). The request with an entry's overrides merged in must be a correctly formatted direct request to that entry's model.
+      
+      ```json
+      {
+        "model": "claude-fable-5",
+        "max_tokens": 1024,
+        "fallbacks": [
+          { "model": "claude-opus-4-8", "max_tokens": 8192, "thinking": {"type": "disabled"}, "speed": "fast" }
+        ],
+        "messages": [
+          { "role": "user", "content": "Hello, world" }
+        ]
+      }
+      ```
+      
+      **Billing.** `usage.input_tokens` is counted once for the turn. `usage.output_tokens` reflects the answer. Use `usage.iterations` if you need exact per-model attribution.
+  markdown cell:
+    source:
+      ## 3. Streaming
+      
+      In streaming, fallback is designed to work automatically. If the classifier blocks **before any output reaches you**, the stream starts with the fallback model's response. This retry is invisible and no fallback SSE event is emitted.
+      
+      If the classifier blocks **mid-stream**, the retry happens on the same stream too: the partial output is kept, a `{"type": "fallback"}` content block marks the boundary, and the fallback model continues from the partial. Nothing streamed is ever discarded.
+  code cell:
+    source:
+      def stream_turn(messages, max_tokens=1024):
+          with client.beta.messages.stream(
+              model=PRIMARY_MODEL,
+              max_tokens=max_tokens,
+              messages=messages,
+              betas=[SERVER_SIDE_FALLBACK_BETA],
+              fallbacks=[{"model": FALLBACK_MODEL}],
+          ) as stream:
+              # Nothing streamed is ever discarded: after a mid-stream block, the
+              # final message is partial + fallback block + continuation.
+              final = stream.get_final_message()
+      
+          if final.stop_reason == "refusal":
+              return final  # the whole chain refused
+          text = "".join(b.text for b in final.content if b.type == "text")
+          print(f"{final.model}: {text}")
+          return final
+  markdown cell:
+    source:
+      ## 4. Billing changes
+      
+      We've made billing changes to minimize the cost impact of fallback. These apply automatically when you use fallback and the Anthropic SDK helpers. **Action is only needed to adopt the cache-miss billing change, and only when you are not using server-side fallback.**
+      
+      **1. Input tokens are not billed on a direct classifier block** (i.e. when a request is blocked before any output tokens were returned). No action needed — this is already applied automatically to all production models, including Fable 5.
+      
+      **2. Fable 5 → Opus 4.8 fallback input tokens are billed as a cache read.** Normally, switching to another model is billed as a [cache *write*](https://platform.claude.com/docs/en/build-with-claude/prompt-caching#how-prompt-caching-works), which costs 1.25× (5-min TTL) or 2× (60-min TTL) the base input-token price. Instead, we bill these Opus tokens as if they had already been cached, that is, as a cache *read*, which is 10% of the base input-token price.
+      
+      - **Using the server-side fallback feature:** this billing change is applied automatically.
+      - **Not using the server-side fallback feature:** see the credit-token flow below.
+      
+      ### Redeeming the fallback credit token (client-side fallback only)
+      
+      A Fable request blocked by a safety classifier can include a `fallback_credit_token` in `stop_details`. The token is present **only** when the blocked request had a billable cached prefix; otherwise it is `null`.
+      
+      To redeem it:
+      
+      1. Send your subsequent Opus 4.8 request with the `anthropic-beta: fallback-credit-2026-06-01` header.
+      2. Pass the token as a top-level `fallback_credit_token` parameter.
+      3. Keep the prompt-shaping fields **identical** to the blocked request — the exact same `system`, `messages`, and `tools`.
+      
+      The prefix that was cached on the Fable request is then billed at the cache-read rate instead of cache-write. The switching cost is refunded, so the retry costs what it would have if the conversation had been on Opus all along.
+      
+      **Validity:** the token is valid only on Opus 4.8 requests that occur **within 5 minutes** of the blocked Fable 5 request and originate from the **same org and workspace**.
+      
+      **Mid-stream blocks.** If the Fable 5 request is blocked in the middle of streaming output tokens, `stop_details` also includes `fallback_has_prefill_claim: true` alongside the credit token. This means that in your subsequent Opus 4.8 request you can append that partial output as an assistant prefill and continue from where Fable stopped — something [normally not allowed](https://platform.claude.com/docs/en/test-and-evaluate/strengthen-guardrails/increase-consistency#prefill-claudes-response) in Opus 4.8 requests.
+  code cell:
+    source:
+      def redeem_credit_after_block(blocked_response, messages, max_tokens=1024):
+          """Retry a classifier-blocked Fable turn on Opus 4.8, redeeming the
+          fallback credit token so the cached prefix is billed at the cache-read
+          rate. Use this only when you are NOT using server-side fallback."""
+          details = blocked_response.stop_details
+          credit = getattr(details, "fallback_credit_token", None) if details else None
+      
+          # Only add the beta header and token when there is a credit to redeem.
+          # With no token (no cached prefix on the blocked request) this is an
+          # ordinary Opus 4.8 call; leaving the keys out avoids passing betas=None.
+          kwargs = {}
+          if credit is not None:
+              kwargs["betas"] = [FALLBACK_CREDIT_BETA]
+              kwargs["extra_body"] = {"fallback_credit_token": credit}
+      
+          # The system, messages, and tools must be IDENTICAL to the blocked request.
+          return client.beta.messages.create(
+              model=FALLBACK_MODEL,
+              max_tokens=max_tokens,
+              messages=messages,
+              **kwargs,
+          )
+  markdown cell:
+    source:
+      ## 5. Client-side fallback with the SDK
+      
+      Server-side fallback is available on the native Claude API and Claude Platform on AWS, but not currently on Amazon Bedrock, Vertex AI, Microsoft Foundry, or the Message Batches API. For those, or any time you want the fallback logic in your client, the Anthropic SDKs (Python, TypeScript, Go, Java, C#) ship a **refusal-fallback middleware**.
+      
+      Configure it once on a client with your fallback model list and a `BetaFallbackState`, then call `client.beta.messages` as usual. The middleware:
+      
+      - retries a `stop_reason: "refusal"` turn on the next model in the list (continuing down the chain if a fallback also refuses; if every entry refuses it surfaces the original refusal rather than raising);
+      - **sends the `fallback-credit-2026-06-01` beta header automatically on every request**, so you get the cache-read billing change from [section 4](#4-billing-changes) without managing tokens yourself;
+      - manages the `fallback` content blocks in conversation history for you;
+      - records the accepting model in `BetaFallbackState` so follow-up turns stay pinned to it.
+      
+      It is **mutually exclusive** with the server-side `fallbacks` parameter; use one or the other. (To send a server-side `fallbacks` request from an app that installs the middleware, use a separate client instance without it.)
+  code cell:
+    source:
+      from anthropic import Anthropic, BetaFallbackState, BetaRefusalFallbackMiddleware
+      
+      # Install the middleware once, with your fallback chain. No per-request betas needed.
+      client = Anthropic(
+          middleware=[BetaRefusalFallbackMiddleware([{"model": FALLBACK_MODEL}])],
+      )
+      
+      state = BetaFallbackState()  # reuse across turns to pin follow-ups to the accepting model
+      
+      # Non-streaming: a refused Fable turn is retried on Opus 4.8 transparently.
+      with state:
+          message = client.beta.messages.create(
+              model=PRIMARY_MODEL,
+              max_tokens=1024,
+              messages=[{"role": "user", "content": "Hello, Claude"}],
+          )
+      print(f"served by: {message.model}")
+      
+      # Streaming: on a refusal the middleware splices the fallback model's events
+      # onto the same open stream.
+      with (
+          state,
+          client.beta.messages.stream(
+              model=PRIMARY_MODEL,
+              max_tokens=1024,
+              messages=[{"role": "user", "content": "Hello, Claude"}],
+          ) as stream,
+      ):
+          for event in stream:
+              if event.type == "text":
+                  print(event.text, end="", flush=True)
+          final = stream.get_final_message()
+      print(f"\nserved by: {final.model}")
+  markdown cell:
+    source:
+      ## 6. Common anti-patterns
+      
+      **Set the fallback on every request, not once per account.** There is no account-level or session-level switch that enables the Opus 4.8 fallback. Each API call must include the fallback configuration. A call that doesn't turn on fallbacks returns a refusal instead of silently retrying on the fallback model.
+      
+      **Audit every code path that builds a request.** Features such as retry buttons, message regeneration, and tool-use continuations often construct their own requests, and each one can silently omit the fallback configuration. Set the fallback explicitly at every entry point. (See the [migration guide](https://platform.claude.com/docs/en/about-claude/models/migration-guide) and the [claude-api skill](https://platform.claude.com/docs/en/agents-and-tools/agent-skills/claude-api-skill) to add fallbacks across a codebase quickly.)
+      
+      **Include the fallback on subagent calls, and expect per-agent behavior.** If your application runs multiple agents in one session, every agent's calls need the fallback configuration. When a refusal occurs, only the agent that received it moves to the fallback model — the other agents stay on Fable. Do not assume one agent's fallback applies to the whole session. (The same is true of sub-agents in Claude Code: only the sub-agent that hits a refusal falls back to Opus; the rest of the session continues on Fable 5.)
+      
+      **Resubmitting the same request after a refusal just re-refuses.** The refused content is still in the conversation history, so resubmitting to the same model re-triggers the block. On `stop_reason: "refusal"`, retry on the fallback model and set a separate indicator so your router knows to stay on the fallback model for the rest of the conversation.
+      
+      **Build serving-model analytics from `usage.iterations`, not from the model you requested.** The response's `model` field is the model that actually answered, so a fallback-served turn reports Opus 4.8. Analytics recorded against the *requested* model will be wrong whenever a fallback is used. The reliable per-turn check is `usage.iterations` in the final usage record.
+      
+      **Handle streaming truncation carefully.** When you hit a classifier block mid-stream, omit any `thinking`, `tool_use`, or other blocks that appear before the fallback. A truncated `tool_use` block is unparseable JSON, and another model's `thinking` blocks will break the next call. If you continue without server-side fallback, use the `fallback_has_prefill_claim` grant from `stop_details` rather than pasting the partial response into a completed assistant turn.

Generated by nbdime

github-actions Bot left a comment

PR Review

Recommendation: COMMENT (PR already merged — observations for follow-up)

Summary

Adds a new cookbook (fable_5_fallback_billing/guide.ipynb) documenting how to detect Claude Fable 5 safety classifier blocks and fall back to Opus 4.8 via either server-side fallback, the SDK middleware, or manual client-side handling — including the new cache-read billing treatment and credit-token flow.

Actionable Feedback (5 items)
  • guide.ipynb (in cell with from anthropic import Anthropic, BetaFallbackState, ...) — section 5 reassigns the module-level client to one wrapped with the middleware. Anything later in the notebook (or a re-run from the top) silently picks up that middleware-bound client. Consider naming this middleware_client to keep section 5 self-contained.
  • guide.ipynb (in cell with redeem_credit_after_block) — extra_body=extra or None passes None when extra is {}, which means no fallback_credit_token is sent on a refusal with no cached prefix. Worth a one-line comment clarifying that the no-credit path is intentional (Opus is just called normally), since the function name suggests redemption always happens.
  • guide.ipynb (in cell with if not os.environ.get("ANTHROPIC_API_KEY"):) - when the key is missing, the cell only prints a warning; the subsequent client.beta.messages.create cells will then raise. Either raise here or wrap downstream calls in a guard so the failure mode is obvious to readers.
  • registry.yaml:600-603 - Safeguards and Billing are new categories not present elsewhere in the registry (existing entries use Responses, Agent Patterns, Tools, etc.). If these are intentional new top-level buckets, that's fine; if not, consolidating under existing categories would keep the taxonomy tight.
  • General - all cells were committed with outputs: []. CLAUDE.md says "Keep outputs in notebooks (intentional for demonstration)." Since the example prompts ("Hello, world") won't actually trigger a classifier block, consider either (a) adding a contrived prompt that does trigger one and capturing the real refusal payload, or (b) noting explicitly in the markdown that the JSON shapes are illustrative.
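
For the missing-key item above, one possible shape for a fail-fast guard is sketched below. The helper name require_api_key and its env parameter are hypothetical (they are not in the notebook or the SDK); the only thing taken from the notebook is the ANTHROPIC_API_KEY variable name and the wording of the warning.

```python
import os


def require_api_key(env=None):
    """Return ANTHROPIC_API_KEY, or raise instead of only printing a warning.

    `env` is a hypothetical injection point (defaults to os.environ) so the
    guard can be exercised without touching the real environment.
    """
    env = os.environ if env is None else env
    key = (env.get("ANTHROPIC_API_KEY") or "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set - add it to .env or export it "
            "before running the live examples."
        )
    return key
```

Calling this once in the setup cell would make a missing key fail at the top of the notebook rather than in the first client.beta.messages.create cell.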
Detailed Review

Code Quality

  • The helper functions (fallback_hops, served_by_fallback, chat_turn, stream_turn, redeem_credit_after_block) are small, focused, and well-documented. The defensive getattr / model_dump handling in fallback_hops and served_by_fallback is appropriate given the SDK may return either Pydantic models or plain dicts.
  • served_by_fallback keying off usage.iterations[*].type == "fallback_message" is a clean, single source of truth and the markdown correctly calls this out as the reliable check (vs. parsing in-stream events).
  • Model IDs claude-fable-5 and claude-opus-4-8 are forward-looking. CLAUDE.md says "use current Claude models," but a launch cookbook intentionally pre-stages the new IDs — flagging only so the team knows this notebook will need a model-check exemption or update once the IDs are GA.
  • Style: double quotes, 100-char-ish lines, conventional Python — matches project conventions in CLAUDE.md.
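
To make the usage.iterations point concrete, here is a minimal sketch that works on a response's usage object after it has been converted to a plain dict (for example with model_dump()). The helper name serving_summary is hypothetical; the field names and the sample payload are taken from the notebook's fallback response shape example, and treating a missing or null iterations list as "no fallback information" is an assumption.

```python
def serving_summary(usage):
    """Summarize which models ran in a turn from usage["iterations"].

    Returns (served_by_fallback, tokens_by_model). A missing or null
    "iterations" list is treated as "no fallback information" (assumption).
    """
    iterations = usage.get("iterations") or []
    served_by_fallback = any(
        it.get("type") == "fallback_message" for it in iterations
    )
    tokens_by_model = {}
    for it in iterations:
        totals = tokens_by_model.setdefault(
            it["model"], {"input_tokens": 0, "output_tokens": 0}
        )
        totals["input_tokens"] += it.get("input_tokens", 0)
        totals["output_tokens"] += it.get("output_tokens", 0)
    return served_by_fallback, tokens_by_model


# Sample payload copied from the notebook's fallback response shape example.
example_usage = {
    "input_tokens": 412,
    "output_tokens": 264,
    "iterations": [
        {"type": "message", "model": "claude-fable-5",
         "input_tokens": 408, "output_tokens": 0},
        {"type": "fallback_message", "model": "claude-opus-4-8",
         "input_tokens": 412, "output_tokens": 264},
    ],
}
fell_back, by_model = serving_summary(example_usage)
print(fell_back, by_model)
```

The per-model totals are attribution only, not a bill: the notebook states that input tokens on a direct classifier block are not billed.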

Security

  • No secrets or credentials in the diff. ANTHROPIC_API_KEY is loaded via dotenv.load_dotenv() as required by CLAUDE.md.
  • No injection or unsafe patterns; all calls are through the typed SDK.

Suggestions

  • The streaming example (stream_turn) silently discards the iterator — for event in stream is implied via context manager but only get_final_message() is called. Showing a for event in stream: body that handles text deltas would make the streaming section more illustrative.
  • The markdown in section 4 references fallback_has_prefill_claim but no code cell demonstrates the prefill flow. A short code snippet showing how to construct the Opus 4.8 follow-up with the partial-output prefill would close the loop.
  • BetaRefusalFallbackMiddleware example never demonstrates BetaFallbackState actually pinning a follow-up turn — it's instantiated and with state: is used but no second turn is sent. A two-turn example would prove the sticky behavior described in the prose.
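
As a starting point for the prefill suggestion above, the sketch below builds only the messages list for the Opus 4.8 follow-up, working on the blocked response as a plain dict. The helper name build_prefill_messages is hypothetical. Keeping only text blocks in the prefill is an assumption drawn from section 6's advice to omit thinking and tool_use blocks; the stop_details field names come from the notebook's section 4.

```python
def build_prefill_messages(messages, blocked):
    """Build the messages for an Opus 4.8 retry after a mid-stream block.

    `blocked` is the blocked Fable response as a plain dict. The original
    messages are never mutated, so the prefix stays identical to the
    blocked request.
    """
    details = blocked.get("stop_details") or {}
    has_claim = bool(details.get("fallback_has_prefill_claim"))
    if blocked.get("stop_reason") != "refusal" or not has_claim:
        # No prefill grant: retry with the original messages unchanged.
        return list(messages)

    # Assumption: only text streamed so far is carried over; thinking,
    # tool_use, and any other block types are dropped.
    partial = "".join(
        block.get("text", "")
        for block in blocked.get("content") or []
        if block.get("type") == "text"
    )
    if not partial:
        return list(messages)
    return list(messages) + [{"role": "assistant", "content": partial}]
```

The returned list would then be passed as messages to the follow-up request, with the credit token and the fallback-credit beta header supplied the same way redeem_credit_after_block does.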

Positive Notes

  • Excellent prose: the distinction between classifier blocks and model refusals, the usage.iterations guidance, and the anti-patterns section in §6 are the kind of subtle, easy-to-miss API behaviors that justify a cookbook existing at all.
  • Section 6 (anti-patterns) is the strongest part — calling out per-request fallback config, subagent behavior, the resubmit-loop pitfall, and analytics built off the wrong model field will save real integration time.
  • authors.yaml correctly updated with the new mikaelagrace entry; registry entry is well-formed and the cross-links inside the notebook use anchor IDs consistently.
  • Beta header constants (SERVER_SIDE_FALLBACK_BETA, FALLBACK_CREDIT_BETA) are extracted to module-level constants instead of inlined — good for readability and future updates.
