
[codex] runtime-only mode-registry unification #2061

Open
jxnl wants to merge 40 commits into main from codex/pr7-runtime-only

Conversation


jxnl (Collaborator) commented on Feb 6, 2026

This PR contains the runtime/provider implementation changes from the mode-registry unification effort, with docs/examples moved to a separate PR for simpler review.

Companion PR

Why

  • The original branch combined large runtime refactors with extensive docs churn.
  • Splitting docs allows focused review on behavior changes, compatibility, and test coverage.

What is included

  • Registry-based mode/provider routing refactors across instructor/core/**, instructor/v2/**, and provider clients/handlers
  • from_provider and provider resolution changes in runtime code
  • Non-doc test reorganizations and additions outside tests/docs/**
  • Dependency/runtime metadata changes associated with runtime paths

What is intentionally excluded

  • docs/**, examples/**, and docs tooling/test grouping updates (moved to companion PR)
  • Plan documents

Validation

  • This branch was produced by removing docs/examples scope from the original PR branch and verifying no docs/examples paths remain in the diff against main.
  • The full test suite was not re-run as part of the split step.

Note

High Risk
Touches core request/response patching, retry behavior, and provider client construction across many integrations, so regressions could affect structured output parsing and mode compatibility at runtime.

Overview
Shifts the runtime structured-output pipeline to a registry-driven v2 routing model: core.patch, core.retry, and core.client.from_openai now delegate to v2 registry handlers (with provider-aware mode normalization/validation) instead of legacy per-mode branching.
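For illustration, here is a self-contained, placeholder-only sketch of that dispatch change; the names below are not the actual core.patch / instructor.v2 API, just a contrast between per-mode branching and a (provider, mode)-keyed lookup.

```python
from typing import Any, Callable

# Legacy style: each call site branches on the mode.
def parse_legacy(mode: str, response: Any) -> Any:
    if mode == "TOOLS":
        return {"via": "tools", "raw": response}
    if mode == "MD_JSON":
        return {"via": "md_json", "raw": response}
    raise ValueError(f"unsupported mode: {mode}")

# Registry style: one dict lookup keyed by (provider, mode); unsupported
# combinations fail fast instead of falling through scattered branches.
HANDLERS: dict[tuple[str, str], Callable[[Any], Any]] = {
    ("openai", "TOOLS"): lambda response: {"via": "openai_tools", "raw": response},
    ("anthropic", "TOOLS"): lambda response: {"via": "anthropic_tools", "raw": response},
}

def parse_v2(provider: str, mode: str, response: Any) -> Any:
    try:
        handler = HANDLERS[(provider, mode)]
    except KeyError:
        raise ValueError(f"no handler registered for {(provider, mode)!r}") from None
    return handler(response)
```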

Updates from_provider to preferentially construct v2 provider clients (OpenAI-compatible providers like Databricks/DeepSeek/OpenRouter, plus Anthropic/GenAI/Cohere/Mistral/Bedrock/VertexAI/xAI/etc.), standardizing default modes (often Mode.TOOLS/Mode.MD_JSON) and adding a new Vertex AI initialization path. Public exports are adjusted to introduce ResponseSchema/response_schema and reduce optional provider re-exports, while streaming DSL helpers are refactored to accept injected stream extractors/parsers rather than embedding provider/mode-specific chunk logic.
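A minimal usage sketch of the from_provider path described above; the model name is arbitrary, and the `mode=` override is shown commented out because accepting it here is an assumption, not something this summary confirms.

```python
import instructor
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

# "provider/model" strings route through from_provider; on this branch that
# should resolve to the v2 provider clients described above.
client = instructor.from_provider(
    "openai/gpt-4o-mini",
    # mode=instructor.Mode.TOOLS,  # assumed override; otherwise the provider default (often TOOLS) applies
)

user = client.chat.completions.create(
    response_model=User,
    messages=[{"role": "user", "content": "Jason is 30 years old."}],
)
print(user)
```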

Also includes CI/ops and typing cleanups: GitHub workflows switch type checking from pyright to ty, add MISTRAL_API_KEY to CI env, tighten batch/cache typing, and remove obsolete agent/cursor guidance files.

Written by Cursor Bugbot for commit 0a65bcd.

jxnl and others added 30 commits January 18, 2026 13:40
- Add ModeRegistry for O(1) handler lookups via (Provider, Mode) tuples (see the sketch after this commit)
- Add ModeHandler base class and protocol interfaces
- Add patch_v2() function for unified provider patching
- Add registry-based retry logic with handler integration
- Add exception hierarchy (RegistryError, ValidationContextError)
- Add mode normalization with deprecation warnings
- Add @register_mode_handler decorator for handler registration
- Add registry unit tests

This PR was written by [Cursor](https://cursor.com)
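A toy rendering of the (Provider, Mode)-keyed registry and `@register_mode_handler` decorator this commit describes; class and method names are illustrative, and the real implementations live under `instructor/v2`.

```python
from enum import Enum

class Provider(str, Enum):
    OPENAI = "openai"
    ANTHROPIC = "anthropic"

class Mode(str, Enum):
    TOOLS = "tools"
    MD_JSON = "md_json"

# Handlers live in a plain dict, so lookup by (Provider, Mode) is O(1).
_REGISTRY: dict[tuple[Provider, Mode], type] = {}

def register_mode_handler(provider: Provider, mode: Mode):
    """Class decorator that registers a handler for one (provider, mode) pair."""
    def decorator(cls: type) -> type:
        _REGISTRY[(provider, mode)] = cls
        return cls
    return decorator

@register_mode_handler(Provider.OPENAI, Mode.TOOLS)
class OpenAIToolsHandler:
    def prepare_request(self, response_model, kwargs: dict) -> dict:
        # Attach the tool/function schema derived from the response model.
        ...

    def parse_response(self, response, response_model):
        # Validate tool-call arguments into the response model.
        ...

handler_cls = _REGISTRY[(Provider.OPENAI, Mode.TOOLS)]  # single dict lookup
```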
- Remove debug logging blocks in retry.py that wrote to hardcoded local path
- Fix GENAI_STRUCTURED_OUTPUTS enum value to avoid alias collision
- Fix sync retry to extract stream parameter from kwargs like async version
- Add docs/concepts/mode-migration.md explaining legacy mode deprecation
- Add tests/v2/test_mode_normalization.py for mode normalization logic (a normalization sketch follows this commit)
- Update mkdocs.yml with mode migration guide link
- Tests skip gracefully when handlers not yet registered

This PR was written by [Cursor](https://cursor.com)
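A hedged illustration of the legacy-mode normalization these tests exercise; the alias table below is hypothetical, and the real mapping and warning text live in `instructor/v2`.

```python
import warnings

# Hypothetical alias table: legacy mode name -> replacement.
LEGACY_MODE_ALIASES = {
    "FUNCTIONS": "TOOLS",
    "JSON_MODE": "JSON",
}

def normalize_mode(mode: str) -> str:
    """Return the canonical mode, warning when a deprecated alias is used."""
    if mode in LEGACY_MODE_ALIASES:
        replacement = LEGACY_MODE_ALIASES[mode]
        warnings.warn(
            f"Mode {mode!r} is deprecated; use {replacement!r} instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return replacement
    return mode
```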
- Fix tautological test assertion to verify handler exists
- Use provider-specific deprecated modes in warning test
- Add instructor/v2/providers/anthropic/ with handlers for TOOLS, JSON_SCHEMA, PARALLEL_TOOLS, ANTHROPIC_REASONING_TOOLS modes
- Add instructor/v2/providers/openai/ with handlers for TOOLS, JSON_SCHEMA, MD_JSON, PARALLEL_TOOLS, RESPONSES_TOOLS modes
- Update instructor/v2/__init__.py with from_anthropic and from_openai exports
- Update instructor/auto_client.py with v2 routing integration
- Add tests/v2/test_provider_modes.py for integration tests
- Add tests/v2/test_handlers_parametrized.py for unit tests
- Add tests/v2/test_openai_streaming.py for streaming tests

This PR was written by [Cursor](https://cursor.com)
- Remove debug logging in auto_client.py for Cohere client
- Fix google provider to use v1 from_genai (v2 not available yet)
- Add empty check for text_blocks in Anthropic MD_JSON handler
- Add None check for tool_calls in OpenAI PARALLEL_TOOLS handler
- Add instructor/v2/providers/genai/ with handlers for TOOLS, JSON modes
- Add instructor/v2/providers/cohere/ with handlers for TOOLS, JSON_SCHEMA, MD_JSON modes
- Add instructor/v2/providers/mistral/ with handlers for TOOLS, JSON_SCHEMA, MD_JSON modes
- Update instructor/v2/__init__.py with from_genai, from_cohere, from_mistral exports
- Add tests/v2/test_genai_integration.py
- Add tests/v2/test_cohere_handlers.py
- Add tests/v2/test_mistral_client.py and test_mistral_handlers.py

This PR was written by [Cursor](https://cursor.com)
- Remove debug logging in Cohere client
- Fix shallow copy mutation in Cohere handlers (copy messages list)
- Add empty list check in Mistral MD_JSON handler
…iter, Bedrock)

- Add instructor/v2/providers/xai/ with handlers for TOOLS, JSON_SCHEMA, MD_JSON modes
- Add instructor/v2/providers/groq/ with handlers for TOOLS, MD_JSON modes
- Add instructor/v2/providers/fireworks/ with handlers for TOOLS, MD_JSON modes
- Add instructor/v2/providers/cerebras/ with handlers for TOOLS, MD_JSON modes
- Add instructor/v2/providers/writer/ with handlers for TOOLS, MD_JSON modes
- Add instructor/v2/providers/bedrock/ with handlers for TOOLS, MD_JSON modes
- Update instructor/v2/__init__.py with all provider exports
- Add provider-specific test files for all 6 providers

All 11 v2 providers are now implemented.

This PR was written by [Cursor](https://cursor.com)
- Fix Bedrock MD_JSON handler to return early for None response_model
- Fix Fireworks async streaming to await the coroutine
- Fix xAI async streaming filter to only check for tool_calls
- Add fallback error handling for xAI sync streaming
- Add list content case to xAI MD_JSON handler
- Add truncated output detection to Writer handlers
Test reorganization:
- Move cache tests to tests/cache/
- Move core tests to tests/core/ (exceptions, patch, retry, schema)
- Move multimodal tests to tests/multimodal/
- Move processing tests to tests/processing/
- Move provider tests to tests/providers/
- Remove obsolete/duplicate test files

Unified test infrastructure:
- Add tests/v2/test_client_unified.py - Parametrized tests for all provider clients
- Add tests/v2/test_handler_registration_unified.py - Handler registration validation
- Add tests/v2/test_routing.py - Provider routing tests
- Add tests/v2/README.md - Test documentation
- Add tests/v2/UNIFICATION_OPPORTUNITIES.md - Future consolidation notes

This PR was written by [Cursor](https://cursor.com)
The test expects a deprecation warning that hasn't been added to v1 from_anthropic yet
Documentation:
- Add instructor/v2/README.md with comprehensive architecture documentation
- Update docs/modes-comparison.md with v2 mode mappings
- Update docs/integrations/anthropic.md, genai.md, google.md, bedrock.md
- Update docs/concepts/from_provider.md with v2 routing
- Update docs/api.md with v2 exports
- Update CLAUDE.md with v2 development notes

Code cleanup:
- Update pyproject.toml version
- Update .github/workflows/test.yml
- Minor fixes in instructor/core/, dsl/, processing/, providers/
- Remove obsolete plan/seo_plan.md

This PR was written by [Cursor](https://cursor.com)
Remove debug logging blocks that write to hardcoded local path
Fix non-deterministic test collection by sorting providers before parameterization
Remove debug logging blocks that write to hardcoded local path in prepare_request and parse_response methods
- Pass stream_extractor to Partial/Iterable streaming helpers (keep legacy fallback); see the sketch below

- Remove stray no-op import in vertexai shim

- Restore useful type info in openai_schema TypeError
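A hypothetical sketch of the injected stream-extractor pattern this commit describes; `Partial`/`Iterable` in instructor may expose it differently.

```python
from typing import Callable, Iterable, Iterator, Optional

def iter_text_fragments(
    chunks: Iterable[object],
    stream_extractor: Optional[Callable[[object], str]] = None,
) -> Iterator[str]:
    """Yield text fragments from raw provider chunks.

    The provider handler injects `stream_extractor`, so this helper no longer
    branches on provider/mode-specific chunk shapes; a simple legacy default
    is kept as a fallback.
    """
    extract = stream_extractor or (lambda chunk: getattr(chunk, "text", "") or "")
    for chunk in chunks:
        fragment = extract(chunk)
        if fragment:
            yield fragment
```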
Remove redundant handler files and register these providers directly
via OPENAI_COMPAT_PROVIDERS list. These providers use OpenAI-compatible
APIs, so they can share the same handler implementations (see the sketch after this list).

- Add GROQ, FIREWORKS, CEREBRAS to OPENAI_COMPAT_PROVIDERS
- Update client imports to use OpenAI handlers module
- Update _HANDLER_SPECS to point to OpenAI handlers
- Remove redundant handler files (groq, fireworks, cerebras)
- Update registry to remove deleted handler modules
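A small sketch of the shared-handler registration this commit describes; the names are illustrative, and the real `OPENAI_COMPAT_PROVIDERS` list and registry wiring live in `instructor/v2`.

```python
# Providers that speak the OpenAI chat-completions protocol can reuse the
# OpenAI handler classes instead of shipping near-identical copies.
OPENAI_COMPAT_PROVIDERS = ["groq", "fireworks", "cerebras"]

def register_openai_compat(
    registry: dict[tuple[str, str], type],
    openai_handlers: dict[str, type],
) -> None:
    """Register the OpenAI handler class for each mode under every compatible provider."""
    for provider in OPENAI_COMPAT_PROVIDERS:
        for mode, handler_cls in openai_handlers.items():
            registry[(provider, mode)] = handler_cls
```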
- Introduced `_parse_with_registry` to centralize parsing logic and handle deprecation warnings (sketched after this list).
- Updated `ResponseSchema` methods for parsing Anthropic tools, JSON, OpenAI functions, and tools to use the new method.
- Deprecated `ResponseSchema.parse_*` methods in favor of `process_response` and `ResponseSchema.from_response` with core modes.
- Updated documentation to reflect the deprecation of legacy `ResponseSchema.parse_*` helpers.
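A hedged sketch of routing the deprecated `ResponseSchema.parse_*` helpers through one registry-backed path; the real `_parse_with_registry` signature may differ.

```python
import warnings
from typing import Optional

def _parse_with_registry(
    registry: dict,
    provider: str,
    mode: str,
    completion,
    response_model,
    *,
    deprecated_alias: Optional[str] = None,
):
    """Central parsing path; legacy parse_* wrappers call this with their alias name."""
    if deprecated_alias is not None:
        warnings.warn(
            f"{deprecated_alias} is deprecated; use process_response or "
            "ResponseSchema.from_response with core modes instead.",
            DeprecationWarning,
            stacklevel=3,
        )
    handler = registry[(provider, mode)]
    return handler.parse_response(completion, response_model=response_model)
```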
…egistry

# Conflicts:
#	pyproject.toml
#	uv.lock

cloudflare-workers-and-pages bot commented Feb 6, 2026

Deploying with Cloudflare Workers

The latest updates on your project.

| Status | Name | Latest Commit | Updated (UTC) |
| --- | --- | --- | --- |
| ❌ Deployment failed | instructor | c8689a0 | Feb 06 2026, 05:21 PM |

# Conflicts:
#	instructor/providers/xai/client.py
#	instructor/providers/xai/utils.py
#	requirements.txt
