6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -9,7 +9,13 @@ Versioning: [Semantic Versioning](https://semver.org/spec/v2.0.0.html)

## [Unreleased]

### Changed
- **Provider architecture**: Native provider factories now declare SDK client contracts and delegate shared validation, mode normalization, sync/async selection, streaming dispatch, patching, and wrapper construction to one implementation.
- **Provider tests**: Replace repeated provider client suites with manifest-driven behavioral contracts, including async streaming parity.

### Fixed
- **Hooks**: Emit `completion:error` and `completion:last_attempt` consistently for raw and structured provider failures, including attempt metadata.
- **Documentation**: Correct Mistral and Bedrock async examples and clarify that provider-specific native-client helpers remain supported.
- **v2 cleanup**: Consolidate small provider/runtime fixes for Gemini JSON prompts, Cohere templating, JSON array extraction, iterable streaming, missing `jsonref` dependency guidance, retry semantics and hook metadata, and multimodal autodetection.

### Tests / CI
449 changes: 295 additions & 154 deletions docs/blog/posts/whats-new-in-v2.md

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/concepts/citation.md
@@ -185,6 +185,6 @@ Use CitationMixin when:
## See Also

- [Validation](./validation.md) - Learn about validation in Instructor
- [Context-Based Validation](./validation.md#context-based-validation) - Using context for validation
- [Custom Validators](./validation.md#custom-validators) - Using context for validation
- [Citation Examples](../examples/exact_citations.md) - More citation examples
- [RAG Patterns](../blog/posts/rag-and-beyond.md) - Building RAG systems with Instructor
10 changes: 9 additions & 1 deletion docs/concepts/from_provider.md
@@ -331,13 +331,21 @@ client = instructor.from_provider("openai/gpt-4o-mini")

### from_provider vs. Provider-Specific Functions

Provider-specific helpers were removed. Use `from_provider` for all clients:
Provider-specific helpers remain supported for applications that already construct
native SDK clients. Use `from_provider` when a model string is the most convenient
entrypoint; use `from_openai`, `from_anthropic`, and the other `from_*` helpers when
you need to configure the underlying SDK client directly.

```python
import instructor

openai_client = instructor.from_provider("openai/gpt-4o-mini")
anthropic_client = instructor.from_provider("anthropic/claude-3-5-sonnet")

# Existing native-client factories remain public compatibility entrypoints.
from openai import OpenAI

native_client = instructor.from_openai(OpenAI())
```

## Troubleshooting
5 changes: 5 additions & 0 deletions docs/concepts/hooks.md
@@ -19,6 +19,11 @@ Hooks let you intercept and handle events during the completion and parsing proc

`completion:error` and `completion:last_attempt` handlers receive optional retry metadata as keyword arguments. Old-style handlers that only accept `error` continue to work — the metadata is silently dropped for backward compatibility.

For custom Tenacity policies, `max_attempts` is populated when the policy uses a
direct `stop_after_attempt` limit. With predicate- or delay-based stop conditions,
the runtime may not know that an error is terminal when `completion:error` is
emitted; use `completion:last_attempt` as the authoritative exhaustion signal.
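
The backward-compatibility rule above can be illustrated with a small standalone sketch. It does not import Instructor, and `emit` is a hypothetical dispatcher written for this example rather than the library's implementation. The metadata key `attempt_number` is an assumption; only `max_attempts` is named in the text above.

```python
import inspect
from typing import Any, Callable, List, Optional, Tuple


# Hypothetical dispatcher, not Instructor's implementation: it forwards only
# the keyword metadata that a handler's signature can accept.
def emit(handler: Callable[..., Any], error: Exception, **metadata: Any) -> None:
    params = inspect.signature(handler).parameters
    takes_any_kwargs = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    if takes_any_kwargs:
        handler(error, **metadata)
    else:
        accepted = {k: v for k, v in metadata.items() if k in params}
        handler(error, **accepted)


calls: List[Tuple[str, Optional[int], Optional[int]]] = []


def old_style_handler(error: Exception) -> None:
    # Accepts only the error; metadata is dropped before the call.
    calls.append(("old", None, None))


def new_style_handler(
    error: Exception,
    *,
    attempt_number: Optional[int] = None,  # assumed metadata key
    max_attempts: Optional[int] = None,  # may stay None for custom stop policies
) -> None:
    calls.append(("new", attempt_number, max_attempts))


failure = ValueError("validation failed")
emit(old_style_handler, failure, attempt_number=2, max_attempts=3)
emit(new_style_handler, failure, attempt_number=2, max_attempts=None)
```

A handler written this way should treat `max_attempts=None` as "unknown" rather than "unlimited", which matches the advice to rely on `completion:last_attempt` for exhaustion.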

## Registering and Removing Hooks

```python
17 changes: 10 additions & 7 deletions docs/concepts/migration.md
@@ -100,7 +100,7 @@ user = await client.create(...)
### Anthropic

```python
# Before (removed)
# Before (still supported)
import anthropic
from instructor import from_anthropic

@@ -115,7 +115,7 @@ user = client.create(...)
### Google/Gemini

```python
# Before (removed)
# Before (still supported)
import google.genai as genai
from instructor import from_genai

@@ -188,13 +188,16 @@ anthropic_client = instructor.from_provider("anthropic/claude-3-5-sonnet")

## Backward Compatibility

Legacy helpers have been removed:
V2 keeps the established v1 factory and patch import paths as lazy
compatibility facades. Existing applications do not need a flag-day rewrite:

- `instructor.patch()` → Use `from_provider` instead
- `instructor.apatch()` → Use `from_provider` with `async_client=True`
- `from_openai()`, `from_anthropic()`, etc. → Use `from_provider`
- `instructor.patch()` and `instructor.apatch()` continue to work.
- `from_openai()`, `from_anthropic()`, and other provider factories continue to work.
- Provider-specific legacy modes normalize to the corresponding core modes with
deprecation warnings.

Update all call sites before upgrading.
Prefer `from_provider()` and core modes in new code because their capability
contract and type surface receive the new conformance checks.

## See Also

4 changes: 2 additions & 2 deletions docs/concepts/mode-migration.md
@@ -31,7 +31,7 @@ Use this table to replace legacy modes:
| `FUNCTIONS` | `TOOLS` |
| `TOOLS_STRICT` | `TOOLS` |
| `ANTHROPIC_TOOLS` | `TOOLS` |
| `ANTHROPIC_JSON` | `MD_JSON` |
| `ANTHROPIC_JSON` | `JSON` |
| `COHERE_TOOLS` | `TOOLS` |
| `COHERE_JSON_SCHEMA` | `JSON_SCHEMA` |
| `XAI_TOOLS` | `TOOLS` |
@@ -99,7 +99,7 @@ from instructor import Mode

client = instructor.from_provider(
"bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0",
mode=Mode.BEDROCK_TOOLS,
mode=Mode.TOOLS,
)
```

198 changes: 198 additions & 0 deletions docs/concepts/provider-architecture-parity.md
@@ -0,0 +1,198 @@
# Provider Architecture and Parity Inventory

This page records the provider architecture introduced by the
`provider-owned-clients` change and the compatibility decisions verified on
June 7, 2026. The measured baseline is commit `0212d8833cba`, immediately
before the shared native-client factory migration.

The implementation, documentation, and deterministic contract tests are split
into three reviewable changes. The documentation and tests changes both target
the same core implementation commit; the test evidence and measurements below
are supplied by the companion tests change rather than duplicated here.

## Architecture decisions

- `ProviderSpec` is the provider control plane. It declares aliases, canonical
provider identity, supported and legacy modes, handler modules, builders,
capabilities, dependency errors, and the native client contract.
- `ClientSpec` is an internal declarative contract for native sync and async SDK
types, create and stream method paths, model fallback behavior, and legacy
validation precedence. It is not a public export.
- `create_instructor()` is the single implementation of native client
validation, mode normalization, sync/async selection, method resolution,
stream switching, patching, model defaults, and wrapper construction.
- The mode registry remains the single runtime lookup for request, re-ask,
response, sync stream, async stream, message, and template handlers. Legacy
provider ownership and normalization are derived from `ProviderSpec`.
- `retry_sync_v2()` and `retry_async_v2()` own validation retries, re-asks,
usage accumulation, hook emission, error propagation, and final response
attachment for every retained provider.
- Provider modules retain only SDK construction and behavior that cannot be
expressed by the common contract. OpenAI Responses, xAI, and LiteLLM remain
explicit compatibility exceptions.
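
As a rough illustration of the declarative-contract idea in the list above, the sketch below pairs a small spec object with one shared resolver. Every name here (`ClientSpecSketch`, `resolve_create`, the fake SDK classes, and the field names) is hypothetical; the real `ClientSpec` is internal and its fields are not shown in this change.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Any, Callable, Tuple


@dataclass(frozen=True)
class ClientSpecSketch:
    """Hypothetical stand-in for a declarative native-client contract."""

    sync_type: type
    async_type: type
    create_path: str  # dotted attribute path to the SDK's create method


def resolve_create(
    client: Any, spec: ClientSpecSketch
) -> Tuple[Callable[..., Any], bool]:
    """Validate the client type, pick sync or async, and resolve the method."""
    if isinstance(client, spec.async_type):
        is_async = True
    elif isinstance(client, spec.sync_type):
        is_async = False
    else:
        raise TypeError(f"unsupported client type: {type(client).__name__}")
    # Walk the dotted path, e.g. client.chat.completions.create.
    method = reduce(getattr, spec.create_path.split("."), client)
    return method, is_async


# Fake SDK objects used only to exercise the sketch.
class _Completions:
    def create(self, **kwargs: Any) -> dict:
        return {"echo": kwargs}


class _Chat:
    def __init__(self) -> None:
        self.completions = _Completions()


class FakeSyncClient:
    def __init__(self) -> None:
        self.chat = _Chat()


class FakeAsyncClient:
    def __init__(self) -> None:
        self.chat = _Chat()


SPEC = ClientSpecSketch(FakeSyncClient, FakeAsyncClient, "chat.completions.create")
create, is_async = resolve_create(FakeSyncClient(), SPEC)
```

The point of the design is that each provider contributes only data (types and method paths), while validation and dispatch live in one function.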

## Public export inventory

Every name in `instructor.__all__` remains available. Optional factories remain
conditional on their SDK being importable, matching the previous behavior.

| Export | Result |
| --- | --- |
| `Instructor` | Preserved |
| `AsyncInstructor` | Preserved |
| `Image` | Preserved |
| `Audio` | Preserved |
| `from_openai` | Preserved |
| `from_anyscale` | Preserved |
| `from_together` | Preserved |
| `from_databricks` | Preserved |
| `from_deepseek` | Preserved |
| `from_openrouter` | Preserved |
| `from_litellm` | Preserved |
| `from_vertexai` | Preserved; deprecated provider guidance is unchanged |
| `from_provider` | Preserved |
| `Provider` | Preserved |
| `ResponseSchema` | Preserved |
| `response_schema` | Preserved |
| `OpenAISchema` | Preserved |
| `CitationMixin` | Preserved |
| `IterableModel` | Preserved |
| `Maybe` | Preserved |
| `Partial` | Preserved |
| `openai_schema` | Preserved |
| `generate_openai_schema` | Preserved |
| `generate_anthropic_schema` | Preserved |
| `generate_gemini_schema` | Preserved |
| `Mode` | Preserved |
| `patch` | Preserved |
| `apatch` | Preserved |
| `FinetuneFormat` | Preserved |
| `Instructions` | Preserved |
| `BatchProcessor` | Preserved |
| `BatchRequest` | Preserved |
| `BatchJob` | Preserved |
| `llm_validator` | Preserved |
| `openai_moderation` | Preserved |
| `hooks` | Preserved |
| `v2` | Preserved |
| `from_anthropic` | Preserved as an optional export |
| `from_gemini` | Preserved as an optional export |
| `from_fireworks` | Preserved as an optional export |
| `from_cerebras` | Preserved as an optional export |
| `from_groq` | Preserved as an optional export |
| `from_mistral` | Preserved as an optional export |
| `from_cohere` | Preserved as an optional export |
| `from_bedrock` | Preserved as an optional export |
| `from_writer` | Preserved as an optional export |
| `from_xai` | Preserved as an optional export |
| `from_perplexity` | Preserved as an optional export |
| `from_genai` | Preserved as an optional export |

`instructor.__version__` also remains available. No public export was removed.

## Provider and mode inventory

The table lists canonical modes after normalization. Existing provider-specific
mode enum values remain accepted as deprecated aliases where declared by the
provider specification.

| Provider or alias | Factory | Canonical modes | Result |
| --- | --- | --- | --- |
| OpenAI | `from_openai` | `TOOLS`, `JSON`, `JSON_SCHEMA`, `MD_JSON`, `PARALLEL_TOOLS`, `RESPONSES_TOOLS` | Preserved |
| Anyscale | `from_anyscale` | `TOOLS`, `JSON`, `JSON_SCHEMA`, `MD_JSON`, `PARALLEL_TOOLS` | Preserved through OpenAI-compatible handlers |
| Together | `from_together` | `TOOLS`, `JSON`, `JSON_SCHEMA`, `MD_JSON`, `PARALLEL_TOOLS` | Preserved through OpenAI-compatible handlers |
| Databricks | `from_databricks` | `TOOLS`, `JSON`, `JSON_SCHEMA`, `MD_JSON`, `PARALLEL_TOOLS` | Preserved through OpenAI-compatible handlers |
| DeepSeek | `from_deepseek` | `TOOLS`, `JSON`, `JSON_SCHEMA`, `MD_JSON`, `PARALLEL_TOOLS` | Preserved through OpenAI-compatible handlers |
| OpenRouter | `from_openrouter` | `TOOLS`, `JSON_SCHEMA`, `MD_JSON`, `PARALLEL_TOOLS` | Preserved |
| Anthropic | `from_anthropic` | `TOOLS`, `JSON`, `JSON_SCHEMA`, `PARALLEL_TOOLS` | Preserved, including beta client routing |
| Google GenAI (`google/...`) | `from_genai` | `TOOLS`, `JSON` | Preserved |
| `generative-ai` | `from_provider` alias | Canonicalizes to Google GenAI | Preserved deprecated alias |
| Gemini | `from_gemini` | `TOOLS`, `MD_JSON` | Preserved deprecated SDK integration |
| Cohere | `from_cohere` | `TOOLS`, `JSON_SCHEMA`, `MD_JSON` | Preserved, including V1/V2 and split stream methods |
| Perplexity | `from_perplexity` | `MD_JSON` | Preserved |
| xAI | `from_xai` | `TOOLS`, `JSON_SCHEMA`, `MD_JSON`, `PARALLEL_TOOLS` | Preserved custom adapter |
| Groq | `from_groq` | `TOOLS`, `JSON_SCHEMA`, `MD_JSON` | Preserved |
| Mistral | `from_mistral` | `TOOLS`, `JSON_SCHEMA`, `MD_JSON` | Preserved, including split stream methods |
| Fireworks | `from_fireworks` | `TOOLS`, `JSON_SCHEMA`, `MD_JSON` | Preserved |
| Cerebras | `from_cerebras` | `TOOLS`, `JSON_SCHEMA`, `MD_JSON`, `PARALLEL_TOOLS` | Preserved |
| Writer | `from_writer` | `TOOLS`, `JSON_SCHEMA`, `MD_JSON` | Preserved |
| Bedrock | `from_bedrock` | `TOOLS`, `MD_JSON` | Preserved, including async wrapping of the sync SDK method |
| Vertex AI | `from_vertexai` | `TOOLS`, `MD_JSON`, `PARALLEL_TOOLS` | Preserved deprecated provider path |
| Azure OpenAI | `from_provider` alias | Canonicalizes to OpenAI | Preserved compatibility alias |
| Ollama | `from_provider` alias | Canonicalizes to OpenAI | Preserved compatibility alias |
| LiteLLM | `from_litellm` | Callable wrapper modes | Preserved custom adapter |

Unsupported provider/mode combinations continue to raise `ModeError`; they are
not advertised as capabilities by `ProviderSpec`.
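
The normalization and rejection behavior described above can be sketched without importing Instructor. The legacy-to-core pairs are taken from the mode migration table in this change and the supported sets from two rows of the provider table; `normalize_mode` and `ModeErrorSketch` are hypothetical names, and plain strings stand in for the `Mode` enum.

```python
import warnings
from typing import Dict, FrozenSet


class ModeErrorSketch(ValueError):
    """Local stand-in for the library's ModeError (assumption)."""


# Pairs copied from the mode migration table in this change.
LEGACY_TO_CORE: Dict[str, str] = {
    "FUNCTIONS": "TOOLS",
    "TOOLS_STRICT": "TOOLS",
    "ANTHROPIC_TOOLS": "TOOLS",
    "ANTHROPIC_JSON": "JSON",
    "COHERE_TOOLS": "TOOLS",
    "COHERE_JSON_SCHEMA": "JSON_SCHEMA",
    "XAI_TOOLS": "TOOLS",
}

# Two rows from the provider and mode inventory.
SUPPORTED: Dict[str, FrozenSet[str]] = {
    "anthropic": frozenset({"TOOLS", "JSON", "JSON_SCHEMA", "PARALLEL_TOOLS"}),
    "perplexity": frozenset({"MD_JSON"}),
}


def normalize_mode(provider: str, mode: str) -> str:
    """Map a legacy mode to its core mode, then check provider support."""
    core = LEGACY_TO_CORE.get(mode, mode)
    if core != mode:
        warnings.warn(
            f"{mode} is deprecated; use {core}", DeprecationWarning, stacklevel=2
        )
    if core not in SUPPORTED[provider]:
        raise ModeErrorSketch(f"{provider} does not support mode {core}")
    return core
```

For example, `normalize_mode("anthropic", "ANTHROPIC_JSON")` returns `"JSON"` with a deprecation warning, while `normalize_mode("perplexity", "TOOLS")` raises.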

## Workflow parity

| Workflow | Result | Deterministic evidence |
| --- | --- | --- |
| Sync structured extraction | Preserved | Shared provider/client, handler, patch, and retry suites |
| Async structured extraction | Preserved | Shared wrapper selection and async retry suites |
| Pydantic response validation | Preserved | Shared response parser and retry runtime suites |
| Full streaming | Preserved | Provider capability runtime contracts |
| Async streaming | Preserved | Async extractor contract for every advertised stream mode |
| `Partial[T]` streaming | Preserved | DSL and provider capability suites |
| `Iterable[T]` responses and streaming | Preserved | DSL, response-list, and provider capability suites |
| Retries and re-asks | Preserved | Central sync/async retry runtime suites |
| `completion:kwargs` and `completion:response` hooks | Preserved | Retry runtime suites |
| `completion:error` and `completion:last_attempt` hooks | Intentionally changed to match the documented contract | Raw and structured failure hook tests |
| Usage accumulation and reporting | Preserved | Shared retry usage paths and provider usage tests |
| `Maybe`, citations, schemas, multimodal DSL | Preserved | Public typing and deterministic DSL suites |
| Automatic provider detection | Preserved | `from_provider` and provider URL detection suites |
| Provider-specific factory imports | Preserved | Lazy public import and legacy compatibility suites |
| Documented Mistral async workflow | Preserved and corrected | Uses `async_client=True` explicitly |
| Documented Bedrock async workflow | Preserved and corrected | Uses `async_client=True` explicitly |

## Intentional changes and removals

| Item | Result | Evidence and migration |
| --- | --- | --- |
| Provider-local sync/async validation, patching, stream switching, and wrapper construction | Removed as obsolete duplication | Replaced by `ClientSpec` and `create_instructor()`; public factory signatures are unchanged |
| Repeated Bedrock, Cerebras, Fireworks, Groq, Mistral, and Writer client suites | Removed as meaningless duplication | Replaced by the manifest-driven shared client contract; provider-specific handler tests remain |
| Bedrock `_async` documentation | Removed as broken or obsolete | `_async` was not a supported selector; pass `async_client=True` to `from_provider()` or use the async factory overload |
| Claim that provider-specific helpers were removed | Removed as incorrect documentation | Native-client helpers remain public; prefer `from_provider()` for model-string construction |
| Vertex AI deterministic test without the optional SDK | Intentionally changed | The test now skips when `vertexai` is unavailable instead of failing during patch setup |
| Final failure hook behavior | Intentionally changed to restore documented behavior | `completion:error` and `completion:last_attempt` now receive attempt metadata; existing hook signatures remain backward-compatible |

No provider, working mode, DSL type, retry path, stream workflow, hook name, or
documented supported factory was removed.

## Measurements

Measurements compare the working tree with `0212d8833cba` using the same
fully populated `.venv`, with proxy variables unset. Generated files and
dependency locks are excluded. Production LOC counts nonblank, non-comment
implementation-body lines and excludes overloads and docstrings; test LOC is
physical file length for the replaced contract slice.

| Slice | Baseline | Current | Reduction |
| --- | ---: | ---: | ---: |
| Executable body LOC in the 11 migrated `from_*` implementations | 589 LOC | 150 LOC | 74.5% |
| Provider client contract tests replaced by `test_client_unified.py` | 869 LOC | 260 LOC | 70.1% |

The selected deterministic parity command is:

```bash
python -m pytest \
tests/v2/test_provider_specs.py \
tests/v2/test_client_unified.py \
tests/v2/test_auto_client_deterministic.py \
tests/providers/test_auto_client.py -q
```

For the selected parity command, three baseline runs from an isolated archive
had a 16.68-second median with 513 passing and 46 skipped. Three working-tree
runs had a 16.28-second median with 101 passing and 27 skipped, a 2.4% runtime
reduction. The complete deterministic `tests/v2 tests/providers` suite improved
from a 34.76-second three-run median (1,644 passing, 199 skipped) to a
32.88-second median (1,279 passing, 170 skipped), a 5.4% reduction. Optional SDK
imports and pytest startup dominate both measurements, so the 75% timing target
remains unmet even though repeated test cases and provider implementations were
removed.
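
The percentages quoted in this section follow from the usual relative-reduction formula, (baseline - current) / baseline. The short check below recomputes them from the figures above; `reduction_pct` is a helper name chosen for this example.

```python
def reduction_pct(baseline: float, current: float) -> float:
    """Relative reduction from baseline to current, in percent (1 decimal)."""
    return round((baseline - current) / baseline * 100, 1)


impl_loc = reduction_pct(589, 150)        # 74.5
test_loc = reduction_pct(869, 260)        # 70.1
parity_time = reduction_pct(16.68, 16.28)  # 2.4
suite_time = reduction_pct(34.76, 32.88)   # 5.4
```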

Browser automation verified the rendered `from_provider`, Mistral, Bedrock, and
provider parity pages on the local MkDocs site. The parity inventory rendered
all five tables without console errors.
9 changes: 4 additions & 5 deletions docs/integrations/bedrock.md
@@ -21,7 +21,7 @@ pip install "instructor[bedrock]"
- [from_provider Guide](../concepts/from_provider.md) - Detailed client configuration
- [Mode Migration Guide](../concepts/mode-migration.md) - Move to core modes
- [Provider Examples](../index.md#provider-examples) - Quick examples for all providers
- [AWS Integration Guide](../examples/index.md#aws-integration) - More AWS examples
- [Examples](../examples/index.md) - More integration examples

# AWS Bedrock

@@ -43,11 +43,10 @@ client = instructor.from_provider("bedrock/anthropic.claude-3-5-sonnet-20241022-
# - Mode selection based on model (Claude models use TOOLS)
```

## Deprecation Notice
## Async Clients

> **Deprecation Notice:**
>
> The `_async` argument to `instructor.from_bedrock` is deprecated. Please use `async_client=True` for async clients instead. Support for `_async` may be removed in a future release. All new code and examples should use `async_client`.
Use `async_client=True` with `from_provider` or `from_bedrock`. The obsolete
`_async` keyword is no longer accepted as an async selector.

### Environment Configuration

12 changes: 6 additions & 6 deletions docs/integrations/genai.md
@@ -545,13 +545,13 @@ print(response)

## Streaming Responses

!!! warning "Streaming Limitations"
!!! note "V2 streaming contract"

**As of July 11, 2025, Google GenAI does not support streaming with tool/function calling or structured outputs for regular models.**

- `Mode.TOOLS` and `Mode.JSON` do not support streaming with regular models
- To use streaming, you must use `Partial[YourModel]` explicitly or switch to other modes like `Mode.JSON`
- Alternatively, set `stream=False` to disable streaming
The v2 GenAI client advertises public partial and iterable streaming for
`Mode.TOOLS` and `Mode.JSON`. This contract covers Instructor's request and
parsing path; select a Gemini model that supports the requested operation.
The legacy `gemini` and `vertexai` provider families have separate feature
contracts and do not currently expose public iterable streaming.

Streaming allows you to process responses incrementally rather than waiting for the complete result. This is extremely useful for making UI changes feel instant and responsive.
