feat: schema-driven structured outputs across all LLM providers #94

@JTCorrin

Description

Summary

Add schema-driven structured outputs as a universal feature across all LLM providers. Users pass a Pydantic model OR a JSON Schema dict; Esperanto translates to each provider's native shape and returns the response normalized — including, for Pydantic input, an instantiated model on response.choices[0].message.parsed.

This complements (does not replace) the existing structured={"type": "json"} JSON-mode toggle.

API Surface

New per-call parameter on chat_complete / achat_complete

def chat_complete(
    self,
    messages: List[Dict[str, Any]],
    ...
    response_schema: Optional[Union[Type[BaseModel], Dict[str, Any]]] = None,
) -> Union[ChatCompletion, Generator[ChatCompletionChunk, None, None]]:
  • None (default): no schema constraint.
  • Type[BaseModel]: a Pydantic v2 model class. Esperanto calls Model.model_json_schema() internally, sends to the provider, parses the JSON response, and instantiates the model.
  • Dict[str, Any]: raw JSON Schema dict. Esperanto sends as-is; the response's parsed is the parsed JSON dict.
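As a sketch of the mechanics behind the `Type[BaseModel]` path (the `Event` model and its fields here are illustrative, not part of Esperanto):

```python
import json

from pydantic import BaseModel


class Event(BaseModel):
    name: str
    date: str


# Request side: Esperanto derives the canonical JSON Schema from the class.
schema = Event.model_json_schema()

# Response side: the provider returns a JSON string matching that schema;
# Esperanto parses it and instantiates the model for message.parsed.
raw = '{"name": "Launch", "date": "2025-06-01"}'
event = Event.model_validate(json.loads(raw))
```

With a plain dict schema, the request side sends the dict as-is and the response side stops after `json.loads`.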

Response shape

response = model.chat_complete(messages, response_schema=Event)
event = response.choices[0].message.parsed   # Event instance (Pydantic) or dict
raw = response.choices[0].message.content    # JSON string (always populated)

parsed is added to the message type. content continues to hold the raw JSON string for backward compat.

Coexistence with existing structured flag

  • structured={"type": "json"} — instance-level JSON mode (any valid JSON). Unchanged.
  • response_schema=X — per-call schema-driven (specific shape). New.
  • Precedence: per-call response_schema overrides instance-level structured for that call only.
  • Both features remain useful and supported.
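The precedence rule could reduce to a small per-call resolver (function and mode names below are illustrative, not Esperanto API):

```python
from typing import Any, Dict, Optional, Tuple


def resolve_output_mode(
    instance_structured: Optional[Dict[str, Any]],
    response_schema: Optional[Any],
) -> Tuple[str, Optional[Any]]:
    """Pick the output constraint for a single call.

    A per-call response_schema overrides the instance-level structured
    flag for that call only; neither setting mutates the other.
    """
    if response_schema is not None:
        return ("json_schema", response_schema)
    if instance_structured and instance_structured.get("type") == "json":
        return ("json_mode", None)
    return ("unconstrained", None)
```

Because the resolver is stateless, a model configured with `structured={"type": "json"}` goes back to plain JSON mode on the very next call after a schema-bearing one.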

Per-Provider Translation

Each provider's request-builder converts the canonical JSON Schema dict to the provider's native shape:

  • OpenAI / Azure / openai-compatible — response_format={"type": "json_schema", "json_schema": {"name": "<schema-name>", "strict": true, "schema": <dict>}}
  • Anthropic — output_format={"type": "json_schema", "schema": <dict>} (per Anthropic's structured outputs API)
  • Google / Vertex — generation_config={"response_schema": <dict>, "response_mime_type": "application/json"} (handle Gemini-specific schema type renaming as needed)
  • Mistral — response_format={"type": "json_schema", "json_schema": {...}}
  • Ollama — format=<dict> (newer Ollama; works with most local models)
  • OpenRouter / DeepSeek / Groq / xAI / DashScope / MiniMax — same as openai-compatible (response_format json_schema); per-profile supports_response_format honored
  • Perplexity — same as openai-compatible
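Two of these translations, sketched from the canonical schema dict (helper names are hypothetical; the exact kwargs follow each provider's SDK):

```python
from typing import Any, Dict


def to_openai_response_format(name: str, schema: Dict[str, Any]) -> Dict[str, Any]:
    # OpenAI-compatible shape: wrap the canonical schema in a named,
    # strict json_schema under response_format.
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "strict": True, "schema": schema},
    }


def to_google_generation_config(schema: Dict[str, Any]) -> Dict[str, Any]:
    # Gemini shape: the schema plus a JSON MIME type on generation_config.
    return {"response_schema": schema, "response_mime_type": "application/json"}
```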

A shared helper module src/esperanto/utils/structured_output.py provides:

  • pydantic_to_schema(model_or_dict) -> Dict[str, Any] — unified normalization
  • parse_structured_response(content, response_schema) -> parsed — JSON parse + optional Pydantic instantiation
  • Schema massaging for OpenAI strict mode (ensure additionalProperties: false, all fields required where Pydantic allows)

Provider request-builders call into this helper rather than reimplementing.
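The normalization half of the helper might reduce to a small dispatch (a sketch; the real pydantic_to_schema may also apply the strict-mode massaging noted above):

```python
from typing import Any, Dict, Type, Union

from pydantic import BaseModel


def pydantic_to_schema(
    model_or_dict: Union[Type[BaseModel], Dict[str, Any]],
) -> Dict[str, Any]:
    # Pydantic v2 classes expose their JSON Schema via model_json_schema();
    # raw dicts are assumed to already be valid JSON Schema and pass through.
    if isinstance(model_or_dict, type) and issubclass(model_or_dict, BaseModel):
        return model_or_dict.model_json_schema()
    return model_or_dict


class Point(BaseModel):  # illustrative input
    x: int
    y: int
```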

Output Normalization

  • Provider returns content as a JSON-encoded string.
  • parse_structured_response():
    • json.loads(content) → dict
    • If response_schema was a Pydantic class: model.model_validate(parsed_dict) → instance
    • If response_schema was a dict: return the parsed dict
  • Result attached to response.choices[0].message.parsed.
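These steps map directly onto a sketch of parse_structured_response (the Event model is illustrative):

```python
import json
from typing import Any, Dict, Optional, Type, Union

from pydantic import BaseModel


def parse_structured_response(
    content: str,
    response_schema: Optional[Union[Type[BaseModel], Dict[str, Any]]],
) -> Any:
    data = json.loads(content)  # provider returns a JSON-encoded string
    if isinstance(response_schema, type) and issubclass(response_schema, BaseModel):
        return response_schema.model_validate(data)  # Pydantic class -> instance
    return data  # dict schema -> parsed dict


class Event(BaseModel):
    name: str
```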

Files

New

  • src/esperanto/utils/structured_output.py — schema helpers + parser
  • tests/utils/test_structured_output.py — helper tests

Modified

  • src/esperanto/providers/llm/base.py — add response_schema to chat_complete / achat_complete signatures
  • src/esperanto/common_types/response.py (or wherever Message lives) — add parsed: Optional[Any] to message type
  • All provider implementations in src/esperanto/providers/llm/*.py — accept the new param and route through the per-provider translation
  • tests/providers/llm/test_*.py — per-provider tests for both Pydantic and dict input, both sync and async
  • docs/features/structured-output.md (NEW) or extend an existing structured-output doc

Acceptance Criteria

  • model.chat_complete(messages, response_schema=PydanticClass) returns an instance of PydanticClass on response.choices[0].message.parsed for every supported provider
  • Same call with a JSON Schema dict returns the parsed dict
  • Async equivalent works identically
  • content field continues to hold the raw JSON string (backward compat)
  • Existing structured={"type": "json"} JSON mode behavior is unchanged
  • Per-call response_schema correctly overrides instance-level structured for that call
  • Provider tests cover at least: openai, anthropic, google, mistral, ollama, openai-compatible (covers profiles)
  • Documentation includes a worked example per major provider

Caveats / Documented Limitations

  • OpenAI strict mode requires additionalProperties: false and all fields required. Some Pydantic patterns (Optional fields without defaults) need massaging. Helper handles this; document the supported subset.
  • Discriminated unions vary in support across providers. Document which work where.
  • Streaming + structured output — out of scope for v1. Document and raise a clear error if stream=True and response_schema are both set, until a follow-up issue tackles it.
  • Pydantic v1 not supported (we already require v2).
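The strict-mode massaging from the first caveat could be a recursive pass like the following (a sketch; the treatment of Optional fields and the exact supported subset still need documenting):

```python
from typing import Any, Dict


def enforce_openai_strict(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the schema with additionalProperties: false set and
    every property marked required on each object, as OpenAI strict mode
    demands. Input is not mutated."""
    out = dict(schema)
    if out.get("type") == "object":
        props = out.get("properties", {})
        out["additionalProperties"] = False
        out["required"] = list(props)
        out["properties"] = {k: enforce_openai_strict(v) for k, v in props.items()}
    elif out.get("type") == "array" and "items" in out:
        out["items"] = enforce_openai_strict(out["items"])
    return out
```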

Out of Scope (Separate Issues)

Metadata

  • Labels: ready (Issue is fully specified and ready for the development team to pick up)
  • Assignees: none
  • Milestone: none