Feature Request: Expand tool_call to an object for granular capabilities
The Problem
The current models.dev schema defines tool_call as a simple boolean. This is great for simple cases but doesn't capture provider-specific limitations or track important capability details, forcing consumer libraries to maintain their own local override lists.
Concrete Example: Meta Llama 3.3 70B on Amazon Bedrock
The model meta.llama3-3-70b-instruct-v1:0, when served via Amazon Bedrock, has tool_call = true in models.dev but exhibits documented provider-level failures:
1. Streaming + Tools Not Supported
Issue: Bedrock API throws HTTP 400 error: "This model doesn't support tool use in streaming mode"
Evidence:
- AWS re:Post: Using Bedrock with Mistral 2 Large, converse API with tools would not let me use streaming feature
- langchain-aws Issue #140: Tool Calling Issue in AWS Bedrock Integration
- langchain-aws Issue #354: Streaming validator incorrectly disables streaming for Meta models
Affected Models: Not just Llama 3.3 - this affects Llama 3.1, Mistral Large, and other non-Anthropic models on Bedrock.
2. Type Coercion Issues
Issue: Bedrock coerces integers to strings in tool parameters, and drops integers ≥ 2^31 entirely (returns empty toolUse.input = {}).
Evidence:
AWS Response: "This behavior is actually a model-specific characteristic rather than a validator issue."
Impact: Breaks common cases like epoch-millisecond timestamps in tool parameters.
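For illustration, a hedged sketch of the coercion described above; the parameter name and values are invented for this example:

// What a client sends as tool input (an epoch-millisecond timestamp, well above 2^31):
const requestedInput = { timestamp_ms: 1735689600000 };

// Per the behavior described above, the call may come back with the integer
// coerced to a string...
const coercedInput = { timestamp_ms: "1735689600000" };

// ...or with the large integer dropped entirely (toolUse.input = {}).
const droppedInput = {};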
3. Response Format Issues
Issue: Returns tool calls as JSON text strings instead of structured tool_calls objects.
Evidence:
- pydantic-ai Issue #1649: Llama 3.3 Tool Calling on Bedrock
- StackOverflow: Inconsistent tool calling behavior with Llama 3.1 70B model on AWS Bedrock
Quote from pydantic-ai issue:
"The model outputs the tool call as a JSON string within regular text content rather than using the proper
tool_callsmessage structure"
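As a rough illustration of the workaround this forces, a hedged TypeScript sketch of a fallback parser that tries to recover a tool call from text content; the expected shape ({ name, arguments }) is an assumption for the example, not a documented format:

// Hypothetical fallback: attempt to recover a tool call that arrived as plain text.
function tryParseTextToolCall(text: string): { name: string; arguments: unknown } | null {
  try {
    const parsed = JSON.parse(text);
    if (parsed && typeof parsed.name === "string" && "arguments" in parsed) {
      return { name: parsed.name, arguments: parsed.arguments };
    }
  } catch {
    // Not JSON at all: treat as ordinary text content.
  }
  return null;
}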
Additional Use Case: Parallel Tool Calling Tracking
Issue #202 requests tracking which models support parallel tool calling. Currently there's no way to document this capability in models.dev, making it difficult for developers to choose models based on throughput requirements.
Impact on Consumers
A consumer library like req_llm that relies on models.dev data will:
- Incorrectly believe streaming tools are supported → tests fail with HTTP 400
- Expect integer schema validation to work → tests fail with type mismatches
- Expect structured tool responses → parsing fails
- Cannot determine parallel tool support → requires manual testing
Current workaround: Maintain provider-specific override lists in application code, defeating the purpose of a centralized model metadata API.
See: req_llm Issue #163: Capability Mismatch Problem
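To make that workaround concrete, a hedged sketch of the kind of provider-specific override table a consumer has to hard-code today (the structure and the entry shown are illustrative, not req_llm's actual code):

// Hypothetical hard-coded overrides a consumer maintains today because
// models.dev's boolean tool_call cannot express these quirks.
const toolCallOverrides: Record<string, { streaming?: boolean; coerces_types?: boolean }> = {
  "amazon-bedrock/meta.llama3-3-70b-instruct-v1:0": { streaming: false, coerces_types: true },
  // ...one entry per affected provider/model pair
};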
The Proposed Solution
Change the zod schema for tool_call to be a union of a boolean OR an object with granular flags.
This solution is 100% backward-compatible. All existing TOML files with tool_call = true or tool_call = false will remain valid.
Schema Change
// Before
tool_call: z.boolean();

// After
tool_call: z.union([
  z.boolean(),
  z
    .object({
      supported: z.boolean(),
      streaming: z.boolean().optional(),
      parallel: z.boolean().optional(),
      coerces_types: z.boolean().optional(),
    })
    .strict(),
]);

Example Usage
# providers/amazon-bedrock/models/meta.llama3-3-70b-instruct-v1:0.toml
[tool_call]
supported = true
streaming = false # Documents the streaming limitation
coerces_types = true # Documents the type coercion behavior

Note: parallel is not specified in this example because we have no evidence either way for this model. When undefined, it defaults based on the semantics below.
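Separately, as a quick backward-compatibility check, a sketch in which ToolCall is a hypothetical name wrapping the proposed union; both the existing boolean form and the new object form validate:

import { z } from "zod";

// Hypothetical name wrapping the proposed union for this example.
const ToolCall = z.union([
  z.boolean(),
  z
    .object({
      supported: z.boolean(),
      streaming: z.boolean().optional(),
      parallel: z.boolean().optional(),
      coerces_types: z.boolean().optional(),
    })
    .strict(),
]);

ToolCall.parse(true);                                  // existing boolean form still validates
ToolCall.parse({ supported: true, streaming: false }); // new granular form validates too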
Default Semantics
For API consumers (like req_llm):
- If tool_call = false: all tool sub-capabilities are false
- If tool_call = true (boolean): all sub-capabilities work correctly (maintains current assumptions)
  - streaming = true (default - most models support this)
  - parallel = false (default - not universally supported, safer to assume no)
  - coerces_types = false (default - most models don't have this bug)
- If tool_call = {...} (object):
  - Use the supported flag as the base truth
  - If streaming undefined → default to true (matches the current tool_call = true assumption)
  - If parallel undefined → default to false (safer, not universal)
  - If coerces_types undefined → default to false (most models work correctly)
Rationale for defaults:
- streaming = true: Current tool_call = true already implies streaming works. This is the norm.
- parallel = false: Parallel tool calling is NOT universal. Safer to assume not supported unless proven.
- coerces_types = false: Most models handle types correctly. This documents the exception.
This makes the granular flags "opt-out" for exceptions. You only specify fields when they differ from the defaults.
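A minimal TypeScript sketch of how a consumer could apply these defaults; the names normalizeToolCall and ToolCallCapabilities are illustrative, not part of models.dev:

// Illustrative names; not part of models.dev.
type ToolCallCapabilities = {
  supported: boolean;
  streaming: boolean;
  parallel: boolean;
  coerces_types: boolean;
};

// Apply the default semantics described above to either form of tool_call.
function normalizeToolCall(
  toolCall:
    | boolean
    | { supported: boolean; streaming?: boolean; parallel?: boolean; coerces_types?: boolean }
): ToolCallCapabilities {
  if (typeof toolCall === "boolean") {
    // Boolean form: true keeps today's assumptions, false disables everything.
    return {
      supported: toolCall,
      streaming: toolCall,   // tool_call = true already implies streaming works
      parallel: false,       // not universal, safer to assume unsupported
      coerces_types: false,  // most models handle types correctly
    };
  }
  // Object form: undefined sub-flags fall back to the same defaults.
  return {
    supported: toolCall.supported,
    streaming: toolCall.streaming ?? true,
    parallel: toolCall.parallel ?? false,
    coerces_types: toolCall.coerces_types ?? false,
  };
}

With this in place, the Bedrock Llama 3.3 entry above would normalize to streaming = false and coerces_types = true, while existing tool_call = true entries keep today's behavior.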
Benefits
- Eliminates local overrides - Consumer libraries can trust models.dev data
- Documents real-world behavior - Provider quirks are explicit, not hidden
- Enables parallel tool tracking - Addresses #202 (Request: New column for parallel tool calling)
- 100% backward compatible - Existing TOML files need no changes, defaults match current assumptions
- Opt-in complexity - Simple cases stay simple
- Precedent exists - models.dev has evolved schemas before (modalities restructure, cache field renames)
Alternative Considered
Keep tool_call as boolean and add separate top-level fields like tool_streaming, tool_parallel, tool_coerces_types.
Rejected because: Pollutes top-level schema with tool-specific details; the union approach is cleaner and groups related metadata.