Merged
3 changes: 3 additions & 0 deletions providers/aws-bedrock/amazon.rerank-v1:0.yaml
@@ -22,9 +22,12 @@ removeParams:
- "n"
- stop
- stream
- reasoning_effort
sources:
- https://docs.aws.amazon.com/bedrock/latest/userguide/rerank-supported.html
- https://docs.aws.amazon.com/bedrock/latest/userguide/rerank.html
- https://docs.aws.amazon.com/bedrock/latest/userguide/rerank-pricing.html
- https://docs.aws.amazon.com/bedrock/latest/userguide/rerank-use.html
status: active
supportedModes:
- rerank
2 changes: 2 additions & 0 deletions providers/aws-bedrock/apac.amazon.nova-micro-v1:0.yaml
@@ -50,6 +50,8 @@ params:
maxValue: 5000
minValue: 1
provisioning: serverless
removeParams:
- reasoning_effort
sources:
- https://docs.aws.amazon.com/nova/latest/userguide/what-is-nova.html
- https://docs.aws.amazon.com/bedrock/latest/userguide/cross-region-inference-support.html
@@ -1,5 +1,6 @@
costs:
- cache_creation_input_token_cost: 0.000004125
cache_creation_input_token_cost_per_hour: 0.0000066
cache_read_input_token_cost: 3.3e-7
input_cost_per_token: 0.0000033
input_cost_per_token_batches: 0.00000165
@@ -21,6 +22,7 @@ costs:
from: 200000
pricing_mode: marginal
- cache_creation_input_token_cost: 0.000004125
cache_creation_input_token_cost_per_hour: 0.0000066
cache_read_input_token_cost: 3.3e-7
input_cost_per_token: 0.0000033
input_cost_per_token_batches: 0.00000165
@@ -42,6 +44,7 @@ costs:
from: 200000
pricing_mode: marginal
- cache_creation_input_token_cost: 0.000004125
cache_creation_input_token_cost_per_hour: 0.0000066
cache_read_input_token_cost: 3.3e-7
input_cost_per_token: 0.0000033
input_cost_per_token_batches: 0.00000165
1 change: 0 additions & 1 deletion providers/aws-bedrock/cohere.embed-english-v3.yaml
@@ -30,7 +30,6 @@ limits:
modalities:
input:
- text
- image
output:
- embedding
mode: embedding
4 changes: 2 additions & 2 deletions providers/aws-bedrock/eu.amazon.nova-micro-v1:0.yaml
@@ -59,11 +59,11 @@ params:
maxValue: 5000
minValue: 1
provisioning: serverless
removeParams:
- reasoning_effort
sources:
- https://docs.aws.amazon.com/nova/latest/userguide/what-is-nova.html
- https://docs.aws.amazon.com/bedrock/latest/userguide/cross-region-inference-support.html
- https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference-supported-models-features.html
- https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html
- https://docs.aws.amazon.com/bedrock/latest/userguide/model-card-amazon-nova-micro.html
status: active
supportedModes:
2 changes: 1 addition & 1 deletion providers/aws-bedrock/eu.anthropic.claude-sonnet-4-6.yaml
@@ -74,7 +74,7 @@ limits:
context_window: 1000000
max_output_tokens: 64000
max_tokens: 64000
-tool_use_system_prompt_tokens: 346
+tool_use_system_prompt_tokens: 497
modalities:
input:
- text
@@ -439,6 +439,28 @@ costs:
- cost_per_token: 0.0000225
from: 200000
pricing_mode: marginal
- cache_creation_input_token_cost: 0.00000375
cache_creation_input_token_cost_per_hour: 0.000006
cache_read_input_token_cost: 3e-7
input_cost_per_token: 0.000003
input_cost_per_token_batches: 0.0000015
output_cost_per_token: 0.000015
output_cost_per_token_batches: 0.0000075
region: ap-southeast-6
tiered_pricing:
cache_read:
- cost_per_token: 6e-7
from: 200000
cache_write:
- cost_per_token: 0.0000075
from: 200000
input:
- cost_per_token: 0.000006
from: 200000
output:
- cost_per_token: 0.0000225
from: 200000
pricing_mode: marginal
- cache_creation_input_token_cost: 0.00000375
cache_creation_input_token_cost_per_hour: 0.000006
cache_read_input_token_cost: 3e-7
5 changes: 5 additions & 0 deletions providers/aws-bedrock/google.gemma-3-27b-it.yaml
@@ -20,6 +20,9 @@ costs:
- input_cost_per_token: 2.7e-7
output_cost_per_token: 4.5e-7
region: eu-south-1
- input_cost_per_token: 1.7e-7
output_cost_per_token: 4.8e-7
region: eu-central-1
- input_cost_per_token: 2.369e-7
output_cost_per_token: 3.914e-7
region: ap-southeast-2
@@ -50,6 +53,8 @@ params:
maxValue: 8192
minValue: 1
provisioning: serverless
removeParams:
- reasoning_effort
sources:
- https://docs.aws.amazon.com/bedrock/latest/userguide/model-card-google-gemma-3-27b-pt.html
- https://ai.google.dev/gemma/docs/core
1 change: 1 addition & 0 deletions providers/aws-bedrock/luma.ray-v2:0.yaml
@@ -18,6 +18,7 @@ removeParams:
- "n"
- stop
- stream
- reasoning_effort
sources:
- https://lumalabs.ai/learning-hub
- https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-luma.html
2 changes: 2 additions & 0 deletions providers/aws-bedrock/meta.llama3-70b-instruct-v1:0.yaml
@@ -32,6 +32,8 @@ modalities:
mode: chat
model: meta.llama3-70b-instruct-v1:0
provisioning: serverless
removeParams:
- reasoning_effort
sources:
- https://docs.aws.amazon.com/bedrock/latest/userguide/model-card-meta-llama-3-70b-instruct.html
- https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-meta.html
9 changes: 9 additions & 0 deletions providers/aws-bedrock/mistral.ministral-3-3b-instruct.yaml
@@ -20,6 +20,15 @@ costs:
- input_cost_per_token: 1.2e-7
output_cost_per_token: 1.2e-7
region: eu-south-1
- input_cost_per_token: 1.2e-7
output_cost_per_token: 1.2e-7
region: eu-north-1
- input_cost_per_token: 1.2e-7
output_cost_per_token: 1.2e-7
region: eu-central-1
- input_cost_per_token: 1.2e-7
output_cost_per_token: 1.2e-7
region: ap-southeast-3
- input_cost_per_token: 1.03e-7
output_cost_per_token: 1.03e-7
region: ap-southeast-2
2 changes: 2 additions & 0 deletions providers/aws-bedrock/mistral.ministral-3-8b-instruct.yaml
@@ -70,6 +70,8 @@ params:
maxValue: 1
minValue: 0
provisioning: serverless
removeParams:
- reasoning_effort
sources:
- https://docs.mistral.ai/models/ministral-3-8b-25-12
- https://docs.aws.amazon.com/bedrock/latest/userguide/model-card-mistral-ai-ministral-3-8b.html
@@ -30,6 +30,7 @@ features:
- structured_output
- system_messages
- prompt_caching
- assistant_prefill
limits:
context_window: 256000
max_output_tokens: 32000
2 changes: 2 additions & 0 deletions providers/aws-bedrock/mistral.mistral-small-2402-v1:0.yaml
@@ -4,6 +4,8 @@ costs:
region: us-east-1
features:
- function_calling
- tool_choice
- system_messages
limits:
context_window: 32000
max_input_tokens: 32000
9 changes: 9 additions & 0 deletions providers/aws-bedrock/nvidia.nemotron-nano-3-30b.yaml
@@ -23,6 +23,15 @@ costs:
- input_cost_per_token: 7e-8
output_cost_per_token: 2.8e-7
region: eu-south-1
- input_cost_per_token: 1.8e-7
output_cost_per_token: 7.8e-7
region: eu-north-1
- input_cost_per_token: 1.8e-7
output_cost_per_token: 7.8e-7
region: eu-central-1
- input_cost_per_token: 1.8e-7
output_cost_per_token: 7.8e-7
region: ap-southeast-3

Nemotron 30B wrong regional rates

Medium Severity

eu-north-1, eu-central-1, and ap-southeast-3 were added with input_cost_per_token 1.8e-7 and output_cost_per_token 7.8e-7, roughly 2.5× this model’s eu-west-1 rates (7e-8 / 2.8e-7). The sibling nvidia.nemotron-nano-12b-v2 YAML uses the same per-token values for those regions as for eu-west-1.

Reviewed by Cursor Bugbot for commit 1018200.
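
A check along these lines could catch this kind of outlier before merge. This is only a sketch, not the repository's tooling: `find_rate_outliers` and the `max_ratio` threshold are hypothetical, the cost rows are copied by hand from the hunk above rather than parsed from the YAML, and `eu-south-1` is used as the baseline because it is the row visible in this diff with the 7e-8 / 2.8e-7 rates.

```python
# Hypothetical helper: flag regions whose per-token rates deviate from a
# baseline region by more than max_ratio in either direction.
RATE_KEYS = ("input_cost_per_token", "output_cost_per_token")


def find_rate_outliers(costs, baseline_region, max_ratio=1.5):
    by_region = {row["region"]: row for row in costs}
    base = by_region[baseline_region]
    outliers = []
    for row in costs:
        for key in RATE_KEYS:
            # Skip rows (or baselines) that do not declare this rate.
            if key not in row or key not in base:
                continue
            ratio = row[key] / base[key]
            if ratio > max_ratio or ratio < 1 / max_ratio:
                outliers.append((row["region"], key, round(ratio, 2)))
    return outliers


# Rows transcribed from the nvidia.nemotron-nano-3-30b.yaml hunk.
costs = [
    {"region": "eu-south-1", "input_cost_per_token": 7e-8, "output_cost_per_token": 2.8e-7},
    {"region": "eu-north-1", "input_cost_per_token": 1.8e-7, "output_cost_per_token": 7.8e-7},
    {"region": "ap-southeast-2", "input_cost_per_token": 6.18e-8, "output_cost_per_token": 2.472e-7},
]

outliers = find_rate_outliers(costs, "eu-south-1")
for region, key, ratio in outliers:
    print(f"{region}: {key} is {ratio}x the eu-south-1 rate")
```

With the default threshold, only `eu-north-1` is flagged here; the `ap-southeast-2` row stays within range of the baseline. The threshold is arbitrary and would need tuning against legitimate regional price differences.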

- input_cost_per_token: 6.18e-8
output_cost_per_token: 2.472e-7
region: ap-southeast-2
6 changes: 4 additions & 2 deletions providers/aws-bedrock/qwen.qwen3-coder-30b-a3b-v1:0.yaml
@@ -29,8 +29,8 @@ costs:
- input_cost_per_token: 1.6e-7
output_cost_per_token: 6.2e-7
region: ap-southeast-3
-  - input_cost_per_token: 1.5e-7
-    output_cost_per_token: 6e-7
+  - input_cost_per_token: 1.545e-7
+    output_cost_per_token: 6.18e-7
region: ap-southeast-2
- input_cost_per_token: 1.8e-7
output_cost_per_token: 7.1e-7
@@ -57,6 +57,8 @@ params:
maxValue: 16000
minValue: 1
provisioning: serverless
removeParams:
- reasoning_effort
sources:
- https://docs.aws.amazon.com/bedrock/latest/userguide/model-card-qwen-qwen3-coder-30b-a3b-instruct.html
status: active
5 changes: 5 additions & 0 deletions providers/aws-bedrock/qwen.qwen3-coder-480b-a35b-v1:0.yaml
@@ -5,12 +5,15 @@ costs:
- input_cost_per_token: 4.5e-7
output_cost_per_token: 0.0000018
region: us-east-2
- region: us-east-1 # not found in official docs
- region: sa-east-1 # not found in official docs
- input_cost_per_token: 7e-7
output_cost_per_token: 0.00000279
region: eu-west-2
- input_cost_per_token: 4.5e-7
output_cost_per_token: 0.0000018
region: eu-north-1
- region: ap-southeast-4 # not found in official docs

Region costs missing token rates

Medium Severity

New costs rows for us-east-1, sa-east-1, and ap-southeast-4 list only region (with notes that pricing was not in docs) and omit input_cost_per_token and output_cost_per_token. Region-based cost lookup can treat usage in those regions as free or fail to price requests correctly.

Reviewed by Cursor Bugbot for commit 1018200.
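
One way to guard against rate-less rows is a small validation pass over each `costs` list. The sketch below is an assumption about how such a check might look, not part of this PR: `rows_missing_rates` is a hypothetical name, and the rows are transcribed from the hunk above instead of being loaded from the YAML file.

```python
# Hypothetical validator: report cost rows that declare a region but omit
# one or both per-token rates.
REQUIRED_RATE_KEYS = ("input_cost_per_token", "output_cost_per_token")


def rows_missing_rates(costs):
    missing = {}
    for row in costs:
        absent = [key for key in REQUIRED_RATE_KEYS if key not in row]
        if absent:
            missing[row.get("region", "<no region>")] = absent
    return missing


# Rows transcribed from the qwen.qwen3-coder-480b-a35b-v1:0.yaml hunk.
costs = [
    {"region": "us-east-2", "input_cost_per_token": 4.5e-7, "output_cost_per_token": 0.0000018},
    {"region": "us-east-1"},       # not found in official docs
    {"region": "sa-east-1"},       # not found in official docs
    {"region": "ap-southeast-4"},  # not found in official docs
]

missing = rows_missing_rates(costs)
for region, keys in missing.items():
    print(f"{region}: missing {', '.join(keys)}")
```

Whether such rows should fail validation or be allowed with an explicit "unpriced" marker is a design decision for the schema owners; the sketch only surfaces them.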

- input_cost_per_token: 4.7e-7
output_cost_per_token: 0.00000187
region: ap-southeast-3
@@ -41,6 +44,8 @@ params:
maxValue: 16000
minValue: 1
provisioning: serverless
removeParams:
- reasoning_effort
sources:
- https://docs.aws.amazon.com/bedrock/latest/userguide/model-card-qwen-qwen3-coder-480b-a35b-instruct.html
status: active
1 change: 0 additions & 1 deletion providers/aws-bedrock/us.amazon.nova-2-lite-v1:0.yaml
@@ -60,7 +60,6 @@ params:
provisioning: serverless
sources:
- https://docs.aws.amazon.com/nova/latest/nova2-userguide/what-is-nova-2.html
- https://docs.aws.amazon.com/nova/latest/nova2-userguide/extended-thinking.html
- https://docs.aws.amazon.com/nova/latest/nova2-userguide/using-multimodal-models.html
- https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-support.html
- https://docs.aws.amazon.com/bedrock/latest/userguide/model-card-amazon-nova-2-lite.html
3 changes: 3 additions & 0 deletions providers/aws-bedrock/us.anthropic.claude-opus-4-6-v1.yaml
@@ -49,10 +49,13 @@ costs:
region: ca-central-1
features:
- function_calling
- parallel_function_calling
- prompt_caching
- tool_choice
- structured_output
- system_messages
- cache_control
- assistant_prefill
limits:
context_window: 1000000
max_input_tokens: 1000000
2 changes: 1 addition & 1 deletion providers/aws-bedrock/us.anthropic.claude-sonnet-4-6.yaml
@@ -60,7 +60,7 @@ limits:
max_input_tokens: 1000000
max_output_tokens: 64000
max_tokens: 64000
-tool_use_system_prompt_tokens: 346
+tool_use_system_prompt_tokens: 497
modalities:
input:
- text
2 changes: 2 additions & 0 deletions providers/aws-bedrock/us.meta.llama3-1-70b-instruct-v1:0.yaml
@@ -35,6 +35,8 @@ params:
maxValue: 4096
minValue: 1
provisioning: serverless
removeParams:
- reasoning_effort
sources:
- https://docs.aws.amazon.com/bedrock/latest/userguide/cross-region-inference-support.html
- https://docs.aws.amazon.com/bedrock/latest/userguide/cross-region-inference.html
@@ -12,7 +12,6 @@ features:
- function_calling
- tool_choice
- system_messages
- json_output

Pixtral US drops json_output

Low Severity

This commit removes the json_output feature from us.mistral.pixtral-large-2502-v1:0 while eu.mistral.pixtral-large-2502-v1:0 still declares json_output. Gateways that gate JSON response behavior on features may disable JSON output for the US inference profile only.

Reviewed by Cursor Bugbot for commit 1018200.
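
Drift between regional profiles of the same model could be detected with a comparison like the one below. This is a sketch under assumptions: `feature_drift` is a hypothetical helper, the US list is the post-change list from the hunk above, and the EU list is assumed to be the same list plus `json_output` (the EU file is not shown in this diff).

```python
# Hypothetical check: for profiles of the same model, report the features
# each profile lacks relative to the union across all profiles.
def feature_drift(profiles):
    all_features = set().union(*profiles.values())
    drift = {}
    for name, features in profiles.items():
        lacking = sorted(all_features - set(features))
        if lacking:
            drift[name] = lacking
    return drift


profiles = {
    # Post-change list from the hunk above.
    "us.mistral.pixtral-large-2502-v1:0": [
        "function_calling", "tool_choice", "system_messages",
    ],
    # Assumed contents; only the json_output entry is confirmed by the comment.
    "eu.mistral.pixtral-large-2502-v1:0": [
        "function_calling", "tool_choice", "system_messages", "json_output",
    ],
}

drift = feature_drift(profiles)
for name, lacking in drift.items():
    print(f"{name} lacks: {', '.join(lacking)}")
```

A difference is not necessarily a bug, since regional profiles can legitimately differ, so a result like this is a prompt for review rather than a hard failure.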

limits:
context_window: 128000
max_input_tokens: 128000
@@ -32,6 +31,8 @@ params:
maxValue: 16384
minValue: 1
provisioning: serverless
removeParams:
- reasoning_effort
sources:
- https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-mistral-pixtral-large.html
- https://docs.aws.amazon.com/bedrock/latest/userguide/model-card-mistral-ai-pixtral-large.html