Commit d6df3aa
chore: update pricing.yaml with comprehensive verified model pricing
Expanded from 70+ to 180 models across 14 providers, organized in
four sections:
SECTION 1 — Direct API Providers:
- OpenAI: GPT-5.x, GPT-4.1, GPT-4o, o-series, embeddings, legacy
- Anthropic: Opus/Sonnet/Haiku 4.5, plus legacy Claude models
- Google Gemini: 3/2.5/2.0/1.5 series, embeddings
- Mistral: Large/Medium/Small, Codestral, Devstral, Pixtral, etc.
- Cohere: Command R+/A/R/R7B, Embed 4
- DeepSeek: V3.2 chat/reasoner (official API)
- Perplexity: Sonar, Sonar Pro, Sonar Reasoning Pro, Deep Research
SECTION 2 — Hyperscaler Providers:
- Azure OpenAI: GPT-5/4.1/4o, o-series (Global Standard pricing)
- AWS Bedrock: Claude (Sonnet/Haiku/Opus), Llama 4/3.x, Amazon
  Nova (Premier/Pro/Lite/Micro), Titan, Mistral, DeepSeek, Qwen,
  Writer, Kimi, MiniMax, NVIDIA, GPT-OSS, Gemma
- Google Vertex AI: Gemini 3/2.5/2.0 (separate from AI Studio)
- IBM watsonx: Noted but omitted (per-RU pricing not publicly
  listed at per-model granularity)
- Oracle OCI: Noted but omitted (per-character pricing model,
  exact rates require console login)
SECTION 3 — Hosted Inference Providers:
- Together AI: Llama 4/3.x, DeepSeek R1/V3, Qwen, Mistral,
  GPT-OSS, Kimi K2, GLM
- Fireworks AI: DeepSeek V3/R1, Qwen, Kimi, GLM, GPT-OSS
- Groq: Llama 4/3.x, Qwen, GPT-OSS, Kimi K2
SECTION 4 — Convenience aliases for unprefixed Llama/DeepSeek.
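
A minimal sketch of how a consumer of pricing.yaml might resolve an unprefixed alias and estimate a request cost. Everything here is an assumption for illustration: the dictionaries stand in for the parsed file, and the model keys, field names (`input_per_mtok`, `output_per_mtok`), and prices are hypothetical placeholders, not values from this commit.

```python
from decimal import Decimal

# Hypothetical stand-in for the parsed pricing.yaml. Keys, field names,
# and prices are placeholders, NOT values from the actual file.
PRICING = {
    "together/llama-example": {
        "input_per_mtok": Decimal("1.00"),   # USD per 1M input tokens (placeholder)
        "output_per_mtok": Decimal("2.00"),  # USD per 1M output tokens (placeholder)
    },
}

# Hypothetical alias table: unprefixed name -> provider-prefixed key.
ALIASES = {"llama-example": "together/llama-example"}

TOKENS_PER_MILLION = Decimal(1_000_000)


def resolve(model: str) -> str:
    """Map an unprefixed alias to its canonical key; pass other names through."""
    return ALIASES.get(model, model)


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request from per-million-token prices."""
    entry = PRICING[resolve(model)]  # raises KeyError for unknown models
    total = (
        entry["input_per_mtok"] * input_tokens
        + entry["output_per_mtok"] * output_tokens
    )
    return total / TOKENS_PER_MILLION
```

Decimal is used instead of float so that small per-token prices do not accumulate rounding error when summed across many requests.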
All prices verified from official sources on 2026-02-01.
Cache/batch pricing noted as comments throughout.
1 parent b9bbcf6 commit d6df3aa
1 file changed
Lines changed: 597 additions & 33 deletions