By default, critique/validation is DISABLED to save costs. The estimates below show costs with critique enabled vs disabled.
Default Configuration (Critique Disabled):
- OpenAI (gpt-4o-mini): Scenario generation
- Anthropic (claude-3-5-haiku): Response generation (civilian + first responder)
- Gemini: Not used (critique disabled)
With Critique Enabled:
- OpenAI (gpt-4o-mini): Scenario generation
- Anthropic (claude-3-5-haiku): Response generation (civilian + first responder)
- Gemini (gemini-2.0-flash-exp): Critique/validation (one per response)
For each sample generated:
- 1 scenario call (OpenAI)
- 2 response calls (Anthropic) - one for civilian, one for first responder (parallel)
- 2 critique calls (Gemini) - SKIPPED BY DEFAULT
Total for 2000 samples (default):
- OpenAI: 2,000 calls
- Anthropic: 4,000 calls
- Gemini: 0 calls
Total for 2000 samples (with critique):
- OpenAI: 2,000 calls
- Anthropic: 4,000 calls
- Gemini: 4,000 calls
Per call:
- Input: ~150 tokens (category + prompt template)
- Output: ~75 tokens (2-4 sentence scenario)
For 2000 samples:
- Input: 2,000 × 150 = 300,000 tokens
- Output: 2,000 × 75 = 150,000 tokens
Pricing (as of Jan 2025):
- Input: $0.150 per million tokens
- Output: $0.600 per million tokens
Cost:
- Input: 0.3M × $0.150 = $0.045
- Output: 0.15M × $0.600 = $0.090
- Total OpenAI: ~$0.14
Per call:
- Input: ~200 tokens (role + scenario + prompt template)
- Output: ~650 tokens (structured JSON with facts, uncertainties, analysis, guidance, confidence, quality_score)
For 4000 calls (2000 samples × 2 roles):
- Input: 4,000 × 200 = 800,000 tokens
- Output: 4,000 × 650 = 2,600,000 tokens
Pricing (as of Jan 2025):
- Input: $0.800 per million tokens
- Output: $4.00 per million tokens
Cost:
- Input: 0.8M × $0.800 = $0.64
- Output: 2.6M × $4.00 = $10.40
- Total Anthropic: ~$11.04
Per call (if enabled):
- Input: ~300 tokens (instructions + JSON payload from response)
- Output: ~650 tokens (corrected/validated JSON, similar size to response)
For 4000 calls (one per response):
- Input: 4,000 × 300 = 1,200,000 tokens
- Output: 4,000 × 650 = 2,600,000 tokens
Pricing (as of Jan 2025):
- Input: $75 per million tokens ($0.075 per 1K tokens)
- Output: $300 per million tokens ($0.30 per 1K tokens)
Cost (if enabled):
- Input: 1.2M × $75 = $90.00
- Output: 2.6M × $300 = $780.00
- Total Gemini: ~$870.00
| Provider | Task | Cost |
|---|---|---|
| OpenAI | Scenario Generation (2K calls) | ~$0.14 |
| Anthropic | Response Generation (4K calls) | ~$11.04 |
| Gemini | Not used | $0.00 |
| TOTAL | 2000 samples | ~$11.18 |
| Provider | Task | Cost |
|---|---|---|
| OpenAI | Scenario Generation (2K calls) | ~$0.14 |
| Anthropic | Response Generation (4K calls) | ~$11.04 |
| Gemini | Critique/Validation (4K calls) | ~$870.00 |
| TOTAL | 2000 samples | ~$881.18 |
- Anthropic: 98.7% of total cost
- OpenAI: 1.3% of total cost
- Gemini: 0% (not used)
- Gemini: 98.7% of total cost
- Anthropic: 1.3% of total cost
- OpenAI: <0.1% of total cost
Current default: Critique is disabled by default, saving ~$870 (98.7% reduction).
Why it works:
- Your response prompt already has STRICT RULES for JSON output
- You have a
SampleValidatorthat validates JSON after generation - If validation fails, the pipeline retries the entire sample (up to 3 times)
- Modern LLMs (especially Claude Haiku) are very good at producing valid JSON
Cost: ~$11.18 for 2000 samples
If you want to keep critique but reduce costs, consider using a cheaper model:
- OpenAI gpt-4o-mini: ~$1.74 for critique (vs $870 for Gemini)
- Anthropic claude-3-5-haiku: ~$11.04 for critique (vs $870 for Gemini)
Potential savings: ~$858.96 (97% reduction)
Only critique if initial validation fails:
- First try: Generate response → Validate
- If invalid: Run critique → Validate again
- If still invalid: Retry entire sample
Potential savings: ~$650-800 (only critique ~15-30% of responses that fail validation)
- OpenAI cached inputs: $0.075 per million (50% savings)
- Anthropic cached inputs: $0.080 per million (90% savings)
-
Token estimates are approximate - Actual usage may vary based on:
- Scenario complexity
- Response length
- Prompt variations
-
Pricing may change - Verify current rates at:
-
Free tiers - Check if you have any free tier credits:
- OpenAI: Often provides $5-18 free credits for new accounts
- Anthropic: May offer free tier credits
- Google: May offer free tier for Gemini API
-
Volume discounts - Some providers offer discounts for high-volume usage
If you swap the providers (with critique enabled):
- OpenAI (gpt-4o-mini): Scenario generation + Critique/validation
- Gemini (gemini-2.0-flash-exp): Response generation
- Anthropic: Not used
- Cost: ~$0.14 (same as current)
Per call:
- Input: ~300 tokens (instructions + JSON payload)
- Output: ~650 tokens (corrected JSON)
For 4000 calls:
- Input: 4,000 × 300 = 1,200,000 tokens
- Output: 4,000 × 650 = 2,600,000 tokens
Cost:
- Input: 1.2M × $0.150 = $0.18
- Output: 2.6M × $0.600 = $1.56
- Total OpenAI Critique: ~$1.74
Per call:
- Input: ~200 tokens (role + scenario + prompt template)
- Output: ~650 tokens (structured JSON)
For 4000 calls:
- Input: 4,000 × 200 = 800,000 tokens
- Output: 4,000 × 650 = 2,600,000 tokens
Cost:
- Input: 0.8M × $75 = $60.00
- Output: 2.6M × $300 = $780.00
- Total Gemini Response: ~$840.00
| Provider | Task | Cost |
|---|---|---|
| OpenAI | Scenario (2K) + Critique (4K) | ~$1.88 |
| Gemini | Response Generation (4K) | ~$840.00 |
| Anthropic | Not used | $0.00 |
| TOTAL | 2000 samples | ~$841.88 |
| Configuration | Total Cost | Savings |
|---|---|---|
| Default (Anthropic response, critique disabled) | ~$11.18 | - |
| With Critique (Anthropic response, Gemini critique) | ~$881.18 | -$870.00 |
| Swapped (Gemini response, OpenAI critique) | ~$841.88 | -$830.70 |
| Configuration | Total Cost | Savings vs Default |
|---|---|---|
| Default (Anthropic response, critique disabled) | ~$11.18 | - ⭐ |
| With Critique (Anthropic response, Gemini critique) | ~$881.18 | -$870.00 |
| Swapped (Gemini response, OpenAI critique) | ~$841.88 | -$830.70 |
| Swapped + No Critique (Gemini response, skip critique) | ~$840.14 | -$828.96 |
- OpenAI: ~$0.07
- Anthropic: ~$5.52
- Total: ~$5.59 ⭐
- OpenAI: ~$0.07
- Anthropic: ~$5.52
- Gemini: ~$435.00
- Total: ~$440.59
- OpenAI: ~$0.94 (scenario + critique)
- Gemini: ~$420.00 (response)
- Total: ~$420.94