Curated list of free LLM APIs, coding copilots, AI IDEs, agents, and infrastructure tools for building real AI applications.
- β Free GPT-5 / Claude / Gemini API access
- π€ Coding copilots and AI-native IDEs (Cursor, Trae, Windsurf)
- π° Cheapest AI APIs ($0.10-0.50 per 1M tokens)
- π RAG stack tools (vector DBs, embeddings, frameworks)
- π― Agent frameworks and automation tools
- π Local models for privacy (Ollama, Llama, Qwen)
- ποΈ Production-ready stack configurations
Goal: Help developers build AI apps without paying $200/month.
Note
Please don't abuse these services, else we might lose them for everyone. The numebr becomes 550+ when you add all the models and sub services of all the tools provided. When raising issues or pull requests please dont add your own paid,expensive personal projets.
Warning
April 2026 Model Tier Changes: Major providers (OpenAI, Anthropic, Google) have restricted flagship models (GPT-5.4, Claude Opus 4.6, Gemini 3.1 Pro) to paid tiers. Free tiers now get lighter models (GPT-4o, Claude Sonnet/Haiku, Gemini Flash). Entries marked with [verify] need confirmation.
Most AI tool lists are:
- β Outdated (prices/limits from 2023)
- β Filled with affiliate links and sponsored placements
- β General-purpose directories with no developer focus
- β Missing production-critical details (rate limits, commercial use, architecture patterns)
This repo focuses only on:
- β Tools developers actually use in production
- β Generous free tiers (no "5 requests then paywall")
- β Production-capable models (SWE-bench verified, not toys)
- β Real infrastructure (APIs, hosting, vector DBs, not just chatbots)
- β Minimal fluff, maximum utility
Unlike: awesome-ai (general list), ai-collection (marketing focus), toolify (affiliate-heavy)
This is for: Builders who want to ship AI features this week.
If this repo helped you build something or saved you money:
β Star this repo β it helps more builders discover free AI resources.
[π Share with your team] β spread the knowledge.
π Contribute β found a new free tier? Updated pricing? PRs welcome!
2026-05-18
- β¨ added github PR review tools
2026-04-12
- β¨ added a website for easy navigation
2026-04-11
- β¨ Initial release
- Quick Comparison
- Free LLM API Providers
- AI-Powered IDEs
- CLI Coding Tools
- API Providers for AI Coding Tools
- Paid Tiers Comparison
- Local Models
- free-coding-models CLI
- Additional 2026 AI Tools
- ποΈ Recommended Stacks
- β‘ Realtime & Streaming APIs
- ποΈ Speech Models
- π¨ Image Generation Models
- π¬ Video Generation APIs
- π AI Browser Automation
- πΎ Cheap Vector DB Hosting
- ποΈ Common AI Architecture Patterns
- π΅ Model Price Comparison
- π― Best Models by Use Case
- β±οΈ Rate Limit Comparison
- β Commercial Use Summary
- π§© RAG Stack Tools
- π’ Best Free Embedding APIs
- π₯οΈ AI Hosting & GPU Providers
- π AI Evaluation Tools
- π Structured Output Tools
- π·οΈ Legend
- Contributing
- License
| Provider | Models | Free Tier | Credit Card |
|---|---|---|---|
| NVIDIA NIM | 46 | 40 req/min | No |
| OpenRouter | 25 | 50/day (1K/day with $10) | No |
| Groq | 20+ | 1K-14.4K req/day | No |
| Google AI Studio | 9 | 5-500 req/day | No |
| Cloudflare Workers AI | 47+ | 10K neurons/day | No |
| Cerebras | 4 | 1M tokens/day | No |
| Cohere | 14 | 1K req/month | No |
| Mistral La Plateforme | 10+ | 1B tokens/month | No |
| GitHub Models | 30+ | 50 chat + 2K completions/month | No |
| SambaNova | 13 | $5 for 3 months | No |
| Hyperbolic | 13 | $1 trial | No |
| IDE | Pro-grade Models | Free Tier Limit | Credit Card |
|---|---|---|---|
| Cursor | GPT-5.1-Codex-Max | Limited free tier | No |
| Trae | DeepSeek V4, GPT-4.1 (Claude removed Nov 2025) | 10 fast + 50 slow/month | No |
| Windsurf | OpenAI, Anthropic, Google, xAI | 25 credits/month | Required |
| Qoder | Qwen3.6-Plus, Qwen3-Coder-480B, Claude, GPT, Gemini | Unlimited completions + limited chat | No |
| Tool | Starting Price | Free Tier | Features | Credit Card |
|---|---|---|---|---|
| PrixAI | Free / $10 paid plan | Free trial available | Unlimited reviews Auto-fix PRs, issue planning | No |
| Bito | Free / $25 paid plans | Free trial available | AI PR reviews/Unlimited reviews | No |
| Sourcery | ~$12/month | Free trial available | Code quality reviews | No |
| Tool | Pro-grade Models | Free Tier Limit | Credit Card |
|---|---|---|---|
| Gemini CLI | Gemini 3.1 Flash [verify: Pro paid] | 100-250 req/day | No |
| Rovo Dev CLI | Claude Sonnet 4 [verify], GPT-5 preview [verify] | 5M tokens/day | No |
| Warp | GPT-4.1, Claude Opus 4.1 [verify] | 150 credits/month | No |
| GitHub Copilot | GPT-4.1, Claude Opus | 50 chat + 2K completions/month | No |
| Jules | Gemini 2.5 Pro | 15 tasks/day | No |
| AWS Kiro | Claude Sonnet 4 [verify] | 50 credits/month | No |
| OpenCode | 300+ models via OpenRouter | Zen Free tier | No |
| ForgeCode | 300+ models via OpenRouter | 10K tokens/day | No |
| Amazon Q Developer | Claude Sonnet 4 [verify] | 50 agentic req/month | Required |
| RooCode | Bring your own keys | Unlimited (BYOK) | No |
| Goose | Bring your own keys | Unlimited (BYOK) | No |
| OhMyPi | Bring your own keys | Unlimited (BYOK) | No |
Models achieving β₯60% on SWE-bench Verified:
| Model | SWE-bench | Provider |
|---|---|---|
| Claude Opus 4.6 | 84.2% | Anthropic |
| GPT-5.4 | 80.1% | OpenAI |
| Claude Sonnet 4.6 | 79.3% | Anthropic |
| Gemini 3.1 Pro | 77.4% | |
| Claude Opus 4.5 | 82.1% | Anthropic |
| GPT-5.1-Codex-Max | 78.3% | OpenAI |
| Qwen3.6-Plus | 71.2% | Alibaba |
| Claude Sonnet 4.5 | 77.8% | Anthropic |
Note:
[verify]indicates scores need verification from official sources. Always check current benchmarks before making decisions.
Ready-made combinations for different use cases. Copy-paste these configurations.
| Layer | Tool | Why |
|---|---|---|
| IDE | Cursor Hobby / Qoder | GPT-5.4 limited credits |
| CLI | Gemini CLI (3.1 Pro) / Rovo | 100-250 req/day, 5M tokens/day |
| API | OpenRouter + Groq | 50 req/day + 14.4K req/day combo |
| Local | Ollama + Qwen3.6-Plus | Unlimited offline |
| Automation | n8n Self-hosted | Unlimited workflows |
| Vector DB | ChromaDB / LanceDB | Free local storage |
Total Cost: $0/month
| Layer | Tool | Speed |
|---|---|---|
| Inference | Groq / Cerebras | 2,000 tokens/sec (Cerebras) |
| Coding | Qwen3.6-Plus via Groq | 1,000 req/day (71.2% SWE) |
| Agent | OpenCode Zen | Big Pickle (72.0%), MiniMax M2.5 (80.2%) |
| Cache | DeepSeek V4 | $0.30/$0.50 per 1M, 90% cache discount |
| Edge | Cloudflare Workers AI | Global CDN |
Best for: Real-time apps, trading bots, live coding assistants
| Layer | Tool | Cost |
|---|---|---|
| IDE | Trae Pro | $10/mo (600 fast, DeepSeek V4/GPT-5.4) |
| API | OpenRouter $10 | 1K req/day + BYOK 1M/month free |
| CLI | Gemini CLI | v0.37.1 (Gemini 3.1 Pro/Flash) |
| Local | Ollama | Free |
| Embeddings | Jina AI | Free tier |
Total Cost: ~$10/month for pro-grade everything
| Layer | Tool | Privacy |
|---|---|---|
| Models | Ollama + Llama 3.3 / Qwen3-Coder | Runs locally |
| IDE | Continue.dev + VS Code | BYO local models |
| CLI | Aider + local Ollama | Git-integrated, offline |
| Chat UI | Open WebUI | Self-hosted ChatGPT alternative |
| Vector DB | ChromaDB / LanceDB | Local embeddings storage |
| Speech | Whisper (local) | Offline transcription |
Best for: Healthcare, legal, finance - any sensitive data
| Component | Tool | Role |
|---|---|---|
| Orchestrator | n8n / Gumloop | Workflow automation |
| Reasoning | DeepSeek R1 / DeepSeek V4 | Complex decision making |
| Execution | Qwen3.6-Plus | Code generation |
| Memory | ChromaDB / Supabase Vector | Long-term context |
| Embeddings | Jina Embeddings v3 (1M tokens/day free) | Semantic search |
| Monitoring | LangSmith | Trace agent steps |
Best for: Autonomous research assistants, code review bots, data processing pipelines
| Component | Tool | Purpose |
|---|---|---|
| Framework | LlamaIndex / LangChain | RAG orchestration |
| Vector DB | ChromaDB / Weaviate / Supabase | Document storage |
| Embeddings | E5-Mistral-7B (best accuracy) | Text vectorization |
| Chunking | LlamaIndex | Smart document splitting |
| Reranking | Cohere Rerank | Improve retrieval accuracy |
| LLM | Claude Sonnet 4.6 (79.3%) / GPT-5.4 | Answer generation |
| Eval | RAGAS | Measure RAG performance |
Best for: ExamAi, legal document analysis, knowledge bases
Limits: 20 RPM, 29 free models (262K context max, March 2026), models share quota
- Llama 3.3 70B β
- NEW: Nemotron 3 Super (262K context)
- NEW: MiniMax M2.5
- NEW: Devstral 2 (Apache 2.0)
- NEW: Gemma 3n family (mobile-optimized)
- qwen/qwen3.6-plus:free β
- Hermes 3 Llama 3.1 405B
- Llama 3.2 3B Instruct
- Mistral Small 3.1 24B
- Full list
Unified API gateway for 100+ LLMs. OpenAI and Anthropic SDK-compatible. China-friendly with Hong Kong direct access (100-300ms latency). No monthly fees, pay per token.
Limits: Not published | 1 free model
- GLM-4.7-Flash (200K context, 128K output, $0/M input, $0/M output)
Data is used for training when used outside UK/CH/EEA/EU.
Rate limits: Tier 1 (default): 250 RPD | Tier 2: Requires $250 spend + 30 days
| Model | Free Tier Limits |
|---|---|
| Gemini 3.1 Pro [verify: now paid] | 250 RPD (Tier 1) |
| Gemini 3 Flash | 1,500 RPD |
| All others | Check console |
Note: Data training outside UK/CH/EEA/EU still applies.
Phone number verification required. Models tend to be context window limited.
Limits: 1K credits signup, up to 5K total, 40 RPM (phone verify required)
- 46+ models including Llama 3.3 70B, Llama 4 Scout, Mistral Large, Qwen3 235B
Free tier requires opting into data training; phone verification required
Limits (per-model): 1 req/s, 500K tokens/min, 1B tokens/month
- Open and Proprietary Mistral models (Mistral Large 3, Small 3.1, etc.)
Limits: 30 RPM, 2K RPD confirmed free
- Codestral (monthly subscription-based, currently free)
Serverless Inference limited to models <10GB (some popular models >10GB supported).
Limits: ~$0.10/month in credits
- Various open models across supported providers
Routes to various supported providers.
Limits: $5/month
AI gateway with curated models. Free models may use data for improvement.
- Big Pickle Stealth (S+, 72.0% SWE-bench)
- MiniMax M2.5 Free (S+, 80.2% SWE-bench)
- MiMo V2 Pro/Omni/Flash Free
- Nemotron 3 Super Free
- GPT 5 Nano
- Trinity Large Preview Free
| Model | Limits |
|---|---|
| GPT-OSS 120B | 30 req/min, 60K tokens/min, 900 req/hour, 1M tokens/day |
| Llama 3.1 8B | Same limits as above |
| Qwen3-235B | Available via API |
| Model | Limits |
|---|---|
| Llama 3.1 8B | 14,400 req/day, 6K tokens/min |
| Llama 3.3 70B | 1,000 req/day, 12K tokens/min |
| Llama 4 Maverick/Scout | 1,000 req/day |
| Whisper Large v3/v3 Turbo | 7,200 audio-sec/min, 2,000 req/day |
| Qwen3-32B | 1,000 req/day, 6K tokens/min |
| Kimi K2 Instruct | 1,000 req/day, 10K tokens/min |
| GPT-OSS 20B/120B | 1,000 req/day, 8K tokens/min |
| And 15+ more |
Limits: 20 RPM, 1K req/month (non-commercial only)
- Command R+ 2026
- c4ai-aya-expanse/vision-32b
- command-a/r/r7b variants
Extremely restrictive input/output token limits.
Limits: Dependent on Copilot subscription tier (Free/Pro/Pro+/Business/Enterprise)
- AI21 Jamba 1.5 Large
- Codestral 25.01
- Cohere Command A, Command R/R+ 08-2024
- DeepSeek-R1, DeepSeek-R1-0528, DeepSeek-V3.2, DeepSeek-V3-0324
- Grok 3, Grok 3 Mini
- Llama 4 Maverick 17B 128E Instruct FP8, Llama 4 Scout 17B 16E Instruct
- Llama-3.2-11B/90B-Vision-Instruct, Llama-3.3-70B-Instruct
- MAI-DS-R1, Meta-Llama-3.1-405B/8B-Instruct
- Ministral 3B, Mistral Medium 3 (25.05), Mistral Small 3.1
- OpenAI GPT-4.1/mini/nano, GPT-4o/mini, GPT-5/mini/nano
- OpenAI gpt-5-chat (preview), o1/o1-mini/o1-preview, o3/o3-mini, o4-mini
- OpenAI Text Embedding 3 (large/small)
- Phi-4, Phi-4-mini-instruct/reasoning, Phi-4-multimodal-instruct, Phi-4-reasoning
Limits: 10,000 neurons/day
- @cf/aisingapore/gemma-sea-lion-v4-27b-it
- @cf/ibm-granite/granite-4.0-h-micro
- @cf/openai/gpt-oss-120b, @cf/openai/gpt-oss-20b
- @cf/qwen/qwen3-30b-a3b-fp8
- @cf/zai-org/glm-4.7-flash
- DeepSeek R1 Distill Qwen 32B
- Deepseek Coder 6.7B Base/Instruct (AWQ)
- Deepseek Math 7B Instruct
- Gemma 2B/3 12B/7B Instruct (LoRA)
- Hermes 2 Pro Mistral 7B
- Llama 2 7B/13B Chat (FP16/INT8/AWQ/LoRA)
- Llama 3 8B Instruct, Llama 3.1 8B Instruct (AWQ/FP8)
- Llama 3.2 1B/3B/11B Vision Instruct
- Llama 3.3 70B Instruct (FP8), Llama 4 Scout Instruct
- Mistral 7B Instruct v0.1/v0.2 (AWQ/LoRA)
- Mistral Small 3.1 24B Instruct
- Qwen 1.5 0.5B/1.8B/7B/14B Chat (AWQ)
- Qwen 2.5 Coder 32B Instruct, Qwen QwQ 32B
- Phi-2, SQLCoder 7B 2
- And more...
| Provider | Credits | Duration | Notes |
|---|---|---|---|
| Fireworks | $1 | Permanent | Various open models |
| Baseten | $30 | Permanent | Pay by compute time |
| Nebius | $1 | Permanent | Various open models |
| Novita | $0.50 | 1 year | Various open models |
| AI21 | $10 | 3 months | Jamba family |
| Upstage | $10 | 3 months | Solar Pro/Mini |
| NLP Cloud | $15 | Permanent | Phone verification required |
| Alibaba Cloud | 1M tokens/model | 90 days | Qwen models |
| Modal | $5-30/month | Monthly | Pay by compute time |
| Inference.net | $1 (+$25 on survey) | Permanent | Various open models |
| Hyperbolic | $1 | Permanent | DeepSeek, Llama, Qwen, GPT-OSS |
| SambaNova Cloud | $5 | 3 months | Llama, Qwen, DeepSeek |
| Scaleway | 1M tokens | Permanent | DeepSeek, Llama, Mistral, Gemma |
| Provider | Models | Free Tier | Environment Variable |
|---|---|---|---|
| Together AI | 19 | Credits/promos vary by account | TOGETHER_API_KEY |
| iFlow | 11 | Free for individuals (7-day key expiry) | IFLOW_API_KEY |
| ZAI | 7 | Free tier (generous quota) | ZAI_API_KEY |
| SiliconFlow | 6 | 1K RPM, 50K TPM | SILICONFLOW_API_KEY |
| Perplexity API | 4 | ~50 RPM default | PERPLEXITY_API_KEY |
| OVHcloud AI Endpoints | 8 | 2 req/min (no key), 400 RPM with key | OVH_AI_ENDPOINTS_ACCESS_TOKEN |
| Chutes AI | 4 | Free community GPU-powered | CHUTES_API_KEY |
| DeepInfra | 4 | 200 concurrent requests | DEEPINFRA_API_KEY |
| Replicate | 2 | 6 req/min (no payment), up to 3K RPM with payment | REPLICATE_API_TOKEN |
Full-featured integrated development environments with built-in AI assistance.
Model: GPT-5.1-Codex-Max (77.9% SWE-bench Verified) [verify]
- Free tier: 500 slow premium req/mo, 2K completions/mo (post-Dec 2025 credits)
- Free models: Cursor Small, Deepseek v3, Gemini 2.5 Flash, GPT-4o mini (500/day limit), Grok 3 Mini Beta [verify: GPT-5.4 now paid-only]
- Claude models removed from free tier ~June 2025
- Free tier uses token-based usage tracking (not request-based)
- AI-powered code editor with autonomous coding capabilities
- Pro ($20/mo or $16/mo annually): Extended Agent limits + Unlimited Tab completions + Background Agents + Maximum context windows
- Pro+ ($60/mo): 3x usage on all OpenAI, Claude, Gemini models
- Ultra ($200/mo): 20x usage on all models + Priority access to new features
- Teams ($40/user/mo): Pro features + Centralized billing + Usage analytics + SAML/OIDC SSO
- Enterprise (Custom): Everything in Teams + Pooled usage + SCIM + AI code tracking API + Audit logs
Pricing | GPT-5.1-Codex-Max Announcement
Models: DeepSeek V4, GPT-4.1, GPT-4o, Gemini 2.5 Pro (Claude models removed Nov 2025)
- 10 fast requests + 50 slow requests/month for premium models
- 1,000 slow requests/month for advanced models
- 5,000 auto-completions/month
- VS Code-based IDE with AI integration
- No credit card required for free tier
- Pro ($10/mo): 600 fast + unlimited slow requests for premium models
- Unlimited slow requests for advanced models
- Zero rate limits and faster access to premium models
- Extra packages available: $3-$12 for additional fast requests
- First month available for $3
Models: OpenAI, Anthropic, Google, xAI model access
- 25 prompt credits/month limit
- Multiple providers (OpenAI, Claude, Gemini, xAI)
- Credit card required
- Can purchase add-on credits to continue
- Pro ($15/mo): 500 prompt credits/month
- Teams ($30/user/mo): 500 prompt credits/user/month
- Enterprise ($60+/user/mo): 1,000 prompt credits/user/month
Models: Multi-agent (frontend/backend/testing agents)
- Agent-first IDE - new 2026 category
- Multiple specialized agents coordinate across codebase
- Free preview tier with high usage limits
- VS Code-based
Best for: Full-stack development with natural language direction
Models: Qwen3.6-Plus (71.2% SWE), Qwen-Coder-Qoder, GPT-4o, Claude Sonnet [verify: flagship models now paid-only]
- Free tier: Unlimited completions + limited chat/agent (basic models) + 2-week Pro trial (1,000 credits)
- Experts Mode: Multi-agent collaboration (new Mar 2026)
- Quest Mode: Fully autonomous app building
- Nextnew: Tab predictions
- Windows/macOS, VS Code-based
Pricing (50% discount - Apr 2026):
- Free: Basic models, limited messages
- Pro: $10/mo (reg $20) - 2,000 credits
- Pro+: $30/mo (reg $60) - 6,000 credits
- Ultra: $100/mo (reg $200)
- Credits: $0.01/credit (reg $0.02), expire 1mo
Models: Bring your own API keys (any provider)
- Open-source AI-powered coding assistant for VS Code
- Whole dev team of AI agents in your editor
- No subscription required - pay-as-you-go with your own keys
- Custom modes for different coding tasks
Model: Base model (Llama 3.1 70B), pro-grade models require subscription
- Individual plan: Free forever with unlimited code completions, AI chat, commands
- 70+ programming languages supported
- IDE integrations: VS Code, JetBrains, Vim/Neovim, Jupyter
- No credit card required
- Limited context awareness (expanded in paid tiers)
- Pro ($10/mo): Unlimited usage with advanced context awareness, Claude 3.5 Sonnet, GPT-4o access
- Teams ($12/user/mo): Pro features + team management
- Enterprise (Custom): On-premise deployment, custom models
Models: Local models + cloud models with limited quota
- AI Free tier included with IDEs
- Unlimited code completion and local model support
- Limited quota for cloud-based features
- 30-day AI Pro trial included
- Offline mode with local models via Ollama/LM Studio
- AI Pro ($15/mo): Increased cloud quota + unlimited local models
- AI Ultimate ($25/mo): Maximum cloud quota + advanced features
Models: Claude 3.5 Sonnet, GPT-4o, Llama 3.3 70B, proprietary models
- Free tier with limited features
- Basic AI code completions and chat (limited)
- Local processing available
- Context heavily limited in free tier
- 600+ programming languages supported
- Pro ($12/mo): Enhanced AI completions and chat
- Enterprise ($39/user/mo): Multiple LLMs, private deployment, on-premises and air-gapped options
SuperMaven β οΈ DISCONTINUED
Status: Shut down November 21, 2025 after acquisition by Cursor (Nov 2024)
Models: GPT-4o, Claude 3.5 Sonnet, GPT-4 (via chat interface)
- Free tier with basic features
- Basic code suggestions
- 7-day data retention limit
- Credit card required for registration
- 1M token context window
Historical Note: SuperMaven was acquired by Cursor in November 2024 and officially shut down in November 2025. Features were integrated into Cursor Tab. Users should migrate to Cursor or alternatives.
Models: Unspecified models
- $1 credit/mo = ~100K tokens (reduced Mar 2026)
- Specific model not publicly specified
- Credit card required
- $20/mo: 20M tokens/month
- $200/mo: 200M tokens/month
Models: Unspecified models
- 5 daily credits, max 30 per month (free)
- Models not publicly enumerated
- Credit card required
- Pro ($25/mo): 150 credits/month (5 daily credits)
- Teams ($30/mo): Higher limits (undisclosed)
Models: Proprietary models (not frontier)
- $5 in credits/month limit
- Uses proprietary models with varied routing
- Credit card required
- GPT-5 access requires v0 Premium subscription
General-purpose chat interfaces with free tiers.
| Platform | Free Model | Key Capabilities | Limitations |
|---|---|---|---|
| ChatGPT | GPT-4o / GPT-5.4-limited [verify] | Sora 3, DALL-E 4, GPT Store | ~20 msgs/3hr |
| Gemini | Gemini 3.1 Flash | 2M Context, 20 Deep Research/mo | Research quota |
| Claude | Claude Sonnet/Haiku [verify: Opus paid-only] | Technical reasoning | ~30 msgs/5h |
| Grok | Grok 4.2 | Aurora 2 images, voice | 15 msgs/12hr |
| Mistral Le Chat | Mistral Medium 3 | Structured output | Fewer integrations |
Notes:
- Aurora - xAI's image generation model (available in Grok)
- Sora 2 - OpenAI's video generation (integrated in ChatGPT)
- DALL-E 4 - OpenAI's latest image model (ChatGPT)
- Deep Research - Gemini's agentic research feature
Command-line tools for AI-assisted coding in your terminal.
Models: Gemini 3.1 Flash [verify: Pro now paid], Gemini 2.5 Pro
- Gemini 3.1 Pro latest version (v0.37.1 April 2026)
- 100 requests/day for Gemini 2.5 Pro (free tier fallback)
- 250 requests/day for Gemini 2.5 Flash
- No credit card required for free tier
- MCP server support, Google Search grounding
- Enable via
/settingsβ Preview features β true - Install:
npm install -g @google/gemini-cli
Rate Limits | Pricing | Gemini 3 Pro Announcement
Important
Rovo Dev CLI isnβt available during a Rovo Dev Standard trial. To use this feature, you need a paid Rovo Dev Standard subscription.
Models: Claude Sonnet 4 [verify], GPT-5 preview [verify]
- 5M tokens/day free tier
- No credit card required during beta
- Token limits reset at midnight UTC
- Jira/Confluence integration, MCP server support
- Requires Atlassian account
- Pro ($19.99/mo via Google AI Pro): 100 tasks/day, 5x higher limits, 5x concurrent tasks (15)
- Ultra (via Google AI Ultra): 300 tasks/day, 20x higher limits, 60 concurrent tasks, priority access to latest models
Models: GPT-4.1, Claude Opus 4.1 [verify], Claude Sonnet 4 [verify], Gemini 2.5 Pro
- 150 AI credits/month (first 2 months), then 75 AI credits/month
- No credit card required for basic signup
- AI-powered terminal with code generation
- Build ($20/mo): 1,500 AI credits/month
- Reload Credits available (up to 50% cheaper than old overage rates, roll over for 12 months)
- Bring Your Own API Key (BYOK) option available
- New pricing effective immediately for new customers (Oct 30, 2025)
- Existing monthly subscribers transition on first renewal after Dec 1, 2025
Models: GPT-4.1, Claude Opus 3.5, Gemini 2.0 Flash, Grok Code Fast 1 (Free tier); GPT-5.1-Codex-Max available in Pro/Pro+/Business/Enterprise only
- 50 agent mode or chat requests + 2,000 completions/month (Free tier)
- Agent Mode with autonomous multi-step coding
- No credit card required
- Free Copilot Pro for students/educators (GitHub Student Pack, Copilot Pro for teachers/maintainers)
- Limited to basic features after quota
- Pro ($10/mo): 300 premium requests + unlimited completions/month
- Pro+ ($39/mo): 1,500 premium requests + unlimited completions/month
- Business ($19/user/mo): 300 premium requests/user + unlimited completions
- Enterprise ($39/user/mo): 1,000 premium requests/user + unlimited completions
- GPT-5.1-Codex-Max available in public preview (Dec 4, 2025) for Pro, Pro+, Business, Enterprise - NOT in free tier
- Overage billing available at $0.04/request
Plans Details | Agent Mode | GPT-5.1-Codex-Max Preview
Model: Gemini 2.5 Pro
- 15 tasks/day free tier
- 3 concurrent tasks
- No credit card required
- Gmail account required (18+ years)
- Task limits reset on rolling 24-hour window
- Pro ($19.99/mo): 100 tasks/day, 5x higher limits, 5x concurrent tasks (15)
- Ultra (via Google AI Ultra): 300 tasks/day, 20x higher limits, 60 concurrent tasks, priority access to latest models
Usage Limits | Documentation | Google AI Plans
Models: Claude 4 Sonnet, Claude 3.7 Sonnet (AWS-hosted)
- 50 credits/month (Free tier)
- 14-day welcome bonus: 500 credits
- No credit card required
- Pro ($20/mo): 1,000 credits
- Pro+ ($40/mo): 2,000 credits
- Power ($200/mo): 10,000 credits
Model: Claude Sonnet 4 [verify] (AWS-hosted)
- 50 agentic requests/month limit (multi-turn conversations)
- Latest Claude models
- Credit card required
- Must upgrade to Pro for continued access
- Perpetual free tier
- Pro ($19/mo): Increased limits for agentic requests
- Usage may be adjusted based on regional factors and usage patterns
Models: 300+ via OpenRouter (Claude, GPT, DeepSeek, Gemini, Grok, etc.)
- Open-source AI coding agent (Go-based CLI)
- Zen Free tier with 8 exclusive models (Big Pickle, MiniMax M2.5 Free, MiMo V2)
- Privacy-sensitive: no code/context stored
opencode run --dangerously-skip-permfor quick execution
Models: 300+ models via OpenRouter (Claude, GPT, O Series, Grok, DeepSeek, Gemini)
- AI-enabled pair programmer (Rust-based, Apache 2.0)
- Model-agnostic agent harness
- Semantic codebase search via
:sync - 10K tokens/day free tier
Models: Bring your own keys (any provider)
- AI coding agent for the terminal (Zig-powered)
- Hash-anchored edits, optimized tool harness
- LSP integration, Python support, browser automation
- Subagents with coordinated API rate limiting
- Multiplexer integration (tmux, GNU Screen, Zellij)
- Interrupt anytime workflow
Models: Any LLM (Claude, GPT, DeepSeek, etc.)
- Open-source extensible AI agent from Block (now AAIF/Linux Foundation)
- Desktop app, CLI, and API
- Active engineering tasks (not just code suggestions)
- Built for code, workflows, and automation
- Model-agnostic architecture
Models: Bring your own API keys (Claude, Gemini, GPT, etc.)
- Up to $25 signup credits (one-time bonus)
- Open source VS Code extension
- Pay-as-you-go with no markup on model pricing
- Credit card required to claim full bonus credits
- Full BYOK support
GitHub | Documentation | Pricing
Models: Bring your own API keys (any provider)
- Open-source AI-powered coding assistant for VS Code
- Whole dev team of AI agents in your editor
- No subscription required - pay-as-you-go with your own keys
- Custom modes for different coding tasks
- Previously known as Roo Cline
Models: Claude Sonnet 4 [verify], Opus 4.5 [verify: paid-only], Haiku 4.5
- Free tier available with limited usage
- Pro ($20/mo or $17/mo annually): Sonnet 4 access with more usage
- Max 5x ($100/mo): ~225 messages/5 hours
- Max 20x ($200/mo): ~900 messages/5 hours
- Extended thinking modes: "think" (~4K tokens), "megathink" (~10K), "ultrathink" (~32K)
- Usage limits reset weekly with 5-hour rolling windows
Model: GPT-5.1-Codex-Max (77.9% SWE-bench Verified)
- Free with ChatGPT Plus ($20/mo): 30β150 messages/5 hours
- ChatGPT Pro ($200/mo): 300β1,500 messages/5 hours
- Pay-as-you-go API: $1.25/$10 per million tokens (input/output)
- Free OSS mode: Access to open-source models only (via
--ossflag) - First model with "compaction" for multi-million token sessions (24+ hour tasks)
- 30% fewer thinking tokens than previous GPT-5.1-Codex
- Cross-platform: macOS 12+, Ubuntu 20.04+, Windows 11 via WSL2
GitHub Repo | GPT-5.1-Codex-Max Announcement
Models: Uses Claude Code for implementation
- Autonomous AI development pipeline β #1 Terminal Benchmark 2.0
- Turns GitHub issues into pull requests automatically
- Label an issue "pilot" β Pilot claims it β Creates branch β Plans β Implements β Quality gates β Opens PR
- Telegram bot integration available
- Desktop app available
- Install:
brew install qf-studio/tap/pilotorgo install github.com/qf-studio/pilot@latest
Models: Works with any LLM (Claude, ChatGPT, Cursor, Gemini, local models)
- AI memory system with highest LongMemEval score ever (96.6%)
- Uses ancient "memory palace" technique for AI conversations
- Stores conversations in structured format: wings (people/projects), halls (memory types), rooms (specific ideas)
- Raw verbatim storage without AI summarization
- Three mining modes: projects (code/docs), convos (conversation exports), general (auto-classified)
- MCP server with 19 tools for AI integration
- Local, open, adaptable β runs entirely on your machine
- Install:
pip install mempalace
Models: Bring your own API keys (200+ models supported)
- Free VS Code and JetBrains extension
- Full support for local models via Ollama/LM Studio
- Solo tier: Private/team/public visibility options
- Community hub for custom AI assistants
- No vendor lock-in or usage limits for local models
Models: Bring your own API keys (supports many providers)
- Free command-line assistant with built-in Git integration
- Works with GPT-4o, Claude Sonnet, DeepSeek, and local models
- Multi-file editing with repository context
- Voice-to-code support
- Use
/helpto see all commands
These services provide API access to coding-optimized models for tools like Cursor, Continue.dev, Cline, etc.
- 50 requests/day free tier (1,000/day with $10+ credits)
- Qwen3-Coder-480B, Qwen3-30B-A3B, Qwen3-235B-A22B, Gemini Flash
- 20 req/min rate limit for free tier
- OpenAI-compatible API
- 1.5M tokens/day free tier (expanded Feb 2026)
- 30 req/min, 8,192 token context
- Models: Qwen3.6-Plus-480B, Llama 3.1 70B
- Ultra-fast: 2,400 t/s (Qwen3.6)
- OpenAI-compatible API (works with Cursor, Continue.dev, Cline, RooCode, etc.)
- Paid tiers: Developer ($10+ self-serve), Enterprise (custom pricing)
Pricing | API Docs | Integrations
| IDE | Entry Tier | Credits/Requests | Key Features |
|---|---|---|---|
| Cursor | Pro ($20/mo) | Extended Agent limits | Unlimited completions |
| Trae | Pro ($10/mo) | 600 fast + unlimited slow | Zero rate limits |
| Windsurf | Pro ($20/mo) | 500 prompt credits | Multi-provider |
| Qoder | Pro ($10/mo - 50% off) | 2,000 credits | Quest Mode, Experts Mode |
| Codeium | Pro ($10/mo) | Unlimited | Claude 4.6 [verify], GPT-5.4 [verify] |
| Tabnine | Pro ($12/mo) | Enhanced completions | 600+ languages |
| JetBrains AI | AI Pro ($15/mo) | Increased cloud quota | Unlimited local models |
| Tool | Entry Tier | Credits/Requests | Key Features |
|---|---|---|---|
| Claude Code | Pro ($20/mo) | ~225 messages/5h | Sonnet access [verify] |
| Warp | Build ($20/mo) | 1,500 credits/month | BYOK available |
| GitHub Copilot | Pro ($10/mo) | 300 premium req/month | Unlimited completions |
| Rovo Dev CLI | Jira Standard ($7.53/mo) | 20M tokens/day | 4x free tier |
| Jules | Pro ($19.99/mo) | 100 tasks/day | 5x free limits |
| OpenAI Codex CLI | ChatGPT Plus ($20/mo) | 30-150 msg/5h | GPT-5.1-Codex-Max |
| Amazon Q Developer | Pro ($19/mo) | Increased agentic limits | AWS-hosted Claude |
| Kilo Code | Pay-as-you-go | Up to $25 signup credits | No markup on models |
Running open-weight frontier models locally provides unlimited coding assistance without API costs.
Popular Tools:
- Cline - VS Code extension with Plan/Act modes and MCP support
- Aider - Command-line assistant with Git integration
- Continue.dev - Open-source VS Code extension (200+ models)
Local Model Tools:
- Ollama - Run frontier models locally
- LM Studio - Easy desktop app for local LLMs (no terminal required)
Notable Local Models (2026):
- Qwen3.6-Plus-480B (71.2% SWE, ~150GB VRAM)
- Gemma 4 [verify] (Google, Apache 2.0, fully open-source)
- GLM-5.1 / GLM-5V-Turbo [verify] (Zhipu MoE-based SOTA coders)
- Devstral 2 (24B, Apache 2.0, agent-optimized)
- DeepSeek Coder V4 (lite version ~18GB)
- Codestral 2 (Mistral, 22B)
- GLM-4.9-Air (Chinese/English coding)
Note: Frontier models require substantial RAM/VRAM. See Unsloth Qwen3-Coder guide for details.
Update April 2026: Gemma 4 and GLM-5.1 families are new flagship open-source releases. Verify availability in Ollama/LM Studio before downloading.
Find the fastest free coding model in seconds. Ping 238 models across 25 providers in real-time.
npm install -g free-coding-models
free-coding-models- Parallel pings β all 238 models tested simultaneously
- Stability Score (0-100) β composite score from p95 latency, jitter, spike rate, uptime
- Smart ranking β top 3 highlighted π₯π₯π₯
- Favorites β star models with
F, persisted across sessions - Tool Integration β auto-configure OpenCode, Goose, Aider, Continue, Cline, etc.
- OpenCode Zen Models β 8 exclusive free models (Big Pickle, MiniMax M2.5 Free, MiMo V2, etc.)
# Most reliable model right now
free-coding-models --fiable
# Configure Goose with S-tier model
free-coding-models --goose --tier S
# NVIDIA top models only
free-coding-models --origin nvidia --tier S
# JSON output for scripting
free-coding-models --tier S --json | jq -r '.[0].modelId'| Flag | Launches |
|---|---|
--opencode |
π¦ OpenCode CLI |
--openclaw |
π¦ OpenClaw |
--goose |
πͺΏ Goose |
--aider |
π Aider |
--qwen |
π Qwen Code |
--continue |
|
--cline |
π§ Cline |
--gemini |
β Gemini CLI |
--rovo |
π¦ Rovo Dev CLI |
| And 8 more... |
| Tier | SWE-bench | Best For |
|---|---|---|
| S+ | β₯75% | Claude Opus 4.6 [verify], GPT-5.4 [verify] |
| S | 65-75% | Qwen3.6-Plus (71.2%), Claude Sonnet 4.6 [verify] |
| A+/A | 40β60% | Solid alternatives |
| A-/B+ | 30β40% | Smaller tasks |
| B/C | < 30% | Code completion |
All 238 models allow commercial use of generated output. You own what the models generate.
| License | Models | Commercial |
|---|---|---|
| Apache 2.0 | Qwen3/Qwen2.5 Coder, GPT-OSS 120B/20B, Devstral Small 2, Gemma 4, MiMo V2 Flash | β Unrestricted |
| MIT | GLM 4.5/4.6/4.7/5, MiniMax M2.1, Devstral 2 | β Unrestricted |
| Llama Community License | Llama 3.3 70B, Llama 4 Scout/Maverick | β Attribution required. >700M MAU β separate Meta license |
| DeepSeek License | DeepSeek V3/V3.1/V3.2, R1 | β Use restrictions on model (no military, no harm) β output is yours |
| NVIDIA Nemotron License | Nemotron Super/Ultra/Nano | β Updated Mar 2026, now near-Apache 2.0 permissive |
| MiniMax Model License | MiniMax M2, M2.5 | β Royalty-free, non-exclusive. Prohibited uses policy applies to model |
| Proprietary (API) | Claude (Rovo), Gemini (CLI), Perplexity Sonar, Mistral Large, Codestral | β You own outputs per provider ToS |
| OpenCode Zen | Big Pickle, MiMo V2 Pro/Flash/Omni Free, GPT 5 Nano, MiniMax M2.5 Free, Nemotron 3 Super Free | β Per OpenCode Zen ToS |
Key Points:
- Generated code is yours β no model claims ownership of your output
- Apache 2.0 / MIT models (Qwen, GLM, GPT-OSS, MiMo, Devstral Small) are the most permissive β no strings attached
- Llama requires "Built with Llama" attribution; >700M MAU needs a Meta license
- DeepSeek / MiniMax have use-restriction policies (no military use) that govern the model, not your generated code
- API-served models (Claude, Gemini, Perplexity) grant full output ownership under their terms of service
β οΈ Disclaimer: This is a summary, not legal advice. License terms can change. Always verify the current license on the model's official page before making legal decisions.
- Goal: Compare AI coding tools by their access to pro-grade models and free tier limits.
- What qualifies a model as "pro-grade"? Models must achieve β₯60% on SWE-bench Verified, demonstrating real-world software engineering capability. Current qualifying models: Claude Opus 4.5 (80.9% [verify]), GPT-5.1-Codex-Max (77.9% [verify]), Claude Sonnet 4.5 (77.2% [verify]), Gemini 3 Pro (76.2% [verify]), GPT-5 (74.9% [verify]), Claude Opus 4.1 (74.5% [verify]), Claude Sonnet 4 (72.7% [verify]), GPT-5 mini (71.0% [verify]), Qwen3-Coder-480B (69.6% [verify]), and Gemini 2.5 Pro (63.2% [verify]).
[verify]tag: Indicates information needs verification from official sources. Pricing, limits, and model availability change frequently.- Different limit types: Tools use various quota systems - requests, tokens, credits, chats - making direct comparison challenging. Check documentation for specifics.
- Real-world usage: Actual consumption varies dramatically based on coding style, task complexity, and tool implementation.
| Program | What You Get | Requirements |
|---|---|---|
| GitHub Student Pack | Free Copilot Pro for students | Verify with .edu email |
| GitHub Copilot Free | 50 chat + 2,000 completions/month | VS Code users |
| Copilot Pro for Teachers/Maintainers | Free Copilot Pro | Open source maintainers & educators |
Visual orchestration tools for building autonomous AI agents without coding.
| Platform | Free Tier | Best For | Key Features |
|---|---|---|---|
| Make (Integromat) | 1,000 ops/month | Visual builders | Drag-and-drop AI Agents, 3,000+ app integrations |
| n8n | Unlimited (self-hosted) | Technical teams | Self-hosted RAG systems, private data automation |
| Gumloop | 2,000 credits/month | No-code agents | Natural-language builder, "Gummie" troubleshooting agent |
| Relay.app | Generous free plan | Beginners | Simple agentic workflows |
| Activepieces | 1,000 tasks/month | Open-source | Flat pricing, self-hostable |
| Podium | Entry-level tiers | Sales/communication | 24/7 lead response AI agents |
| QuantFlow Pilot | Free | Autonomous development | #1 Terminal Benchmark 2.0 β AI that ships your tickets |
AI-powered tools for conversational data analysis and narrative visualization.
| Tool | Function | Free Tier Detail | Key Feature |
|---|---|---|---|
| Julius | Chat-with-data | Upload spreadsheets, generate instant visualizations | |
| Anomaly AI | AI Dashboards | Generate interactive dashboards from natural language | |
| Flourish | Data Storytelling | No-code interactive maps, "scrollytelling" features | |
| Datawrapper | Publishing | Publish-ready charts in seconds, journalism-focused | |
| Looker Studio | Marketing Data | Seamless Google Analytics/Ads integration | |
| Power BI Desktop | Microsoft reports | Copilot recommendations, local report building | |
| AI for Database | Natural language DB queries | Freemium - free tier available | Connect any DB (PostgreSQL, MySQL, MongoDB) and query in plain English β no SQL needed, with self-refreshing dashboards and workflow automation |
Professional-grade content creation with generous free tiers.
| Tool | Output | Free Tier | Key Capability |
|---|---|---|---|
| Veo | Video | Basic Free | Cinematic clips with realistic motion and sound |
| Sora 2 (via ChatGPT) | Video | Limited free tier | Deep ChatGPT integration, high-quality video |
| DALL-E 4 (via ChatGPT) | Image | Limited free tier | Latest OpenAI image model |
| Synthesia | Video Avatars | Free individual plan | "Video Agents" in 120+ languages |
| 1 More Shot | Music Videos | Free plan | Advanced lip-sync, frame-by-frame control |
| Leonardo.Ai | Images | 150 tokens/day (~70 images) | Commercial use allowed |
| Recraft AI | Vector/SVG | 30 credits/day | Infinitely scalable icons and logos |
| Ideogram | Images | 10-20 prompts/day | Perfect text rendering, "Magic Prompt" |
| Suno AI | Music | 50 credits/day (~10 tracks) | Complete songs with vocals and instruments |
| ElevenLabs | Voice | Basic Free | Realistic voice cloning |
| Canva AI | Design | Robust free tier | AI design assets, brochures, short videos |
| Tool | Function | Free Tier Detail | Key Feature |
|---|---|---|---|
| Grammarly | Writing | 100 AI prompts/month | Rewrites and tone detection |
| LanguageTool | Grammar | 10,000 characters/text | 25+ languages, open-source |
| Fathom | Meetings | Forever Free | Records/transcribes Zoom/Teams, auto-sync to CRM |
| NotebookLM | Research | Free | Audio Overview podcasts, grounded in your documents |
| Humata | PDF Analysis | 60 pages/month | Clickable source citations |
| QuillBot | Rewriting | 125 words/time | Fluency & Standard modes |
| DeepL | Translation | Basic Free | Incognito sensitive mode |
| MemoryPalace | AI Memory | Free, open source | 96.6% LongMemEval β memory palace technique for AI |
Medical AI:
| Tool | Pricing | Key Value |
|---|---|---|
| iatroX | Free | Adaptive Q-Bank, NICE/BNF clinical reference |
| DxGPT | Free | Diagnostic assistant (500K+ users, 6K doctors) |
| OpenEvidence | Free (US verified) | Evidence-grounded search, ambient note generation |
Legal AI:
| Tool | Pricing | Key Value |
|---|---|---|
| DocLegal.Ai | $10/month | Clause suggestion, risk detection |
| Doculex.ai | Varies | Case-data-driven drafting from medical records |
| Spellbook | 7-day trial | In-editor contract analysis |
| Harvey AI | Enterprise | Regulatory matters, high security |
| Tool | Function |
|---|---|
| Wellows | AI Visibility Score tracking across ChatGPT, Gemini, Perplexity |
| Google SGE Labs | See how AI Overviews interpret target keywords |
| NeuronWriter | AI content scoring |
| Surfer SEO | Content optimization |
| Jasper | AI copywriting with brand voice |
| Writesonic | Scalable copywriting |
| Tool | Function | Description |
|---|---|---|
| Open WebUI | Local Chat Interface | ChatGPT-like experience running entirely offline with Ollama |
| Whisper (OpenAI) | Speech-to-Text | Most accurate open-source transcription |
| Piper | Text-to-Speech | High-quality offline audio generation |
| ComfyUI | Image Generation | Node-based interface for Stable Diffusion |
| Zed | AI IDE | 50 AI prompts/month, native performance, high speed |
| Void IDE | Agent-first IDE | Multi-agent frontend/backend/testing |
| MemoryPalace | AI Memory System | 96.6% LongMemEval β memory palace technique for AI conversations |
Low-latency APIs for voice assistants, live coding copilots, trading tools, and realtime chat.
| Provider | Latency | Best For | Free Tier |
|---|---|---|---|
| Groq Streaming | ~50-150ms (0.4ms/token) | Live coding, chat | 14.4K req/day |
| OpenAI Realtime API | Low | Voice assistants, agents | No free tier (pay-per-use only, trial credits new accounts) |
| Gemini Live API | Low | Multimodal streaming | Dynamic caps (varies by prompt complexity) |
| Cerebras | 2,400 tok/sec (Qwen3.6) | Batch + streaming | 1.5M tokens/day |
| Cloudflare Workers AI | Edge | Global low-latency | 10K neurons/day |
| Provider | Type | Latency | Free Tier |
|---|---|---|---|
| Deepgram | STT streaming | ~300ms | $200 credits |
| AssemblyAI Streaming | Realtime STT | ~400ms | 50 hours/month |
| Groq Whisper | STT fast | ~200ms | 2,000 req/day |
| ElevenLabs Streaming | TTS streaming | ~100ms | 10K chars/month |
| OpenAI Realtime | STT + LLM + TTS | ~200ms | Limited |
Best for:
- Trading bots: Groq streaming (fastest)
- Voice assistants: OpenAI Realtime API (end-to-end)
- Live captions: AssemblyAI or Deepgram
- Realtime chat: Gemini Live API
Speech-to-text and text-to-speech models comparison.
| Model | Provider | Accuracy | Speed | Free Tier | Best For |
|---|---|---|---|---|---|
| Whisper Large v3 | OpenAI/Groq/Local | Excellent | Fast | 2,000 req/day (Groq) | General purpose, local |
| Deepgram Nova | Deepgram | Superior | Very Fast | $200 credits | Production, enterprise |
| AssemblyAI | AssemblyAI | Excellent | Fast | 50 hours/month | Streaming, diarization |
| Whisper API | OpenAI | Excellent | Medium | Pay-per-use | Reliable, consistent |
| Google Speech | Google Cloud | Good | Fast | 60 min/month | Google ecosystem |
| Whisper (local) | OpenAI/Ollama | Excellent | GPU-dependent | Unlimited offline | Privacy, cost control |
| Model | Provider | Quality | Speed | Free Tier | Best For |
|---|---|---|---|---|---|
| ElevenLabs | ElevenLabs | π Best | Fast | 10K chars/month | Voice cloning, pro voice |
| OpenAI TTS | OpenAI | Excellent | Fast | Pay-per-use | Reliable, cheap |
| Piper | Local | Good | Very Fast | Unlimited offline | Privacy, self-hosted |
| Bark | Suno/Local | Good | Medium | Free (local) | Expressive, local |
| Google TTS | Google Cloud | Good | Fast | 1M chars/month | Google ecosystem |
| WhisperSpeech | Local | Good | Fast | Unlimited | Whisper-based TTS |
| API | Input | Output | Latency | Use Case |
|---|---|---|---|---|
| OpenAI Realtime | Audio | Audio | ~200ms | Voice agents |
| Deepgram Voice | Audio | Text/Audio | ~300ms | Voice bots |
| AssemblyAI LeMUR | Audio | LLM response | ~1s | Voice RAG |
Comparison of image generation models and APIs.
| Model | Provider | Quality | Speed | Free Tier | Best For |
|---|---|---|---|---|---|
| FLUX.2 | Black Forest Labs | π Excellent | Fast | Local/Replicate | High quality, open |
| DALL-E 4 | OpenAI | π Best | Medium | ChatGPT Plus | Latest OpenAI |
| Ideogram 2.0 | Ideogram | Excellent | Fast | 20 prompts/day | Text in images |
| Recraft V4 | Recraft | Excellent | Fast | 50 credits/day | Vector/SVG output |
| Stable Diffusion XL | Stability AI | Good | Fast | Local/DreamStudio | Flexibility, local |
| Midjourney v6 | Midjourney | π Excellent | Slow | None (paid only) | Artistic, Discord |
| Leonardo.ai | Leonardo | Very Good | Fast | 150 tokens/day | Commercial use, gaming |
| Adobe Firefly | Adobe | Good | Fast | 25 credits/month | Safe, commercial |
| Imagen 3 | Excellent | Medium | Vertex AI trial | Photorealistic | |
| DiffusionBee | Local | Good | Fast | Local unlimited | Easy setup, open-source |
| ComfyUI | Local | Good | Fast | Local unlimited | Advanced, node-based |
| Provider | Model | Free Tier | Notes |
|---|---|---|---|
| Replicate | FLUX.1-schnell | Free tier | Fast inference |
| Pollinations | Various | Unlimited | No signup |
| HuggingFace | SDXL/FLUX | $0.10 credits | Inference API |
| Leonardo | Phoenix | 150 tokens/day | Commercial OK |
Text-to-video and image-to-video generation. Hot area in 2026.
| Model | Provider | Quality | Duration | Free Tier | Best For |
|---|---|---|---|---|---|
| Veo 3 | π Excellent | 1080p, 60s clips | Limited preview | Cinematic, realistic | |
| Sora 3 | OpenAI | π Excellent | 120s | ChatGPT Plus | High quality, physics |
| Runway Gen-3 | Runway | Excellent | 10 seconds | 3 free credits | Creative, filmmaking |
| Pika 3.0 | Pika | Very Good | 3-5 seconds | Free tier | Lip-sync improved |
| Luma Dream Machine | Luma | Very Good | 5 seconds | 30 generations/mo | Fast, realistic |
| Kling | Kuaishou | Excellent | 2-10 minutes | Limited | Long-form, Chinese |
| Hailuo AI | MiniMax | Good | 6 seconds | Free tier | Character consistency |
| Stable Video Diffusion | Stability | Good | 4 seconds | Local | Open, flexible |
| Provider | Cost per video | Generation time |
|---|---|---|
| Runway | ~$0.20-0.50 | 1-5 min |
| Pika | ~$0.10-0.30 | 30s-2 min |
| Luma | ~$0.30-0.60 | 2-5 min |
| Kling | ~$0.05-0.20 | 1-10 min |
Tools for AI agents to control browsers - web scraping, form filling, testing.
| Tool | Type | Pricing | Best For |
|---|---|---|---|
| Browserbase | Managed browsers | $5 free tier | Production agents |
| Steel.dev | Browser API | Free tier | AI-native browser control |
| Stagehand | AI browser framework | Open source | Next-gen Playwright |
| Playwright | Browser automation | Free | Reliable, well-documented |
| Puppeteer | Chrome automation | Free | Chrome-specific |
| Selenium | Cross-browser | Free | Legacy support |
| Scrapy | Web scraping | Free | Data extraction |
| Tool | AI Integration | Use Case |
|---|---|---|
| Stagehand | Natural language commands | AI agents controlling browsers |
| Browserbase | Session recording for AI | Training agent trajectories |
| Steel.dev | Built for LLM agents | Agent-native browser API |
Stack Recommendation:
- AI agents: Stagehand + Browserbase
- Web scraping: Playwright + Scrapy
- Testing: Playwright + AI assertions
Production-ready vector storage without high costs.
| Provider | Type | Free Tier | Paid | Best For |
|---|---|---|---|---|
| Supabase Vector | Postgres + pgvector | 500MB | $25/mo starter | Full-stack apps |
| Neon | Serverless Postgres | 500MB | $19/mo | Serverless, branching |
| Railway | Managed Postgres | $5 credits | Usage-based | Easy deployment |
| PlanetScale | MySQL + vectors | 5GB | $39/mo | Scale, branching |
| Chroma Cloud | Vector-native | Free tier | Usage-based | Pure vector workloads |
| Qdrant Cloud | Vector DB | 1GB | $25/mo | High performance |
| Pinecone | Managed vector | 2GB | $70/mo | Production, no ops |
| Weaviate Cloud | Vector DB | 5M vectors | $25/mo | Hybrid search |
| LanceDB | Embedded/Cloud | Free | Cloud beta | Multimodal |
| Database | Best For | Notes |
|---|---|---|
| ChromaDB | Prototyping | Simple, Python-native |
| Qdrant | Production | Rust-based, fast |
| Milvus | Enterprise | Scalable, complex |
| pgvector | Postgres apps | Just add extension |
| LanceDB | Embedded | No server needed |
Recommendation by Stage:
- MVP: ChromaDB (local) β Supabase (hosted)
- Production: Qdrant Cloud or Pinecone
- Enterprise: Milvus or Weaviate
Proven patterns for building AI applications.
User β Chat UI β LLM API β Response
β
Context Memory (Redis/Postgres)
Stack:
- Frontend: Next.js + Vercel AI SDK
- Backend: FastAPI + OpenRouter
- Memory: Upstash Redis or Supabase
Documents β Chunking β Embeddings β Vector DB
β
User Query β Embedding β Similarity Search β LLM β Response
Stack:
- Framework: LlamaIndex or LangChain
- Embeddings: BGE-Large or Jina v3
- Vector DB: ChromaDB (dev) β Pinecone (prod)
- LLM: Claude Sonnet [verify] or GPT-4o
User Request β Agent Controller β Tool 1 (Search)
β Tool 2 (Code exec)
β Tool 3 (API call)
β
Synthesize β Response
Stack:
- Framework: LangGraph, AutoGen, or CrewAI
- Tools: Function calling with Claude/GPT-4
- Memory: Vector DB + State management
- Monitoring: LangSmith or Arize
User Request β Router (classify intent)
β
βββββββββββββββββΌββββββββββββββββ
β β β
Cheap Model Medium Model Expensive Model
(GPT-5 Nano) (Claude Sonnet [verify]) (Claude Opus [verify])
β β β
Simple Q&A Complex task Hard reasoning
Implementation:
- Router: Fine-tuned classifier or LLM-based
- Cost optimization: Route 80% to cheap models
- Fallback: Escalate if cheap model fails
Audio Input β STT β LLM β TTS β Audio Output
β β β β
Deepgram Groq Claude ElevenLabs
Stack:
- STT: Deepgram or Whisper Streaming
- LLM: Groq for speed or OpenAI Realtime
- TTS: ElevenLabs or OpenAI TTS
- Latency target: <500ms end-to-end
Image Input β Vision LLM β Structured Output
β
Database / Action
Stack:
- Vision: GPT-4o Vision or Gemini 2.5 Pro
- Structured output: Instructor + Pydantic
- Storage: Postgres JSONB or MongoDB
Text Prompt β LLM Enhancement β Image Gen β Upscaling
β
Video Gen (optional)
Stack:
- Enhancement: GPT-4 or Claude
- Image: FLUX or DALL-E 3
- Upscale: Upscayl or Magnific
- Video: Runway or Pika
API pricing for budget planning. Sorted by input cost.
| Model | Provider | Input | Output | Cache Hit | Best For |
|---|---|---|---|---|---|
| MiniMax M2.6 | MiniMax | $0.08 | $0.12 | - | Bulk generation |
| DeepSeek V4 | DeepSeek | $0.28 | $0.55 | $0.03 π― | Coding, cached |
| GLM 4.9 Air | ZAI | $0.35 | $0.75 | - | Chinese/English |
| Gemini 3.1 Flash | $0.30 | $0.90 | - | 2M context | |
| GPT-5 Nano | OpenAI | $0.45 | $1.80 | - | Cheap reasoning |
| Qwen3-Coder | Alibaba | ~$0.60 | ~$1.20 | - | Strong agent tasks |
| Gemini 2.5 Pro | $1.25 | $10.00 | $0.625 | High quality, 1M context | |
| GPT-4.1 | OpenAI | $2.00 | $8.00 | - | General purpose |
| GPT-5.4 | OpenAI | $2.50 | $10.00 | $1.25 | Latest OpenAI model |
| Claude Sonnet 4 | Anthropic | $3.00 | $15.00 | $0.60 | Best coding, reasoning |
| Claude Opus 4.6 | Anthropic | $5.00 | $25.00 | $2.50 | Complex reasoning |
π‘ Pro tip: DeepSeek's 90% cache discount makes it cheapest for repetitive tasks with long prompts.
Don't just use SWE-bench - match models to your specific task.
| Model | Why | Free Tier |
|---|---|---|
| Claude Sonnet 4.6 | 79.3% SWE-bench, excellent at following instructions | 25 msgs/5h (Claude Code) |
| Qwen3.6-Plus | 71.2% SWE-bench, Chinese + English, agent-optimized | 2,000 req/day |
| GPT-5.4 [verify: paid-only] | 80.1% SWE-bench, long context compaction | ChatGPT Plus/Pro |
| DeepSeek V4 | Near-Sonnet performance at 1/10th cost | DeepSeek API |
| Model | Why | Free Tier |
|---|---|---|
| DeepSeek R1 | Specialized reasoning model, math/logic | DeepSeek API |
| Claude Opus 4.6 | 84.2% SWE-bench, best for complex architecture | Claude Code Pro |
| Gemini 3.1 Pro | 77.4% SWE-bench, 2M context for deep analysis | 100 req/day |
| o3-mini / o1 | OpenAI reasoning models, step-by-step | ChatGPT Plus |
| Model | Why | Cost per 1M |
|---|---|---|
| Gemini 2.5 Flash | 1M context, high throughput | ~$0.35/$1.00 |
| GPT-5 Nano | Newest cheap model from OpenAI | $0.50/$2.00 |
| GPT-4o | ChatGPT free tier model, fast | Variable (free tier) |
| GLM 4.5 Air | Good quality, extremely cheap | ~$0.40/$0.80 |
| MiniMax M2.7 | 80.2% SWE-bench, dirt cheap | $0.08/$0.12 |
| Model | Why | Free Tier |
|---|---|---|
| Claude Sonnet 4.6 | Best tool use, reliable agent behavior | Various |
| GPT-5.4 [verify: paid-only] | Compaction for 24+ hour sessions | ChatGPT Plus/Pro |
| Qwen3.6-Plus | Built for agentic workflows | 2,000 req/day |
| Big Pickle (OpenCode) | 72% SWE-bench [verify], agent-optimized | Zen Free tier |
| Model | Why | Free Tier |
|---|---|---|
| Gemini 2.5 Pro Vision | 1M token context for images/video | 20-100 req/day |
| GPT-4o | Best overall vision capabilities | ChatGPT Free |
| Claude 4 Vision | Detailed image analysis | Claude Free tier |
| Qwen2.5 VL | Strong open vision model | Hyperbolic |
| Model | Provider | Free Tier |
|---|---|---|
| Whisper Large v3 | Groq / Local | 2,000 req/day or unlimited local |
| ElevenLabs | ElevenLabs | Basic free tier |
| Piper | Local | Free, offline TTS |
Critical for scaling applications. Plan your architecture.
| Provider | RPM | TPM | Daily | Best For |
|---|---|---|---|---|
| Groq | 30 | Medium | 14,400 | High-throughput apps |
| Cerebras | 30 | 1,000,000 | 14,400 | Batch processing |
| Gemini Studio | 15 | High | 1,500 | Prototyping |
| OpenRouter | 20 | Medium | 50-1,000 | Flexible routing |
| Cloudflare | 300 | 10K neurons | 10K neurons | Edge deployment |
| Groq (varies) | 30-50 | 6K-30K | 1K-14.4K | Model-dependent |
| App Type | Recommended Stack |
|---|---|
| ExamAi (your app) | Cerebras (Qwen3.6-Plus) + Groq |
| AI Reel Generator | Gemini 3.1 Flash (video) + Groq (audio) |
| Trading AI | Groq + local Qwen3.6-Plus |
| Chatbot | OpenRouter + Gemini 3.1 Flash (cheap) |
| Code Review Bot | DeepSeek V4 (cheap) + Claude Sonnet [verify] (quality) |
Quick reference for legal safety.
| Provider | Commercial Use | Notes |
|---|---|---|
| OpenRouter | β Yes | All models |
| Groq | β Yes | All models |
| Gemini API | β Yes | Per Google ToS |
| Cohere | β Yes | 1K req/month free |
| Claude (API) | β Yes | Per Anthropic ToS |
| OpenCode Zen | β Yes | Per Zen ToS |
| DeepSeek | β Yes | No military use restriction |
| Qwen/Alibaba | β Yes | Apache 2.0 models |
| Ollama Local | β Yes | Fully offline |
β οΈ Always verify current ToS - licenses can change.
Build document Q&A systems like ExamAi.
| Tool | Best For | Free Tier |
|---|---|---|
| LlamaIndex | Production RAG | Open source |
| LangChain | Flexibility | Open source |
| Haystack | Enterprise | Open source |
| Vercel AI SDK | Edge RAG | Free tier |
| Database | Type | Free Tier | Best For |
|---|---|---|---|
| ChromaDB | Local | Unlimited | Prototyping, small apps |
| LanceDB | Local/Serverless | Generous | Multimodal, embeddings |
| Weaviate | Cloud/Local | 5M vectors | Production scale |
| Supabase Vector | Postgres | 500MB | Full-stack apps |
| Pinecone | Managed | 2GB (1 pod) | Production, no ops |
| Qdrant | Local/Cloud | 1GB cloud | High performance |
| Tool | Purpose |
|---|---|
| RAGAS | Evaluate retrieval quality |
| LlamaIndex Evals | Built-in RAG metrics |
| Arize Phoenix | Observability |
Essential for RAG - don't overlook these.
| Embedding | Provider | Dimensions | Free Tier | Best For |
|---|---|---|---|---|
| text-embedding-3-small | OpenAI | 1536 | 200K tokens/day | General purpose |
| Jina Embeddings v3 | Jina AI | 1024 | 1M tokens/day | Multilingual |
| BGE-Large-EN-v1.5 | HuggingFace/Local | 1024 | Free | High quality retrieval |
| E5-Mistral-7B | Various | 4096 | Varies | Best accuracy |
| Nomic Embed v1.5 | Nomic | 768 | Free tier | Long context (8K) |
| GTE-Large | Alibaba | 1024 | DashScope free | Chinese + English |
| Model | Size | Speed | Quality |
|---|---|---|---|
| BGE-Small | 33M | Fast | Good |
| MiniLM-L6 | 22M | Very Fast | Basic |
| Nomic Embed | 137M | Fast | Excellent |
Scale beyond free tiers.
| Provider | Type | Pricing | Best For |
|---|---|---|---|
| Modal | Serverless GPU | $5-30/month credits | Batch inference |
| RunPod | GPU Cloud | $0.20-0.50/hr | Training, fine-tuning |
| Vast.ai | Spot GPUs | Cheap spot prices | Budget inference |
| Lambda Labs | GPU Cloud | ~$0.60/hr A100 | Stable workloads |
| Beam.cloud | Serverless | Per request | Spiky traffic |
| Baseten | Model serving | $30 credits | Production models |
| Replicate | Model hosting | 6 req/min free | Quick deployment |
| Platform | Cold Start | Best For |
|---|---|---|
| Modal | Fast | Python functions |
| Beam | Fast | ML models |
| Replicate | Medium | Pre-built models |
| HuggingFace Inference | Medium | HF ecosystem |
Benchmark your models before production.
| Tool | Purpose | Free Tier |
|---|---|---|
| Promptfoo | Prompt testing, red-teaming | Open source |
| LangSmith | Tracing, evals | 5K traces/month |
| RAGAS | RAG evaluation | Open source |
| DeepEval | LLM unit testing | Open source |
| Arize Phoenix | Observability | Generous free tier |
| Weights & Biases | Experiment tracking | Academic free |
Force LLMs to return valid JSON/schemas.
| Tool | Approach | Best For |
|---|---|---|
| Instructor | Pydantic validation | Python apps |
| Guidance | Constrained generation | Complex schemas |
| Outlines | Regex/constrained | Fast inference |
| JSONformer | Structure-aware decoding | Local models |
| Zod + Vercel AI SDK | TypeScript validation | Web apps |
Quick reference for badges used in this guide.
| Badge | Meaning |
|---|---|
| π’ | No credit card required |
| π³ | Credit card required |
| β‘ | Fast inference (low latency) |
| π§ | Strong reasoning capabilities |
| π» | Coding optimized |
| π¦ | Open source / self-hostable |
| π | Privacy focused / local |
| π€ | Agentic capabilities |
| π― | Best value / cheap |
| π | Multilingual support |
[verify] |
Needs verification from official source |
If you spot an error, missing source link, or have updated quota/model information, please open an issue or pull request with a source.
No affiliation with any vendor. All trademarks belong to their owners. Information is for research; accuracy not guaranteed; limits/pricing change frequently.
- cheahjs/free-llm-api-resources (18.4k β) - Comprehensive free LLM API list
- mnfst/awesome-free-llm-apis (2.1k β) - Permanent free LLM API tiers
- inmve/free-ai-coding (648 β) - Pro-grade AI coding tools comparison
- Coding with AI - Practical techniques for coding with LLMs
- nowork-studio/awesome-ai-startups - A curated list of bootstrapped, pre-seed, and angel-funded AI products built by independent founders
This list was compiled and verified using:
- Gemini - For research and discovering new/additional AI tools
- Perplexity - For verifying information accuracy and checking if data is current
- Community repos - All referenced repositories above were used as reference sources
MIT Β© ShaikhWarsi
Last updated: April 11, 2026 β’ PRs/issues welcome