Provide direct, transparent HTTP API bindings for major LLM providers without abstraction layers or automatic behaviors.
A collection of thin API clients that developers can confidently use, knowing exactly what HTTP calls are being made and having complete control over all operations.
- Direct API Bindings - HTTP clients for LLM provider APIs
- Enterprise Features - Optional reliability features (retry, circuit breaker, rate limiting, caching, etc.)
- Workspace Secrets - Local API key management for development
- Comprehensive Testing - Real API integration tests with a zero-tolerance policy on mocking
- Provider Abstraction - No unified interface across providers
- Provider Switching - No automatic fallback or routing logic
- Service Layer - No proxy services or aggregation layers
- Application Modules - No CLI tools or high-level applications
All API bindings follow these principles:
- API Transparency - Every method maps directly to an API endpoint
- Zero Client Intelligence - No automatic decision-making
- Explicit Control - Developers control all operations
- Information vs Action - Clear separation of concerns
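As a minimal sketch of what "every method maps directly to an API endpoint" could look like, the example below builds a fully described HTTP request without sending it. The names `ThinClient`, `HttpRequest`, and `chat_completions`, as well as the OpenAI-style `/chat/completions` path, are illustrative assumptions and not the actual API of any crate in this workspace.

```rust
/// A fully described HTTP call. Nothing here is sent automatically.
#[derive(Debug, Clone, PartialEq)]
pub struct HttpRequest {
    pub method: &'static str,
    pub url: String,
    pub headers: Vec<(String, String)>,
    pub body: String,
}

/// Hypothetical thin client: it stores only what the caller passed in.
pub struct ThinClient {
    base_url: String,
    api_key: String,
}

impl ThinClient {
    pub fn new(base_url: &str, api_key: &str) -> Self {
        Self {
            // Normalize a trailing slash so URL joining is predictable.
            base_url: base_url.trim_end_matches('/').to_string(),
            api_key: api_key.to_string(),
        }
    }

    /// Maps to exactly one endpoint: POST {base_url}/chat/completions.
    /// The body is forwarded verbatim; no defaults are injected and
    /// no model is chosen on the caller's behalf.
    pub fn chat_completions(&self, json_body: &str) -> HttpRequest {
        HttpRequest {
            method: "POST",
            url: format!("{}/chat/completions", self.base_url),
            headers: vec![
                ("Authorization".to_string(), format!("Bearer {}", self.api_key)),
                ("Content-Type".to_string(), "application/json".to_string()),
            ],
            body: json_body.to_string(),
        }
    }
}
```

Because the method returns a plain value, a caller can inspect exactly which URL, headers, and body would go over the wire before anything is sent.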
Allowed: Runtime-Stateful, Process-Stateless
- Connection pools, circuit breaker state, rate limiting buckets
- Retry logic state, failover state, health check state
- Runtime state that dies with the process
- No persistent storage or cross-process state
Prohibited: Process-Persistent State
- File storage, databases, configuration accumulation
- State that survives process restarts
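The allowed kind of state can be illustrated with an in-memory circuit breaker. This is a sketch under assumptions: the type and method names are hypothetical, and the threshold and cooldown policy is one simple choice among many. Everything lives in ordinary struct fields, so it disappears when the process exits and nothing is written to disk.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum BreakerState {
    Closed, // calls are allowed
    Open,   // calls should be rejected by the caller
}

/// Hypothetical runtime-only circuit breaker. No file or database access.
pub struct CircuitBreaker {
    failure_threshold: u32,
    cooldown: Duration,
    consecutive_failures: u32,
    opened_at: Option<Instant>,
}

impl CircuitBreaker {
    pub fn new(failure_threshold: u32, cooldown: Duration) -> Self {
        Self { failure_threshold, cooldown, consecutive_failures: 0, opened_at: None }
    }

    /// A success clears the failure count and closes the breaker.
    pub fn record_success(&mut self) {
        self.consecutive_failures = 0;
        self.opened_at = None;
    }

    /// A failure at or past the threshold (re)opens the breaker at `now`.
    pub fn record_failure(&mut self, now: Instant) {
        self.consecutive_failures += 1;
        if self.consecutive_failures >= self.failure_threshold {
            self.opened_at = Some(now);
        }
    }

    /// Open while the cooldown is running, Closed otherwise. The failure
    /// count is kept after the cooldown, so one more failure reopens it.
    pub fn state(&self, now: Instant) -> BreakerState {
        match self.opened_at {
            Some(opened) if now.duration_since(opened) < self.cooldown => BreakerState::Open,
            _ => BreakerState::Closed,
        }
    }
}
```

Passing `now` in explicitly keeps the breaker free of hidden clock reads, which also makes its behavior easy to check in a unit test.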
All enterprise features must be:
- Feature-gated behind cargo features
- Explicitly configured (no automatic enabling)
- Transparently named (e.g., execute_with_retries())
- Zero overhead when disabled
Available features:
- retry - Exponential backoff retry logic
- circuit_breaker - Failure threshold management
- rate_limiting - Request throttling
- request_caching - TTL-based response caching
- failover - Multi-endpoint support
- health_checks - Endpoint monitoring
- streaming_control - Pause/resume/cancel streaming
- count_tokens - Token counting before API calls
- audio_processing - Speech-to-text and text-to-speech
- batch_operations - Multiple request optimization
- safety_settings - Content filtering and harm prevention
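To make "explicitly configured" and "transparently named" concrete, here is a sketch of an `execute_with_retries` function with exponential backoff. Only the function name comes from the text above; `RetryConfig`, its fields, and the injected `sleep` callback are assumptions made for illustration. In a real crate the items would presumably sit behind `#[cfg(feature = "retry")]`; the attribute is left out so the sketch compiles on its own.

```rust
use std::time::Duration;

/// Hypothetical retry settings. Nothing is retried unless the caller
/// builds one of these and calls `execute_with_retries` explicitly.
#[derive(Debug, Clone, Copy)]
pub struct RetryConfig {
    pub max_attempts: u32,
    pub base_delay: Duration,
    pub max_delay: Duration,
}

impl RetryConfig {
    /// Delay before retry number `retry` (0-based):
    /// base_delay * 2^retry, capped at max_delay.
    pub fn delay_for(&self, retry: u32) -> Duration {
        let factor = 2u32.saturating_pow(retry);
        self.base_delay.saturating_mul(factor).min(self.max_delay)
    }
}

/// Runs `operation` up to `max_attempts` times (at least once).
/// `sleep` is injected so callers choose how to wait (blocking, async
/// bridge, or a no-op in tests).
pub fn execute_with_retries<T, E>(
    config: &RetryConfig,
    mut operation: impl FnMut(u32) -> Result<T, E>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, E> {
    let attempts = config.max_attempts.max(1);
    let mut attempt = 0;
    loop {
        match operation(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error unchanged.
            Err(err) if attempt + 1 >= attempts => return Err(err),
            Err(_) => {
                sleep(config.delay_for(attempt));
                attempt += 1;
            }
        }
    }
}
```

A caller that wants real waiting could pass `std::thread::sleep` as the third argument; a caller that never invokes this function gets no retries at all, which matches the "no automatic enabling" rule.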
Anthropic Claude API client with support for:
- Chat completion with streaming
- Prompt caching for system prompts and message history
- Tool calling and function invocation
- Vision support for image inputs
- Token counting
Default Model: claude-sonnet-4-5-20250929
Features:
- full - All features enabled
- streaming - Streaming responses
- tool_calling - Function calling support
- vision_support - Image processing
- cached_content - Prompt caching
- count_tokens - Token counting
- sync_api - Blocking API wrappers
Google Gemini API client with support for:
- Chat completion with streaming
- Content caching for system instructions
- Function calling and tool use
- Vision and multimodal inputs
- File management and uploads
- Code execution
- Audio processing
- Model tuning
Default Model: gemini-2.0-flash-exp
Features:
- full - All features enabled
- streaming - Streaming responses
- tool_calling - Function calling support
- vision_support - Multimodal inputs
- cached_content - Content caching
- count_tokens - Token counting
- audio_processing - Speech-to-text/text-to-speech
- batch_operations - Batch request optimization
- sync_api - Blocking API wrappers
Hugging Face Inference API client with support for:
- Text generation
- Chat completion
- Embeddings
- Token classification
- Vision tasks
- Audio processing
- Streaming responses
Default Model: meta-llama/Llama-3.3-70B-Instruct
Features:
- full - All features enabled
- streaming - Streaming responses
- embeddings - Embedding generation
- vision_support - Image processing
- audio_processing - Audio tasks
- count_tokens - Token counting
- sync_api - Blocking API wrappers
Ollama local LLM runtime API client with support for:
- Chat completion
- Text generation
- Embeddings
- Model management
- Streaming responses
- Vision support
Default Model: llama3.2:latest
Features:
- full - All features enabled
- streaming - Streaming responses
- embeddings - Embedding generation
- vision_support - Image processing
- model_details - Enhanced model information
- count_tokens - Token counting
- cached_content - Response caching
- sync_api - Blocking API wrappers
OpenAI API client with support for:
- Chat completion with streaming
- Text generation
- Embeddings
- Vision inputs
- Function calling
- Audio processing (Whisper)
- Image generation (DALL-E)
Default Model: gpt-4o
Features:
- full - All features enabled
- streaming - Streaming responses
- tool_calling - Function calling support
- vision_support - Image processing
- audio_processing - Whisper integration
- embeddings - Embedding generation
- count_tokens - Token counting
- sync_api - Blocking API wrappers
Shared OpenAI wire-protocol HTTP layer consumed by any OpenAI-compatible API endpoint.
Extracted from api_xai and available for reuse by other crates targeting OpenAI-compatible
providers (KIE.ai, xAI, etc.).
Provides:
- Chat completion request/response wire types
- SSE streaming wire types
- Async HTTP client generic over environment
- Synchronous blocking wrapper
- Environment configuration trait and default implementation
Features:
- enabled - activates all public types and the HTTP client
- streaming - Server-Sent Events streaming support
- sync_api - blocking wrappers around the async client
- integration - real-API integration tests (requires live credentials)
- full - enables enabled, streaming, and sync_api
Architecture Notes:
- Thin-client: every method maps to exactly one API endpoint
- Generic over OpenAiCompatEnvironment to support multiple providers
- api_openai wire types structurally differ (i32 vs u32, Role enum vs String, multimodal content) and are explicitly NOT consolidated; each crate retains its own type system
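The "generic over the environment" design might look roughly like the sketch below. Only the trait name `OpenAiCompatEnvironment` and the xAI base URL appear in this document; the trait methods, the two environment structs, and `CompatClient` are hypothetical and chosen purely for illustration.

```rust
/// Hypothetical shape of the environment trait: it tells the client
/// where to send requests and which credential to attach.
pub trait OpenAiCompatEnvironment {
    fn base_url(&self) -> &str;
    fn api_key(&self) -> &str;
}

/// Illustrative environment with a fixed provider URL.
pub struct XaiEnvironment {
    pub api_key: String,
}

impl OpenAiCompatEnvironment for XaiEnvironment {
    fn base_url(&self) -> &str {
        "https://api.x.ai/v1"
    }
    fn api_key(&self) -> &str {
        &self.api_key
    }
}

/// Illustrative environment for any other OpenAI-compatible endpoint.
pub struct CustomEnvironment {
    pub base_url: String,
    pub api_key: String,
}

impl OpenAiCompatEnvironment for CustomEnvironment {
    fn base_url(&self) -> &str {
        &self.base_url
    }
    fn api_key(&self) -> &str {
        &self.api_key
    }
}

/// One client type serves every provider; the provider-specific parts
/// come only from the environment it is parameterized with.
pub struct CompatClient<E: OpenAiCompatEnvironment> {
    env: E,
}

impl<E: OpenAiCompatEnvironment> CompatClient<E> {
    pub fn new(env: E) -> Self {
        Self { env }
    }

    /// URL of the chat completion endpoint for this environment.
    pub fn chat_completions_url(&self) -> String {
        format!("{}/chat/completions", self.env.base_url().trim_end_matches('/'))
    }

    /// Value of the Authorization header for this environment.
    pub fn authorization_header(&self) -> String {
        format!("Bearer {}", self.env.api_key())
    }
}
```

With this shape, adding a provider means adding an environment type, not a new client, and the choice of provider stays a compile-time decision made by the caller.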
X.AI Grok API client with support for:
- Chat completion with streaming
- Function calling and tool use
- Model listing
- OpenAI-compatible REST interface
Default Model: grok-beta
Features:
- full - All features enabled
- streaming - Streaming responses via SSE
- tool_calling - Function calling support
- retry - Exponential backoff retry logic
- circuit_breaker - Failure threshold management
- rate_limiting - Request throttling
- failover - Multi-endpoint support
- health_checks - Endpoint health monitoring
- integration - Real API integration tests
Architecture Notes:
- OpenAI-compatible API (base URL: https://api.x.ai/v1)
- Simplified feature set compared to full OpenAI API
- Focus on core chat and tool calling capabilities
- Enterprise reliability features available but optional
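Streaming over SSE on an OpenAI-compatible endpoint generally arrives as `data:` lines ending with a `data: [DONE]` sentinel. The sketch below shows one way such a stream could be split into raw JSON chunks; `SseEvent`, `parse_sse_line`, and `collect_chunks` are hypothetical names, and the payloads are deliberately left unparsed so the example needs no JSON library.

```rust
/// What a single SSE line can mean for an OpenAI-style stream.
#[derive(Debug, PartialEq)]
pub enum SseEvent {
    /// A raw JSON chunk, left unparsed in this sketch.
    Data(String),
    /// The `[DONE]` sentinel that ends the stream.
    Done,
}

/// Parses one line. Returns None for blank separators, comment lines
/// starting with ':', and fields other than `data`.
pub fn parse_sse_line(line: &str) -> Option<SseEvent> {
    let line = line.trim_end_matches(|c| c == '\r' || c == '\n');
    let payload = line.strip_prefix("data:")?;
    // The SSE format allows one optional space after the colon.
    let payload = payload.strip_prefix(' ').unwrap_or(payload);
    if payload == "[DONE]" {
        Some(SseEvent::Done)
    } else {
        Some(SseEvent::Data(payload.to_string()))
    }
}

/// Collects every data payload up to (not including) the sentinel.
pub fn collect_chunks(stream: &str) -> Vec<String> {
    let mut chunks = Vec::new();
    for line in stream.lines() {
        match parse_sse_line(line) {
            Some(SseEvent::Data(json)) => chunks.push(json),
            Some(SseEvent::Done) => break,
            None => {}
        }
    }
    chunks
}
```

The sketch treats each `data:` line as its own chunk and does not join multi-line events, which is a simplification of the full SSE format.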
- No Mocking - All tests use real API implementations
- Loud Failures - Tests fail clearly when APIs are unavailable
- No Silent Passes - Integration tests never pass silently
- Real Implementations Only - No stub/mock servers
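One way to honor "Loud Failures" and "No Silent Passes" is to make a missing credential an explicit error instead of a reason to skip. The helper below is a sketch, not the workspace's actual test utility: `MissingKey`, `require_api_key`, and `require_api_key_from_env` are hypothetical names, and the lookup is injected so the logic can be exercised without touching the real environment.

```rust
use std::fmt;

/// Error returned when a required credential is absent or blank.
#[derive(Debug, PartialEq)]
pub struct MissingKey {
    pub var: String,
}

impl fmt::Display for MissingKey {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} is not set; integration tests require a real API key", self.var)
    }
}

/// Looks up `var` through `lookup` and refuses empty values, so a test
/// can never quietly proceed (or quietly skip) without a key.
pub fn require_api_key(
    var: &str,
    lookup: impl Fn(&str) -> Option<String>,
) -> Result<String, MissingKey> {
    match lookup(var) {
        Some(value) if !value.trim().is_empty() => Ok(value),
        _ => Err(MissingKey { var: var.to_string() }),
    }
}

/// Convenience wrapper that reads the real process environment.
pub fn require_api_key_from_env(var: &str) -> Result<String, MissingKey> {
    require_api_key(var, |name| std::env::var(name).ok())
}
```

An integration test could call `require_api_key_from_env("XAI_API_KEY")` first and panic with the returned message on `Err`, so a run without credentials shows up as a failure and not as a pass.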
api/*/tests/
├── integration_tests.rs # Real API integration tests
├── unit_tests.rs # Unit tests for client logic
└── manual/
└── readme.md # Manual testing procedures
# Load API keys
source secret/-secrets.sh
# Run all tests (requires API keys)
cargo test --workspace
# Run specific crate tests
cargo test -p api_openai
# Run with all features
cargo test --workspace --all-features
API keys stored in secret/-secrets.sh:
#!/bin/bash
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GEMINI_API_KEY="AIza..."
export HUGGINGFACE_API_KEY="hf_..."
export XAI_API_KEY="xai-..."
File is gitignored and never committed.
API keys provided via environment variables in CI configuration.
- Compilation - All crates compile with zero warnings
- Test Coverage - >90% code coverage across all crates
- Integration Tests - All integration tests pass with real APIs
- Documentation - All public APIs documented
- Zero Panics - No unwrap() or expect() in production code paths
- Feature Isolation - All features compile independently
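The "Zero Panics" rule usually comes down to returning a typed error wherever `unwrap()` or `expect()` would be tempting. As a small illustration under assumptions (the `HeaderError` type and `parse_retry_after` function are hypothetical, and only the seconds form of the HTTP `Retry-After` header is handled), the `?` operator can carry both failure cases to the caller:

```rust
use std::fmt;
use std::num::ParseIntError;

/// Everything that can go wrong is a value the caller can match on.
#[derive(Debug, PartialEq)]
pub enum HeaderError {
    Missing,
    Invalid(ParseIntError),
}

impl From<ParseIntError> for HeaderError {
    fn from(err: ParseIntError) -> Self {
        HeaderError::Invalid(err)
    }
}

impl fmt::Display for HeaderError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            HeaderError::Missing => write!(f, "Retry-After header is missing"),
            HeaderError::Invalid(err) => write!(f, "Retry-After is not a number of seconds: {}", err),
        }
    }
}

/// Parses a Retry-After value given in whole seconds.
/// No unwrap() or expect(): both failure modes propagate via `?`.
pub fn parse_retry_after(header: Option<&str>) -> Result<u64, HeaderError> {
    let raw = header.ok_or(HeaderError::Missing)?;
    let seconds = raw.trim().parse::<u64>()?;
    Ok(seconds)
}
```

The caller then decides what a missing or malformed header means, which keeps that decision out of the client and consistent with the explicit-control principle.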
Potential future additions (not currently in scope):
- Additional provider APIs (Cohere, AI21, etc.)
- Async runtime abstraction (support for different executors)
- Custom HTTP client support
- WebSocket streaming for real-time bidirectional communication
- Enhanced observability (tracing, metrics)
Explicitly not goals for this workspace:
- Provider abstraction layer
- Unified interface across providers
- Provider routing or fallback logic
- Service orchestration
- Application frameworks
- CLI tools