An OpenAI API-compatible wrapper for Claude Code, allowing you to use Claude Code with any OpenAI client library. Now powered by the official Claude Code Python SDK with enhanced authentication and features.
Production Ready - All core features working and tested:
- ✅ Chat completions endpoint with official Claude Code Python SDK
- ✅ Streaming and non-streaming responses
- ✅ Full OpenAI SDK compatibility
- ✅ Multi-provider authentication (API key, Bedrock, Vertex AI, CLI auth)
- ✅ System prompt support via SDK options
- ✅ Model selection support with validation
- ✅ Fast by default - Tools disabled for OpenAI compatibility (5-10x faster)
- ✅ Optional tool usage (Read, Write, Bash, etc.) when explicitly enabled
- ✅ Real-time cost and token tracking from SDK
- ✅ Session continuity with conversation history across requests
- ✅ Session management endpoints for full session control
- ✅ Health, auth status, and models endpoints
- ✅ Development mode with auto-reload
- OpenAI-compatible `/v1/chat/completions` endpoint
- Support for both streaming and non-streaming responses
- Compatible with OpenAI Python SDK and all OpenAI client libraries
- Automatic model validation and selection
- Official Claude Code Python SDK integration (v0.0.14)
- Real-time cost tracking - actual costs from SDK metadata
- Accurate token counting - input/output tokens from SDK
- Session management - proper session IDs and continuity
- Enhanced error handling with detailed authentication diagnostics
- Automatic detection of authentication method
- Claude CLI auth - works with existing `claude auth` setup
- Direct API key - `ANTHROPIC_API_KEY` environment variable
- AWS Bedrock - enterprise authentication with AWS credentials
- Google Vertex AI - GCP authentication support
- System prompt support via SDK options
- Optional tool usage - Enable Claude Code tools (Read, Write, Bash, etc.) when needed
- Fast default mode - Tools disabled by default for OpenAI API compatibility
- Development mode with auto-reload (`uvicorn --reload`)
- Interactive API key protection - Optional security with auto-generated tokens
- Comprehensive logging and debugging capabilities
Get started in under 2 minutes:
```bash
# 1. Install Claude Code CLI (if not already installed)
npm install -g @anthropic-ai/claude-code

# 2. Authenticate (choose one method)
claude auth login  # Recommended for development
# OR set: export ANTHROPIC_API_KEY=your-api-key

# 3. Clone and set up the wrapper
git clone https://github.com/RichardAtCT/claude-code-openai-wrapper
cd claude-code-openai-wrapper
poetry install

# 4. Start the server
poetry run uvicorn main:app --reload --port 8000

# 5. Test it works
poetry run python test_endpoints.py
```

That's it! Your OpenAI-compatible Claude Code API is running on http://localhost:8000
- Claude Code CLI: Install Claude Code CLI
  ```bash
  # Install Claude Code (follow Anthropic's official guide)
  npm install -g @anthropic-ai/claude-code
  ```
- Authentication: Choose one method:
  - Option A: Authenticate via CLI (Recommended for development)
    ```bash
    claude auth login
    ```
  - Option B: Set environment variable
    ```bash
    export ANTHROPIC_API_KEY=your-api-key
    ```
  - Option C: Use AWS Bedrock or Google Vertex AI (see Configuration section)
- Python 3.10+: Required for the server
- Poetry: For dependency management
  ```bash
  # Install Poetry (if not already installed)
  curl -sSL https://install.python-poetry.org | python3 -
  ```
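For the Bedrock/Vertex authentication option, the Claude Code CLI selects the provider through environment variables. A typical setup is sketched below; the variable names follow Anthropic's Claude Code documentation, and the region/project values are placeholders you must replace:

```shell
# Bedrock: use AWS credentials from the standard credential chain
export CLAUDE_CODE_USE_BEDROCK=1
export AWS_REGION=us-east-1

# Vertex AI: use gcloud Application Default Credentials instead
# export CLAUDE_CODE_USE_VERTEX=1
# export CLOUD_ML_REGION=us-east5
# export ANTHROPIC_VERTEX_PROJECT_ID=your-gcp-project
```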
- Clone the repository:
  ```bash
  git clone https://github.com/RichardAtCT/claude-code-openai-wrapper
  cd claude-code-openai-wrapper
  ```
- Install dependencies with Poetry:
  ```bash
  poetry install
  ```
  This creates a virtual environment and installs all dependencies.
- Configure environment:
  ```bash
  cp .env.example .env
  # Edit .env with your preferences
  ```
  Edit the `.env` file:
  ```bash
  # Claude CLI path (usually just "claude")
  CLAUDE_CLI_PATH=claude

  # Optional API key for client authentication
  # If not set, the server prompts for interactive API key protection on startup
  # API_KEY=your-optional-api-key

  # Server port
  PORT=8000

  # Timeout in milliseconds
  MAX_TIMEOUT=600000

  # CORS origins
  CORS_ORIGINS=["*"]
  ```

The server supports interactive API key protection for secure remote access:
- No API key set: Server prompts "Enable API key protection? (y/N)" on startup
  - Choose No (default): Server runs without authentication
  - Choose Yes: Server generates and displays a secure API key
- Environment API key set: Uses the configured `API_KEY` without prompting
```bash
# Example: Interactive protection enabled
poetry run python main.py

# Output:
# ============================================================
# API Endpoint Security Configuration
# ============================================================
# Would you like to protect your API endpoint with an API key?
# This adds a security layer when accessing your server remotely.
#
# Enable API key protection? (y/N): y
#
# API Key Generated!
# ============================================================
# API Key: Xf8k2mN9-vLp3qR5_zA7bW1cE4dY6sT0uI
# ============================================================
# IMPORTANT: Save this key - you'll need it for API calls!
# Example usage:
#   curl -H "Authorization: Bearer Xf8k2mN9-vLp3qR5_zA7bW1cE4dY6sT0uI" \
#        http://localhost:8000/v1/models
# ============================================================
```

Perfect for:
- Local development - No authentication needed
- Remote access - Secure with generated tokens
- VPN/Tailscale - Add a security layer for remote endpoints
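The generated key shown above is a URL-safe random token. Python's standard `secrets` module produces equivalent tokens; this is an illustrative sketch, not necessarily the wrapper's actual generation code:

```python
import secrets

def generate_api_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random token suitable for use as a bearer key."""
    return secrets.token_urlsafe(num_bytes)

key = generate_api_key()
print(key)  # 43-character token using letters, digits, '-' and '_'
```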
- Verify Claude Code is installed and working:
  ```bash
  claude --version
  claude --print --model claude-3-5-haiku-20241022 "Hello"  # Test with fastest model
  ```
- Start the server:
  Development mode (recommended - auto-reloads on changes):
  ```bash
  poetry run uvicorn main:app --reload --port 8000
  ```
  Production mode:
  ```bash
  poetry run python main.py
  ```
  Port options for production mode:
  - Default: Uses port 8000 (or PORT from .env)
  - If the port is in use, the server automatically finds the next available port
  - Specify a custom port: `poetry run python main.py 9000`
  - Set in environment: `PORT=9000 poetry run python main.py`
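The "finds the next available port" behavior can be implemented along these lines; a sketch for illustration, not the wrapper's actual code:

```python
import socket

def find_available_port(start: int = 8000, max_tries: int = 100) -> int:
    """Return the first port >= start that can be bound on localhost."""
    for port in range(start, start + max_tries):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            try:
                s.bind(("127.0.0.1", port))
                return port  # bind succeeded, so the port is free
            except OSError:
                continue  # port taken, try the next one
    raise RuntimeError(f"No free port in range {start}-{start + max_tries}")

print(find_available_port(8000))
```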
```bash
# Basic chat completion (no auth)
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "messages": [
      {"role": "user", "content": "What is 2 + 2?"}
    ]
  }'

# With API key protection (when enabled)
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-generated-api-key" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "messages": [
      {"role": "user", "content": "Write a Python hello world script"}
    ],
    "stream": true
  }'
```

```python
from openai import OpenAI

# Configure client (automatically detects auth requirements)
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="your-api-key-if-required"  # Only needed if protection enabled
)

# Alternative: Let examples auto-detect authentication
# The wrapper's example files automatically check server auth status

# Basic chat completion
response = client.chat.completions.create(
    model="claude-3-5-sonnet-20241022",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What files are in the current directory?"}
    ]
)
print(response.choices[0].message.content)
# Output: Fast response without tool usage (default behavior)

# Enable tools when you need them (e.g., to read files)
response = client.chat.completions.create(
    model="claude-3-5-sonnet-20241022",
    messages=[
        {"role": "user", "content": "What files are in the current directory?"}
    ],
    extra_body={"enable_tools": True}  # Enable tools for file access
)
print(response.choices[0].message.content)
# Output: Claude will actually read your directory and list the files!

# Token usage and a rough cost estimate (actual costs come from SDK metadata)
print(f"Estimated cost: ${response.usage.total_tokens * 0.000003:.6f}")
print(f"Tokens: {response.usage.total_tokens} ({response.usage.prompt_tokens} + {response.usage.completion_tokens})")

# Streaming
stream = client.chat.completions.create(
    model="claude-3-5-sonnet-20241022",
    messages=[
        {"role": "user", "content": "Explain quantum computing"}
    ],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

Supported models:

- `claude-sonnet-4-20250514` (Recommended)
- `claude-opus-4-20250514`
- `claude-3-7-sonnet-20250219`
- `claude-3-5-sonnet-20241022`
- `claude-3-5-haiku-20241022`
The model parameter is passed to Claude Code via the `--model` flag.
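Model names can also be checked client-side before sending a request, which fails faster than waiting for the server's validation. A minimal sketch; the model set mirrors the list above and will drift as Anthropic releases new models:

```python
SUPPORTED_MODELS = {
    "claude-sonnet-4-20250514",
    "claude-opus-4-20250514",
    "claude-3-7-sonnet-20250219",
    "claude-3-5-sonnet-20241022",
    "claude-3-5-haiku-20241022",
}

def validate_model(model: str) -> str:
    """Raise ValueError for unknown models instead of failing server-side."""
    if model not in SUPPORTED_MODELS:
        raise ValueError(
            f"Unknown model {model!r}; choose one of {sorted(SUPPORTED_MODELS)}"
        )
    return model

validate_model("claude-3-5-haiku-20241022")  # passes silently
```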
The wrapper now supports session continuity, allowing you to maintain conversation context across multiple requests. This is a powerful feature that goes beyond the standard OpenAI API.
- Stateless Mode (default): Each request is independent, just like the standard OpenAI API
- Session Mode: Include a `session_id` to maintain conversation history across requests
```python
import openai

client = openai.OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="not-needed"
)

# Start a conversation with session continuity
response1 = client.chat.completions.create(
    model="claude-3-5-sonnet-20241022",
    messages=[
        {"role": "user", "content": "Hello! My name is Alice and I'm learning Python."}
    ],
    extra_body={"session_id": "my-learning-session"}
)

# Continue the conversation - Claude remembers the context
response2 = client.chat.completions.create(
    model="claude-3-5-sonnet-20241022",
    messages=[
        {"role": "user", "content": "What's my name and what am I learning?"}
    ],
    extra_body={"session_id": "my-learning-session"}  # Same session ID
)
# Claude will remember: "Your name is Alice and you're learning Python."
```

```bash
# First message (add -H "Authorization: Bearer your-key" if auth enabled)
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "messages": [{"role": "user", "content": "My favorite color is blue."}],
    "session_id": "my-session"
  }'

# Follow-up message - context is maintained
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "messages": [{"role": "user", "content": "What is my favorite color?"}],
    "session_id": "my-session"
  }'
```

The wrapper provides endpoints to manage active sessions:
- `GET /v1/sessions` - List all active sessions
- `GET /v1/sessions/{session_id}` - Get session details
- `DELETE /v1/sessions/{session_id}` - Delete a session
- `GET /v1/sessions/stats` - Get session statistics
```bash
# List active sessions
curl http://localhost:8000/v1/sessions

# Get session details
curl http://localhost:8000/v1/sessions/my-session

# Delete a session
curl -X DELETE http://localhost:8000/v1/sessions/my-session
```

- Automatic Expiration: Sessions expire after 1 hour of inactivity
- Streaming Support: Session continuity works with both streaming and non-streaming requests
- Memory Persistence: Full conversation history is maintained within the session
- Efficient Storage: Only active sessions are kept in memory
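The expiration and in-memory storage behavior described above can be pictured as a small TTL map. This is a hypothetical sketch of the idea, not the wrapper's actual session manager:

```python
import time

class SessionStore:
    """Keep per-session message history, dropping sessions idle past ttl_seconds."""

    def __init__(self, ttl_seconds: float = 3600):
        self.ttl = ttl_seconds
        self._sessions = {}  # session_id -> (last_used, messages)

    def append(self, session_id: str, message: dict) -> None:
        """Record a message and refresh the session's last-used timestamp."""
        _, messages = self._sessions.get(session_id, (0.0, []))
        messages.append(message)
        self._sessions[session_id] = (time.monotonic(), messages)

    def history(self, session_id: str) -> list:
        """Return the conversation history, evicting the session if expired."""
        entry = self._sessions.get(session_id)
        if entry is None or time.monotonic() - entry[0] > self.ttl:
            self._sessions.pop(session_id, None)  # expired or unknown
            return []
        return entry[1]

store = SessionStore(ttl_seconds=3600)
store.append("my-session", {"role": "user", "content": "My favorite color is blue."})
print(len(store.history("my-session")))  # 1
```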
See `examples/session_continuity.py` for comprehensive Python examples and `examples/session_curl_example.sh` for curl examples.
- `POST /v1/chat/completions` - OpenAI-compatible chat completions (supports `session_id`)
- `GET /v1/models` - List available models
- `GET /v1/auth/status` - Check authentication status and configuration
- `GET /health` - Health check endpoint
- `GET /v1/sessions` - List all active sessions
- `GET /v1/sessions/{session_id}` - Get detailed session information
- `DELETE /v1/sessions/{session_id}` - Delete a specific session
- `GET /v1/sessions/stats` - Get session manager statistics
- Images in messages are converted to text placeholders
- Function calling not supported (tools work automatically based on prompts)
- OpenAI parameters not yet mapped: `temperature`, `top_p`, `max_tokens`, `logit_bias`, `presence_penalty`, `frequency_penalty`
- Multiple responses (`n > 1`) not supported
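Until those parameters are mapped, clients can strip them defensively before calling the wrapper. An illustrative helper (the set of names mirrors the limitations list above and is not part of the wrapper itself):

```python
# Parameters the wrapper does not yet map to Claude Code options
UNSUPPORTED_PARAMS = {
    "temperature", "top_p", "max_tokens",
    "logit_bias", "presence_penalty", "frequency_penalty", "n",
}

def sanitize_request(payload: dict) -> dict:
    """Return a copy of an OpenAI-style request with unmapped parameters removed."""
    return {k: v for k, v in payload.items() if k not in UNSUPPORTED_PARAMS}

request = {
    "model": "claude-3-5-sonnet-20241022",
    "messages": [{"role": "user", "content": "Hi"}],
    "temperature": 0.2,  # would be silently ignored by the wrapper today
}
print(sorted(sanitize_request(request)))  # ['messages', 'model']
```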
- Tool configuration - allowed/disallowed tools endpoints
- OpenAI parameter mapping - temperature, top_p, max_tokens support
- Enhanced streaming - better chunk handling
- MCP integration - Model Context Protocol server support
- ✅ SDK Integration: Official Python SDK replaces subprocess calls
- ✅ Real Metadata: Accurate costs and token counts from SDK
- ✅ Multi-auth: Support for CLI, API key, Bedrock, and Vertex AI authentication
- ✅ Session IDs: Proper session tracking and management
- ✅ System Prompts: Full support via SDK options
- ✅ Session Continuity: Conversation history across requests with session management
- Claude CLI not found:
  ```bash
  # Check Claude is in PATH
  which claude
  # Update CLAUDE_CLI_PATH in .env if needed
  ```
- Authentication errors:
  ```bash
  # Test authentication with fastest model
  claude --print --model claude-3-5-haiku-20241022 "Hello"
  # If this fails, re-authenticate
  ```
- Timeout errors:
  - Increase `MAX_TIMEOUT` in `.env`
  - Note: Claude Code can take time for complex requests
Test all endpoints with a simple script:
```bash
# Make sure server is running first
poetry run python test_endpoints.py
```

Run the comprehensive test suite:
```bash
# Make sure server is running first
poetry run python test_basic.py

# With API key protection enabled, set TEST_API_KEY:
TEST_API_KEY=your-generated-key poetry run python test_basic.py
```

The test suite automatically detects whether API key protection is enabled and provides guidance on supplying the necessary authentication.
Check authentication status:
```bash
curl http://localhost:8000/v1/auth/status | python -m json.tool
```

```bash
# Install development dependencies
poetry install --with dev

# Format code
poetry run black .

# Run full tests (when implemented)
poetry run pytest tests/
```

All tests should show:
- 4/4 endpoint tests passing
- 4/4 basic tests passing
- Authentication method detected (claude_cli, anthropic, bedrock, or vertex)
- Real cost tracking (e.g., $0.001-0.005 per test call)
- Accurate token counts from SDK metadata
MIT License
Contributions are welcome! Please open an issue or submit a pull request.