BioCage is a secure Python sandbox designed for safely executing code generated by Large Language Models (LLMs) or any untrusted Python code. This guide will help you get started quickly.
BioCage requires Docker to be installed and running on your system.
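Before installing anything, a script can check whether Docker is already present and responding. The following is a minimal sketch using only the Python standard library; the helper name `docker_is_ready` is hypothetical and is not part of BioCage.

```python
import shutil
import subprocess


def docker_is_ready(timeout_seconds: float = 10.0) -> bool:
    """Return True if the docker CLI is on PATH and `docker info` succeeds."""
    docker_path = shutil.which("docker")
    if docker_path is None:
        return False  # CLI not installed, or not on PATH

    try:
        completed = subprocess.run(
            [docker_path, "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False  # CLI could not be run, or the daemon did not answer in time

    # A non-zero exit code usually means the daemon is not reachable.
    return completed.returncode == 0
```

Calling `docker_is_ready()` returns `False` when Docker is missing or stopped, in which case the platform instructions that follow apply.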
macOS:
# Install Docker Desktop for Mac
# Download from: https://docs.docker.com/desktop/mac/install/
Linux:
# Ubuntu/Debian
sudo apt-get update
sudo apt-get install docker.io docker-compose
# Start Docker service
sudo systemctl start docker
sudo systemctl enable docker
# Add your user to docker group (optional, to avoid sudo)
sudo usermod -aG docker $USER
# Log out and back in for the group change to take effect
Windows:
# Install Docker Desktop for Windows
# Download from: https://docs.docker.com/desktop/windows/install/
Verify the installation (all platforms):
docker --version
docker info
Install BioCage using pip:
pip install biocage
Or for development:
git clone https://github.com/biocypher/biocage
cd biocage
pip install -e .
Let's start with a simple "Hello World" example:
from biocage import BioCageOrchestrator

# Create a sandbox instance
with BioCageOrchestrator() as sandbox:
    # Execute Python code safely
    result = sandbox.run("print('Hello, BioCage!')")

    # Check if execution was successful
    if result.success:
        print(f"Output: {result.stdout}")
    else:
        print(f"Error: {result.stderr}")
Expected Output:
Output: Hello, BioCage!
💡 Tip: The context manager (the with statement) automatically handles the container lifecycle, ensuring proper cleanup.
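To see why the with statement guarantees cleanup, here is a minimal sketch built on a stand-in class. `StandInSandbox` is hypothetical and only records lifecycle events; it borrows the `start_container` and `cleanup` method names from this guide but does not reflect how BioCage implements them.

```python
class StandInSandbox:
    """Hypothetical stand-in that records lifecycle events instead of using Docker."""

    def __init__(self):
        self.events = []

    def start_container(self):
        self.events.append("start")

    def cleanup(self):
        self.events.append("cleanup")

    def __enter__(self):
        self.start_container()
        return self

    def __exit__(self, exc_type, exc, tb):
        # Runs whether the block finished normally or raised.
        self.cleanup()
        return False  # do not swallow the exception


def demo():
    sandbox = StandInSandbox()
    try:
        with sandbox:
            raise RuntimeError("simulated failure inside the block")
    except RuntimeError:
        pass
    return sandbox.events


print(demo())
```

Even though the block raises, `cleanup` is still recorded, which is the same guarantee a try/finally pair gives.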
The BioCageOrchestrator is your main interface to BioCage. It manages Docker containers, executes code, and handles resources.
from biocage import BioCageOrchestrator
# Default configuration
sandbox = BioCageOrchestrator()
# Custom configuration
sandbox = BioCageOrchestrator(
    memory_limit="1g",            # 1GB memory limit
    cpu_limit="2.0",              # 2 CPU cores
    execution_mode="persistent",  # Keep state between runs
)
Every code execution returns a SandboxExecutionResult object:
result = sandbox.run("x = 42; print(x)")
print(f"Success: {result.success}") # True if no errors
print(f"Exit code: {result.exit_code}") # 0 for success
print(f"Output: {result.stdout}") # Standard output
print(f"Errors: {result.stderr}") # Error messages
print(f"Time: {result.execution_time}")  # Execution time in seconds
BioCage supports two execution modes:
Ephemeral Mode (default for simple operations):
- Fresh container for each execution
- No state persistence
- Maximum isolation
- Ideal for one-off executions
Persistent Mode (default for context managers):
- Same container across executions
- Variables and imports persist
- Better performance for multiple executions
- Ideal for interactive workflows
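The difference in state handling can be illustrated without Docker. The sketch below is only an in-process analogy, assuming that a fresh namespace per snippet behaves like ephemeral mode and a shared namespace behaves like persistent mode. It uses `exec`, which provides no isolation at all, so it must never be used on untrusted code; `run_snippets` is a hypothetical helper, not a BioCage function.

```python
def run_snippets(snippets, persistent):
    """Run trusted snippets with either a shared or a fresh namespace each time."""
    shared = {}
    outcomes = []
    for source in snippets:
        # Persistent: reuse one namespace. Ephemeral: start from an empty one.
        namespace = shared if persistent else {}
        try:
            exec(source, namespace)
            outcomes.append("ok")
        except NameError as error:
            outcomes.append(f"NameError: {error}")
    return outcomes


snippets = ["x = 42", "y = x + 1"]
print(run_snippets(snippets, persistent=True))
print(run_snippets(snippets, persistent=False))
```

With a shared namespace both snippets succeed; with fresh namespaces the second one fails because `x` no longer exists.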
# Ephemeral mode - no state persistence
sandbox = BioCageOrchestrator(execution_mode="ephemeral")
sandbox.run("x = 42")
result = sandbox.run("print(x)") # Error: x is not defined
# Persistent mode - state persists
sandbox = BioCageOrchestrator(execution_mode="persistent")
sandbox.run("x = 42")
result = sandbox.run("print(x)")  # Output: 42
Always use the context manager for automatic resource cleanup:
with BioCageOrchestrator() as sandbox:
    result1 = sandbox.run("import numpy as np")
    result2 = sandbox.run("arr = np.array([1, 2, 3])")
    result3 = sandbox.run("print(arr.mean())")
# Container automatically cleaned up
For more control over container lifecycle:
sandbox = BioCageOrchestrator()
try:
    sandbox.start_container()
    result1 = sandbox.run("x = 10")
    result2 = sandbox.run("y = 20")
    result3 = sandbox.run("print(x + y)")
finally:
    sandbox.cleanup()  # Always clean up
Handle execution errors gracefully:
with BioCageOrchestrator() as sandbox:
    result = sandbox.run("undefined_variable")
    if not result.success:
        print(f"Error occurred: {result.stderr}")
        print(f"Exit code: {result.exit_code}")
        # Handle error appropriately
    else:
        print(f"Success: {result.stdout}")
BioCage provides several layers of security:
- Container Isolation: Code runs in isolated Docker containers
- No Network Access: Network is disabled by default (configurable)
- Resource Limits: Memory and CPU usage are capped
- Read-only Filesystem: Root filesystem is read-only
- Execution Timeouts: Prevents infinite loops
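As an illustration of the last item, the sketch below shows the general idea of an execution timeout using only the standard library: the code runs in a child Python process that is killed if it exceeds a deadline. This is not how BioCage enforces timeouts, and a child process with a timeout is not a sandbox on its own; `run_with_timeout` is a hypothetical helper.

```python
import subprocess
import sys


def run_with_timeout(source: str, timeout_seconds: float) -> dict:
    """Run source in a child Python process and stop it if it runs too long."""
    try:
        completed = subprocess.run(
            [sys.executable, "-c", source],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        return {"timed_out": True, "exit_code": None, "stdout": ""}
    return {
        "timed_out": False,
        "exit_code": completed.returncode,
        "stdout": completed.stdout,
    }


print(run_with_timeout("while True: pass", timeout_seconds=1.0)["timed_out"])
```

An infinite loop is cut off after one second instead of blocking the caller forever.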
# Configure security settings
with BioCageOrchestrator(
    memory_limit="512m",   # Limit memory to 512MB
    cpu_limit="1.0",       # Limit to 1 CPU core
    network_access=False,  # Disable network (default)
) as sandbox:
    # This code is safely isolated
    result = sandbox.run("potentially_dangerous_code()")
Now that you understand the basics, explore these guides:
- User Guide - Comprehensive features and usage
- API Reference - Complete API documentation
- Examples - Practical examples and use cases
- Advanced Guide - Performance and complex workflows
- Security - Security model and best practices
Here are some common tasks to get you started:
from biocage import BioCageOrchestrator

with BioCageOrchestrator() as sandbox:
    # Create sample data
    sandbox.run("""
import pandas as pd
data = {'name': ['Alice', 'Bob', 'Charlie'], 'age': [25, 30, 35]}
df = pd.DataFrame(data)
""")
    # Analyze data
    result = sandbox.run("print(df.describe())")
    print(result.stdout)

with BioCageOrchestrator() as sandbox:
    # Expose a file to the sandbox
    sandbox.expose_file("/path/to/data.csv", "/app/data.csv")
    # Process the file
    result = sandbox.run("""
import pandas as pd
df = pd.read_csv('/app/data.csv')
print(f"Dataset has {len(df)} rows")
""")
    print(result.stdout)

with BioCageOrchestrator() as sandbox:
    # Try potentially failing code
    result = sandbox.run("risky_operation()")
    if not result.success:
        # Try alternative approach
        result = sandbox.run("safe_alternative()")
BioCage can automatically detect third-party dependencies and build custom Docker images:
# Enable automatic dependency detection
with BioCageOrchestrator(auto_detect_dependencies=True) as sandbox:
    # BioCage detects pandas, numpy, matplotlib and builds image
    result = sandbox.run("""
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Create and analyze data
data = pd.DataFrame({
    'x': np.random.randn(100),
    'y': np.random.randn(100)
})
print(f"Generated {len(data)} data points")
""")
    print(result.stdout)
Features:
- 🎯 Automatic import detection from Python code
- 🐳 Dynamic Docker image generation with the uv package manager
- ⚡ Intelligent caching for performance
- 📊 Support for data science libraries (pandas, numpy, matplotlib, etc.)
- 💾 Compatible with both persistent and ephemeral modes
💡 Tip: Smart dependency detection eliminates manual Docker image management while maintaining security and performance.
Happy coding with BioCage! 🚀