Sandbox Environments for AI Agent Workflows

Security model: The sandbox (Docker/E2B) provides the security boundary. Inside the sandbox, Claude runs with full permissions because the container itself is isolated.

Security philosophy:

"It's not if it gets popped, it's when it gets popped. And what is the blast radius?"

Run on dedicated VMs or local Docker sandboxes. Restrict network connectivity, provide only necessary credentials, and ensure no access to private data beyond what the task requires.

Options

Sprites (Fly.io)

Persistent Linux environments that survive between executions indefinitely
Firecracker VM isolation with up to 8 vCPUs and 8GB RAM
Fast checkpoint/restore (~300ms create, <1s restore)
Auto-sleep after 30 seconds of inactivity
Unique HTTPS URL per Sprite for webhooks, APIs, public access
Layer 3 network policies for egress control (whitelist domains or use default LLM-friendly list)
CLI, REST API, JavaScript SDK, Go SDK (Python and Elixir coming soon)
Pre-installed tools: Claude Code, Codex CLI, Gemini CLI, Python 3.13, Node.js 22.20
$30 free credits to start (~500 Sprites worth)

Philosophy: Fly.io argues that "ephemeral sandboxes are obsolete" and that AI agents need persistent computers, not disposable containers. Sprites treat sandboxes as "actual computers" where data, packages, and services persist across executions on ext4 NVMe storage—no need to rebuild environments repeatedly. As they put it: "Claude doesn't want a stateless container."

Unique Features:

Stateful persistence: Files, packages, databases survive between runs indefinitely
Transactional snapshots: Copy-on-write checkpoints capture entire disk state; stores last 5 checkpoints
Idle cost optimization: Auto-sleep when inactive (30s timeout), resume on request (<1s wake)
Cold start: Creation in 1-2 seconds, restore under 1 second
Claude integration: Pre-installed skills teach Claude how to use Sprites (port forwarding, etc.)
Storage billing: Pay only for blocks written, not allocated space; TRIM-friendly
No time limits: Unlike ephemeral sandboxes (typically 15-minute limits), Sprites support long-running workloads

Pricing:

Resource	Cost	Minimum
CPU	$0.07/CPU-hour	6.25% utilization/s
Memory	$0.04375/GB-hour	250MB per second
Storage	$0.00068/GB-hour	Actual blocks only

Free trial: $30 in credits (~500 Sprites)
Plan: $20/month includes monthly credits; overages at published rates
Example costs: 4-hour coding session ~$0.46, web app with 30 active hours ~$4/month

Specs:

Spec	Value
Isolation	Firecracker microVM (hardware-isolated)
Resources	Up to 8 vCPUs, 8GB RAM per execution (fixed, not configurable)
Storage	100GB initial ext4 partition on NVMe, auto-scaling capacity
Cold Start	<1 second restore, 1-2 seconds creation
Timeout	None (persistent); auto-sleeps after 30 seconds inactivity
Active Limit	10 simultaneous active Sprites on base plan; unlimited cold
Network	Port 8080 proxied for HTTP services; isolated networks

Limitations:

Resource caps (8 vCPU, 8GB RAM, 100GB storage) not configurable yet
30-second idle timeout not configurable
Region selection not available (auto-assigned based on geographic location)
Maximum 10 active sprites on base plan (unlimited cold/inactive sprites allowed)
Best for personal/organizational tools; not designed for million-user scale apps

Links:

Official: https://sprites.dev/
Documentation: https://docs.sprites.dev/
Fly.io Blog: https://fly.io/blog/code-and-let-live/
JavaScript SDK: https://github.com/superfly/sprites-js
Go SDK: https://github.com/superfly/sprites-go
Elixir SDK: https://github.com/superfly/sprites-ex
Community: https://community.fly.io/c/sprites/

E2B

Purpose-built for AI agents and LLM workflows
Pre-built template anthropic-claude-code ships with Claude Code CLI ready
Single-line SDK calls in Python or JavaScript (v1.5.1+)
Full filesystem + git for progress.txt, prd.json, and repo operations
24-hour session limits on Pro plan (1 hour on Hobby)
Native access to 200+ MCP tools via Docker partnership (GitHub, Notion, Stripe, etc.)
Configurable compute: 1-8 vCPU, 512MB-8GB RAM

Philosophy: E2B believes AI agents need transient, immutable workloads with hardware-level kernel isolation. Each sandbox runs in its own Firecracker microVM, providing the same isolation as AWS Lambda. The focus is on developer experience—one SDK call to create a sandbox.

Unique Features:

Fastest cold start: ~150-200ms via Firecracker microVMs
Pre-built Claude template: Zero-setup Claude Code integration
Docker MCP Partnership: Native access to 200+ MCP tools from Docker's catalog
Pause/Resume (Beta): Save full VM state including memory (~4s per 1GB to pause, ~1s to resume, state persists up to 30 days)
Network controls: allowInternetAccess toggle, network.allowOut/network.denyOut for granular CIDR/domain filtering
Domain filtering: Works for HTTP (port 80) and TLS (port 443) via SNI inspection

Pricing:

Plan	Monthly Fee	Session Limit	Notes
Hobby	$0	1 hour	+ $100 one-time credit
Pro	$150	24 hours	+ usage costs
Enterprise	Custom	Custom	SSO, SLA, dedicated support

Usage Rates (per second):

Resource	Rate
2 vCPU	$0.000028/s
Memory	$0.0000045/GiB

Specs:

Spec	Value
Isolation	Firecracker microVM
Cold Start	~150-200ms
Timeout	1 hour (Hobby), 24 hours (Pro)
Compute	1-8 vCPU, 512MB-8GB RAM (configurable)
Filesystem	Full Linux with git support
Pre-installed	Node.js, curl, ripgrep, Claude Code

Limitations:

No native sandbox clone/fork functionality
No bulk file reading API
Domain filtering limited to HTTP/HTTPS ports (UDP/QUIC not supported)
Self-hosted version lacks built-in network policies
Occasional 502 timeout errors on long operations
Sandbox "not found" errors near timeout boundaries

Links:

Official: https://e2b.dev/
Documentation: https://e2b.dev/docs
Pricing: https://e2b.dev/pricing
Python Guide: https://e2b.dev/blog/python-guide-run-claude-code-in-an-e2b-sandbox
JavaScript Guide: https://e2b.dev/blog/javascript-guide-run-claude-code-in-an-e2b-sandbox
Claude Code Template: https://e2b.dev/docs/code-interpreter/claude-code
MCP Server: https://github.com/e2b-dev/mcp-server
GitHub: https://github.com/e2b-dev/E2B

Modal

Modal Sandboxes are the Modal primitive for safely running untrusted code from LLMs, users, or third-party sources. Built on Modal's serverless container fabric with gVisor isolation.

Key Features:

Pure Python SDK for defining sandboxes with one line of code (also JS/Go SDKs)
Execute arbitrary commands with sandbox.exec() and stream output
Autoscale from zero to 10,000+ concurrent sandboxes
Dynamic image definition at runtime from model output
Built-in tunneling for HTTP/WebSocket connections to sandbox servers
Granular egress policies via CIDR allowlists
Named sandboxes for persistent reference and pooling
Production-proven: Lovable and Quora run millions of code executions daily

Philosophy: Modal treats sandboxes as secure, ephemeral compute units that inherit its serverless fabric. The focus is on Python-first AI/ML workloads with aggressive cost optimization through scale-to-zero, trading cold start latency for resource efficiency.

Unique Features:

Sandbox Connect Tokens: Authenticated HTTP/WebSocket access with unspoofable X-Verified-User-Data headers for access control
Memory Snapshots: Capture container memory state to reduce cold starts to <3s even with large dependencies like PyTorch
Idle Timeout: Auto-terminate sandboxes after configurable inactivity period
Filesystem Snapshots: Preserve state across sandbox instances for 24+ hour workflows
No pre-provisioning: Sandboxes created on-demand without capacity planning

Pricing (as of late 2025, after 65% price reduction):

Plan	Monthly Fee	Credits Included	Seats	Container Limits
Starter	$0	$30/month	3	100 containers, 10 GPU
Team	$250	$100/month	∞	1,000 containers, 50 GPU
Enterprise	Custom	Volume discounts	∞	Custom limits, HIPAA, SSO, etc.

Compute Rates (per second):

Resource	Rate	Notes
Sandbox/Notebook CPU	$0.00003942/core	Per physical core (= 2 vCPU)
Standard Function CPU	$0.0000131/core	Per physical core
Memory	$0.00000222/GiB	Pay for actual usage
GPU (A10G)	$0.000306/s	~$1.10/hr
GPU (A100 40GB)	$0.000583/s	~$2.10/hr
GPU (H100)	$0.001097/s	~$3.95/hr

Special Credits: Startups up to $25k, Academics up to $10k free compute

Specs:

Spec	Value
Isolation	gVisor (Google's container runtime)
Cold Start	~1s container boot, 2-5s typical with imports
With Snapshots	<3s even with large dependencies
Default Timeout	5 minutes
Max Timeout	24 hours (use snapshots for longer)
Idle Timeout	Configurable auto-termination
Filesystem	Ephemeral (use Volumes for persistence)
Network Default	Secure-by-default, no incoming connections
Egress Control	`block_network=True` or `cidr_allowlist`
Concurrent Scaling	10,000+ sandboxes

Volumes (Persistent Storage):

High-performance distributed filesystem (up to 2.5 GB/s bandwidth)
Volumes v2 (beta): No file count limit, 1 TiB max file size, HIPAA-compliant deletion
Explicit commit() required to persist changes
Last-write-wins for concurrent modifications to same file
Best for model weights, checkpoints, and datasets

Limitations:

Cold start penalties when containers spin down (2-5s typical)
No on-premises deployment option
Sandboxes cannot access other Modal workspace resources by default
Single-language focus (Python-optimized, less suited for multi-language untrusted code)
Volumes require explicit reload to see changes from other containers
Less suited for persistent, long-lived environments vs microVM solutions

Modal vs E2B for AI Agents:

Aspect	Modal	E2B
Isolation	gVisor containers	Firecracker microVMs
Cold Start	2-5s typical, <3s with snapshot	~150ms
Session Duration	Up to 24h (stateless)	Up to 24h (Pro), persistent
Self-Hosting	No (managed only)	Experimental
Multi-Language	Python-focused	Python, JS, Ruby, C++
Network Control	Granular egress policies	Allow/deny lists
Best For	Python ML/AI, batch workloads	Multi-language agent sandboxes

Links:

Sandbox Product: https://modal.com/use-cases/sandboxes
Sandbox Docs: https://modal.com/docs/guide/sandboxes
Sandbox Networking: https://modal.com/docs/guide/sandbox-networking
API Reference: https://modal.com/docs/reference/modal.Sandbox
Safe Code Execution Example: https://modal.com/docs/examples/safe_code_execution
Coding Agent Example: https://modal.com/docs/examples/agent
Pricing: https://modal.com/pricing
Cold Start Guide: https://modal.com/docs/guide/cold-start
Volumes: https://modal.com/docs/guide/volumes
Security: https://modal.com/docs/guide/security

Cloudflare Sandboxes

Open Beta (announced June 2025), still experimental
Edge-native (330+ global locations)
Pay for active CPU only (not provisioned resources)
Best if already in Cloudflare ecosystem
R2 bucket mounting via FUSE enables data persistence (added November 2025)
Git operations support (added August 2025)
Rich output: charts, tables, HTML, JSON, images

Philosophy: Cloudflare takes a security-first approach using a "bindings" model where code has zero network access by default and can only access external APIs through explicitly defined bindings. This eliminates entire classes of security vulnerabilities by making capabilities explicitly opt-in.

Unique Features:

Edge-native execution: Run sandboxes in 330+ global locations
Bindings model: Zero network access by default; explicit opt-in for external APIs
R2 FUSE mounting: S3-compatible storage mounting for persistence (R2, S3, GCS, Backblaze B2, MinIO)
Preview URLs: Public URLs for exposing services from sandboxes
keepAlive: true: Option for indefinite runtime

Pricing (as of November 2025):

Resource	Cost	Included (Workers Paid)
Base Plan	$5/month	-
CPU	$0.000020/vCPU-s	375 vCPU-minutes
Memory	$0.0000025/GiB-s	25 GiB-hours
Disk	$0.00000007/GB-s	200 GB-hours
Network Egress	$0.025-$0.05/GB	Varies by region

Instance Types (added October 2025):

Type	vCPU	Memory	Disk
lite	1/16	256 MiB	2 GB
basic	1/4	1 GiB	4 GB
standard-1	1	3 GiB	5 GB
standard-2	2	6 GiB	10 GB
standard-4	4	12 GiB	20 GB

Specs:

Spec	Value
Isolation	Container
Cold Start	1-5 seconds
Edge Locations	330+ global
Storage	Ephemeral; persistent via R2 FUSE mount
Network	Bindings model (zero access by default)
Max Memory	400 GiB concurrent (account limit)
Max CPU	100 vCPU concurrent (account limit)
Max Disk	2 TB concurrent
Image Storage	50 GB per account

Limitations:

Cold starts 1-5 seconds (slower than Workers' milliseconds)
Binary network controls without bindings
Bucket mounting only works with wrangler deploy, not wrangler dev
SDK/container version must match
Sandbox ID case sensitivity issues with preview URLs
Still in open beta; ecosystem maturing

Links:

Official: https://sandbox.cloudflare.com/
SDK Documentation: https://developers.cloudflare.com/sandbox/
Containers Pricing: https://developers.cloudflare.com/containers/pricing/
Container Limits: https://developers.cloudflare.com/containers/platform-details/limits/
Persistent Storage Tutorial: https://developers.cloudflare.com/sandbox/tutorials/persistent-storage/
GitHub SDK: https://github.com/cloudflare/sandbox-sdk

Comparison Table

Feature	Sprites	E2B	Modal	Cloudflare
Setup	Easy	Very Easy	Easy	Easy
Free Tier	$30 credit	$100 credit	$30/month	$5/mo Workers Paid
Isolation	Firecracker microVM	Firecracker microVM	gVisor container	Container
Cold Start	<1 second	~150ms	2-5s (or <3s w/snap)	1-5 seconds
Max Timeout	None (persistent)	24 hours (Pro)	24 hours	Configurable
Claude CLI	Pre-installed	Prebuilt template	Manual	Manual
Git Support	Yes	Yes	Yes	Yes
Persistent Files	Yes (permanent)	24 hours	Via Volumes	Via R2 FUSE mount
Checkpoints	Yes (~300ms)	Pause/Resume (Beta)	Memory Snapshots	No
Network Controls	Layer 3 policies	Allow/deny lists	CIDR allowlists	Bindings model
Edge Locations	Fly.io regions	-	-	330+ global
Max Concurrent	10 active (base)	Plan-based	10,000+	Plan-based
Self-Hosting	Fly.io only	Experimental	No	No
MCP Tools	-	200+ (Docker)	-	-
Best For	Long-running agents	AI agent loops	Python ML workloads	Edge apps

Other Options

Daytona

Founded by the creators of Codeanywhere (2009), pivoted in February 2025 from development environments to AI code execution infrastructure. 35,000+ GitHub stars (AGPL-3.0 license).

Key Features:

Sub-90ms sandbox creation (container-based, faster than E2B's ~150ms microVM)
Python SDK (daytona_sdk on PyPI) and TypeScript SDK
Official LangChain integration (langchain-daytona-data-analysis)
MCP Server support for Claude/Anthropic integrations
OCI/Docker image compatibility
Built-in Git and LSP support
GPU support for ML workloads (enterprise tier)
Unlimited persistence (sandboxes can live forever via object storage archiving)
Virtual desktops (Linux, Windows, macOS with programmatic control)

Philosophy: Daytona believes AI will automate the majority of programming tasks. Their agent-agnostic architecture enables parallel sandboxed environments for testing solutions simultaneously without affecting the developer's primary workspace.

Unique Features:

Fastest cold start: ~90ms (container-based, faster than E2B's microVM)
LangChain integration: Official langchain-daytona-data-analysis package
MCP Server: Native Claude/Anthropic integration support
Virtual desktops: Linux, Windows, macOS with programmatic control
Unlimited persistence: Sandboxes can live forever via object storage archiving

Pricing:

Item	Cost
Free Credits	$200 (requires credit card)
Startup Program	Up to $50k in credits
Small Sandbox	~$0.067/hour (1 vCPU, 1 GiB RAM)
Billing	Pay-per-second; stopped/archived minimal cost

Specs:

Spec	Default	Maximum
vCPU	1	4 (contact for more)
RAM	1 GB	8 GB
Disk	3 GiB	10 GB
Auto-stop	15 min idle	Disabled
Cold start	~90ms	-
Isolation	Docker/OCI	Kata/Sysbox optional

Network Egress Tiers:

Tier 1 & 2: Restricted network access
Tier 3 & 4: Full internet with custom CIDR rules
All tiers whitelist essential services (NPM, PyPI, GitHub, Anthropic/OpenAI APIs, etc.)

Limitations:

Container isolation by default (not microVM like E2B)
Cannot snapshot running sandboxes
Long-session stability still maturing
Young ecosystem compared to E2B
Requires credit card for free credits

Daytona vs E2B:

Aspect	Daytona	E2B
Isolation	Docker containers	Firecracker microVM
Cold start	~90ms	~150ms
Free credits	$200 (CC required)	$100 (no CC)
Max session	Unlimited	24 hours (Pro)
GitHub stars	35k+	10k+
Network controls	Tier-based	Allow/deny lists

Links:

Official: https://www.daytona.io/
Documentation: https://www.daytona.io/docs/en/
GitHub: https://github.com/daytonaio/daytona
Python SDK: https://pypi.org/project/daytona_sdk/
Network Limits: https://www.daytona.io/docs/en/network-limits/
Sandbox Management: https://www.daytona.io/docs/en/sandbox-management/
LangChain Integration: https://docs.langchain.com/oss/python/integrations/tools/daytona_data_analysis
MCP Servers Guide: https://www.daytona.io/dotfiles/production-ready-mcp-servers-at-scale-with-claude-daytona

Google Cloud Run

Google Cloud's serverless container platform with strong security isolation, designed for production workloads at scale.

Key Features:

Two-layer sandbox isolation (hardware + kernel)
Automatic scaling (including scale-to-zero)
Pay-per-second billing (100ms granularity)
NVIDIA L4 GPU support for AI inference (24 GB VRAM)
Direct VPC egress with firewall controls
Cloud Storage and NFS volume mounts for persistence
Request timeout up to 60 minutes (services), 7 days (jobs)
Up to 1000 concurrent requests per instance
Built-in HTTPS, IAM, Secret Manager integration
Source-based deployment (no Dockerfile required)

Philosophy: Google's approach treats Cloud Run as the "easy button" for serverless containers. Unlike dedicated AI sandbox providers, Cloud Run is a general-purpose platform that happens to work well for AI agents. The security model provides defense-in-depth through gVisor (1st gen) or Linux microVMs (2nd gen), with seccomp filtering in both. For AI-specific workloads, Google offers Agent Engine (fully managed) and GKE Agent Sandbox (Kubernetes-native) as alternatives.

Unique Features:

Dual execution environments: 1st gen (gVisor-based, smaller attack surface) or 2nd gen (Linux microVM, more compatibility)
GPU scale-to-zero: L4 GPUs spin down when idle, eliminating GPU idle costs
Startup CPU Boost: Temporarily increases CPU during cold start (up to 50% faster startups for Java)
VPC Flow Logs: Full visibility into network traffic for compliance
Network tags: Granular firewall rules via VPC network tags on revisions
Volume mounts: Cloud Storage FUSE or NFS (Cloud Filestore) for persistent data

Pricing (Tier 1 regions, e.g., us-central1):

Resource	On-Demand	Free Tier (Monthly)
CPU	$0.000024/vCPU-second	180,000 vCPU-seconds (~50 hrs)
Memory	$0.0000025/GiB-second	360,000 GiB-seconds (~100 hrs @ 1GB)
Requests	$0.40/million	2 million
GPU (L4)	~$0.67/hour	None

Always-on billing is ~30% cheaper than on-demand
Tier 2 regions (Asia, South America) are ~40% more expensive
GPU requires minimum 4 vCPU + 16 GiB memory
New customers: $300 free credits for 90 days

Specs:

Spec	Value
Isolation	Two-layer: hardware (x86 virtualization) + kernel (gVisor/microVM)
Cold Start	2-5 seconds typical; sub-second with Startup CPU Boost + min instances
Max Timeout	60 minutes (services), 168 hours/7 days (jobs)
Max Memory	32 GiB
Max CPU	8 vCPU (or 4 vCPU with GPU)
Max Concurrency	1000 requests/instance (default 80)
Max Instances	1000 (configurable)
GPU	NVIDIA L4 (24 GB VRAM), 1 per instance, <5s startup
Storage	Ephemeral; use Cloud Storage or NFS mounts for persistence

Network/Egress Controls:

Direct VPC egress without Serverless VPC Access connector
Network tags on service revisions for firewall rules
VPC Service Controls for data exfiltration prevention
Organization policies to enforce VPC-only egress
Cloud NAT supported for outbound IP control
VPC Flow Logs for traffic visibility

Limitations:

No persistent local disk (must use Cloud Storage or NFS volume mounts)
Cold start latency higher than E2B/Sprites (2-5s vs <1s) without pre-warming
Setup complexity: Requires GCP project, billing, IAM configuration
VPC complexity: Network egress controls require VPC setup
Job connection breaks: Jobs >1 hour may experience connection breaks during maintenance
GPU regions limited: L4 GPUs only available in select regions
No pre-built AI agent template (unlike E2B)
Memory/session management must be built manually

Google Cloud AI Agent Options Comparison:

Criteria	Cloud Run	Agent Engine	GKE Agent Sandbox
Setup Complexity	Medium	Low	High
Infrastructure Control	Medium	Low	High
Memory/Session Mgmt	Manual	Built-in	Manual
Isolation	gVisor/microVM	Built-in sandbox	gVisor + Kata
Cold Start	2-5s (sub-second w/pre-warm)	Sub-second	Sub-second (warm pools)
Best For	Flexible serverless	Fastest to prod	Enterprise scale

SDK/API Options:

Python: pip install google-cloud-run
Node.js: npm install @google-cloud/run
Go, Java, .NET, Ruby, PHP, Rust client libraries available
REST API and gcloud CLI
Terraform provider for IaC

Links:

Documentation: https://cloud.google.com/run/docs
AI Agents Guide: https://cloud.google.com/run/docs/ai-agents
Pricing: https://cloud.google.com/run/pricing
Security Design: https://cloud.google.com/run/docs/securing/security
Quotas & Limits: https://cloud.google.com/run/quotas
GPU Support: https://cloud.google.com/run/docs/configuring/services/gpu
VPC Egress: https://cloud.google.com/run/docs/configuring/vpc-direct-vpc
Volume Mounts: https://cloud.google.com/run/docs/configuring/services/cloud-storage-volume-mounts
Python SDK: https://pypi.org/project/google-cloud-run/
Quickstarts: https://cloud.google.com/run/docs/quickstarts

Replit

Full development environment with built-in LLM agent (Agent 3).

Key Features:

Agent 3 can run autonomously for up to 200 minutes without supervision
Self-testing loop: executes code, identifies errors, fixes, and reruns until tests pass
Proprietary testing system claimed to be 3x faster and 10x more cost-effective than Computer Use models
Can build other agents and automations from natural language descriptions
Built on 10+ years of infrastructure investment (custom file system, VM orchestration)

Philosophy: Replit positions as an "agent-first" platform focused on eliminating "accidental complexity" (per CEO Amjad Masad). Target audience: anyone who wants to build software, not just engineers. The goal is to make Agent the primary interface for software creation.

Unique Features:

Agent 3 autonomy: Up to 200 minutes of autonomous execution
Self-testing loop: Automatic error detection and fixing
Agent building: Can create other agents from natural language
Full IDE integration: Complete development environment, not just sandbox
MCP support: Integration guide available

Pricing:

Plan	Monthly	Agent Access	Compute	Storage
Starter	Free	Limited (daily cap)	1 vCPU, 2 GiB RAM	2 GiB
Core	$25	Full Agent 3	4 vCPU, 8 GiB RAM	50 GiB
Teams	$40/user	Full Agent 3 + RBAC	4 vCPU, 8 GiB RAM	50 GiB
Enterprise	Custom	Full + SSO/SAML	Custom	Custom

Core plan includes $25/month in AI credits
Teams plan includes $40/month in credits + 50 viewer seats
Annual billing: ~20% discount

Specs:

Spec	Starter	Core/Teams
vCPU	1	4
RAM	2 GiB	8 GiB
Storage	2 GiB	50 GiB
Agent Time	Daily limits	Up to 200 min

Limitations:

No public API for programmatic Agent access — designed exclusively for in-browser interactive use, not for CI/CD pipelines or external autonomous agent orchestration
Agent frequently gets stuck in loops on simple tasks
Over-autonomy risk (can override user intent)
External API authentication problems reported
Unpredictable credit consumption ($100-300/month reported overages)
Over 60% of developers report agent stalls/errors regularly (per surveys)
Notable July 2025 incident where Agent deleted a production database

Links:

Official: https://replit.com/
Pricing: https://replit.com/pricing
Agent 3 Announcement: https://blog.replit.com/introducing-agent-3-our-most-autonomous-agent-yet
2025 Year in Review: https://blog.replit.com/2025-replit-in-review
AI Billing Docs: https://docs.replit.com/billing/ai-billing
MCP Integration Guide: https://docs.replit.com/tutorials/mcp-in-3
Fast Mode Docs: https://docs.replit.com/replitai/fast-mode

Local Docker Options

Docker Official Sandboxes

Quick Start:

docker sandbox run claude                  # Basic
docker sandbox run -w ~/my-project claude  # Custom workspace
docker sandbox run claude "your task"      # With prompt
docker sandbox run claude -c               # Continue last session

Key Details:

Credentials stored in persistent volume docker-claude-sandbox-data
--dangerously-skip-permissions enabled by default
Base image includes: Node.js, Python 3, Go, Git, Docker CLI, GitHub CLI, ripgrep, jq
Container persists in background; re-running reuses same container
Non-root user with sudo access

Links: https://docs.docker.com/ai/sandboxes/claude-code/

Comparison: E2B vs Docker Local

Aspect	E2B (Cloud)	Docker Local
Setup	SDK call	`docker sandbox run`
Isolation	Firecracker microVM	Container
Cost	~$0.05/hr	Free (your hardware)
Max Duration	24 hours	Unlimited
Network	Full internet	Full internet
State Persistence	Session-based	Volume-based
Multi-tenant Safe	Yes	No (local only)
Best For	Production, CI/CD	Local dev, prototyping

Recommendation for This Project

For Production/Multi-tenant: Use E2B

Pre-built Claude Code template = zero setup friction
24-hour sessions handle long-running autonomous agents
Full filesystem for progress.txt, prd.json, git repos
Proven in production (Lovable, Quora use it)
True isolation (Firecracker microVM)
200+ MCP tools via Docker partnership

For Long-Running Persistent Agents: Use Sprites

No session time limits (persistent environments)
Transactional snapshots for version control of entire OS
Auto-sleep when idle reduces costs
Pre-installed Claude Code and AI CLI tools
Best for agents that need to maintain state across days/weeks

For Local Development: Use Docker Sandboxes

Quick prototyping: docker sandbox run claude
With git automation: claude-sandbox (TextCortex)
Minimal setup: Uses persistent credentials volume
Free - runs on your own hardware
Unlimited session duration

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Sandbox Environments for AI Agent Workflows

Options

Sprites (Fly.io)

E2B

Modal

Cloudflare Sandboxes

Comparison Table

Other Options

Daytona

Google Cloud Run

Replit

Local Docker Options

Docker Official Sandboxes

Comparison: E2B vs Docker Local

Recommendation for This Project

For Production/Multi-tenant: Use E2B

For Long-Running Persistent Agents: Use Sprites

For Local Development: Use Docker Sandboxes

FilesExpand file tree

sandbox-environments.md

Latest commit

History

sandbox-environments.md

File metadata and controls

Sandbox Environments for AI Agent Workflows

Options

Sprites (Fly.io)

E2B

Modal

Cloudflare Sandboxes

Comparison Table

Other Options

Daytona

Google Cloud Run

Replit

Local Docker Options

Docker Official Sandboxes

Comparison: E2B vs Docker Local

Recommendation for This Project

For Production/Multi-tenant: Use E2B

For Long-Running Persistent Agents: Use Sprites

For Local Development: Use Docker Sandboxes