[P4] Phase 6.5 Generic Sandbox Interface and Plugin Architecture #78

@frankbria

Summary

Create an abstract sandbox interface that allows multiple sandbox backends (Docker, E2B, Daytona, etc.) to be used interchangeably. This enables a plugin architecture where new sandbox types can be added without modifying core Ralph code.

Problem Statement

Phases 6.1-6.4 implement Docker and E2B specifically. As more sandbox platforms emerge, we need:

  • Consistent interface across all sandbox types
  • Easy addition of new sandbox backends
  • Common configuration schema
  • Unified monitoring and lifecycle management
  • Community contributions without core code changes

Proposed Interface

Every sandbox backend must implement these operations:

# Lifecycle
sandbox_create <config>          # Create/start sandbox, return session ID
sandbox_destroy <session_id>     # Terminate and clean up sandbox
sandbox_status <session_id>      # Get sandbox status (running/stopped/error)

# Execution
sandbox_exec <session_id> <cmd>  # Execute command in sandbox
sandbox_exec_async <session_id> <cmd>  # Non-blocking execution

# File Operations
sandbox_upload <session_id> <local> <remote>   # Copy file to sandbox
sandbox_download <session_id> <remote> <local> # Copy file from sandbox
sandbox_list <session_id> <path>               # List files in sandbox

# Streams
sandbox_logs <session_id>        # Stream stdout/stderr
sandbox_attach <session_id>      # Interactive session (if supported)

# Configuration
sandbox_validate_config <config> # Validate configuration before create
sandbox_capabilities             # Return supported features
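
One way the core could call these operations without knowing the backend is a small dispatcher that resolves `<backend>_<operation>` by naming convention. This is a minimal sketch, not a settled design: `sandbox_call`, `SANDBOX_ERR_UNSUPPORTED`, and the exit code 64 are hypothetical names and values chosen for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a dispatcher that could live in lib/sandbox/interface.sh.

# Assumed exit code meaning "operation not implemented by this backend".
SANDBOX_ERR_UNSUPPORTED=64

# sandbox_call <backend> <operation> [args...]
# Runs <backend>_<operation> if the backend defines that function,
# otherwise reports the operation as unsupported.
sandbox_call() {
    local backend="$1" operation="$2"
    shift 2
    local fn="${backend}_${operation}"
    if declare -F "$fn" >/dev/null; then
        "$fn" "$@"
    else
        echo "sandbox: backend '$backend' does not support '$operation'" >&2
        return "$SANDBOX_ERR_UNSUPPORTED"
    fi
}

# Example backend: a passthrough that only implements exec.
local_exec() {
    local session_id="$1"   # unused by the passthrough backend
    local cmd="$2"
    bash -c "$cmd"
}
```

With this in place, `sandbox_call local exec s1 'echo hi'` runs the command directly, while `sandbox_call local attach s1` fails with the unsupported code so the caller can decide how to degrade.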

Plugin Structure

Each sandbox backend is a separate file in lib/sandbox/:

lib/sandbox/
├── interface.sh      # Abstract interface definition
├── registry.sh       # Backend registration
├── docker.sh         # Docker implementation
├── e2b.sh            # E2B implementation
├── daytona.sh        # Daytona implementation (future)
├── gitpod.sh         # Gitpod implementation (future)
└── local.sh          # Passthrough (no sandbox, for testing)
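
With one file per backend, loading can be as simple as sourcing the file that matches the configured type. The sketch below assumes a hypothetical `sandbox_load_backend` helper and a `SANDBOX_LIB_DIR` variable; neither exists in Ralph today.

```bash
#!/usr/bin/env bash
# Hypothetical loader: sources lib/sandbox/<type>.sh on demand.

# Assumed location of the backend files; overridable for tests.
SANDBOX_LIB_DIR="${SANDBOX_LIB_DIR:-lib/sandbox}"

# sandbox_load_backend <type>
# Returns 2 for an invalid name, 1 if the backend file is missing.
sandbox_load_backend() {
    local type="$1"
    # Only allow simple names so the type cannot escape the plugin directory.
    if [[ ! "$type" =~ ^[a-z0-9_]+$ ]]; then
        echo "sandbox: invalid backend name '$type'" >&2
        return 2
    fi
    local file="$SANDBOX_LIB_DIR/$type.sh"
    if [[ ! -f "$file" ]]; then
        echo "sandbox: backend file not found: $file" >&2
        return 1
    fi
    # shellcheck source=/dev/null
    source "$file"
}
```

Restricting the name to lowercase letters, digits, and underscores is a design choice made here so that a config value such as `../../etc/passwd` is rejected before any file is sourced.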

Configuration Schema

Unified configuration that works across all backends:

sandbox:
  # Required
  type: docker   # one of: docker | e2b | daytona | gitpod | local
  
  # Common options (all backends)
  common:
    workdir: /workspace
    shell: /bin/bash
    timeout: 3600
    
  # Sync options (handled by generic sync layer)
  sync:
    strategy: snapshot   # one of: snapshot | realtime | git
    include: ["src/**", "tests/**"]
    exclude: ["node_modules/**"]
    
  # Security options (backend interprets as supported)
  security:
    network: restricted
    filesystem: isolated
    resources:
      memory: 4g
      cpus: 2
      
  # Backend-specific options
  docker:
    image: "python:3.11"
    volumes: ["./data:/data:ro"]
    
  e2b:
    template: "python"
    api_key_env: E2B_API_KEY
    
  daytona:
    workspace: "my-workspace"
    provider: "docker"
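
To illustrate how `sandbox_validate_config` might check the common fields before any backend is started, here is a rough sketch. It assumes the YAML has already been flattened into dotted `key=value` arguments (for example `common.timeout=3600`); that flattening step and the argument format are assumptions, not part of the proposal.

```bash
#!/usr/bin/env bash
# Hypothetical validator for the backend-independent part of the schema.

# sandbox_validate_config key=value...
# Prints one message per problem to stderr and fails if any were found.
sandbox_validate_config() {
    local type="" timeout="" strategy="" pair key value errors=0
    for pair in "$@"; do
        key="${pair%%=*}"
        value="${pair#*=}"
        case "$key" in
            type)           type="$value" ;;
            common.timeout) timeout="$value" ;;
            sync.strategy)  strategy="$value" ;;
        esac
    done

    # type is the only required field in the schema.
    case "$type" in
        docker|e2b|daytona|gitpod|local) ;;
        "") echo "config: 'type' is required" >&2; errors=$((errors + 1)) ;;
        *)  echo "config: unknown type '$type'" >&2; errors=$((errors + 1)) ;;
    esac

    # timeout is optional; when present it must be digits only and not 0.
    if [ -n "$timeout" ]; then
        case "$timeout" in
            *[!0-9]*|0)
                echo "config: timeout must be a positive integer" >&2
                errors=$((errors + 1)) ;;
        esac
    fi

    # sync.strategy is optional; when present it must be a known strategy.
    case "$strategy" in
        ""|snapshot|realtime|git) ;;
        *) echo "config: unknown sync strategy '$strategy'" >&2; errors=$((errors + 1)) ;;
    esac

    [ "$errors" -eq 0 ]
}
```

Collecting all errors before returning, rather than stopping at the first, is one way to give users a complete list of problems in a single run.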

Capability Detection

Backends report what features they support:

# Query backend capabilities
sandbox_capabilities docker
# Returns:
# {
#   "realtime_sync": true,
#   "network_policies": true,
#   "resource_limits": true,
#   "interactive_shell": true,
#   "persistence": true,
#   "cost_tracking": false
# }

sandbox_capabilities e2b
# Returns:
# {
#   "realtime_sync": false,
#   "network_policies": false,
#   "resource_limits": true,
#   "interactive_shell": true,
#   "persistence": false,
#   "cost_tracking": true
# }

Ralph adapts behavior based on capabilities:

  • Skip realtime sync if not supported
  • Skip network policies if not supported
  • Enable cost tracking UI if supported
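
A capability check along these lines could drive that adaptation. The helper name `sandbox_has_capability` is hypothetical, and the regex only handles the flat, boolean-only JSON shape shown in this issue; a real implementation would more likely use a JSON parser such as jq. Falling back from realtime to snapshot sync is also an assumption, since the issue only says realtime sync is skipped.

```bash
#!/usr/bin/env bash
# Hypothetical helper: test one flag in a backend's capabilities JSON.

# sandbox_has_capability <capabilities_json> <name>
# Succeeds when the flat JSON object maps <name> to true.
sandbox_has_capability() {
    local json="$1" name="$2"
    local pattern="\"${name}\"[[:space:]]*:[[:space:]]*true"
    [[ "$json" =~ $pattern ]]
}

# Example: adapt the sync strategy to what the backend reports.
e2b_caps='{ "realtime_sync": false, "network_policies": false, "cost_tracking": true }'
sync_strategy="realtime"
if ! sandbox_has_capability "$e2b_caps" realtime_sync; then
    sync_strategy="snapshot"   # assumed fallback, not specified in the issue
fi
```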

Adding a New Backend

To add a new sandbox backend (e.g., Gitpod):

  1. Create lib/sandbox/gitpod.sh
  2. Implement all interface functions
  3. Define backend-specific config options
  4. Register in lib/sandbox/registry.sh
  5. Add tests

Example minimal implementation:

#!/usr/bin/env bash
# lib/sandbox/gitpod.sh

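# Resolve relative to this file; "$0" would point at the calling script when sourced
source "$(dirname "${BASH_SOURCE[0]}")/interface.sh"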

gitpod_create() {
    local config="$1"
    # Gitpod-specific creation logic
    # Return session ID
}

gitpod_destroy() {
    local session_id="$1"
    # Gitpod-specific cleanup
}

gitpod_exec() {
    local session_id="$1"
    local cmd="$2"
    # Execute via Gitpod API
}

# ... implement all interface functions ...

gitpod_capabilities() {
    echo '{
        "realtime_sync": true,
        "network_policies": false,
        "resource_limits": true,
        "interactive_shell": true,
        "persistence": true,
        "cost_tracking": true
    }'
}

# Register this backend
register_sandbox_backend "gitpod" \
    gitpod_create \
    gitpod_destroy \
    gitpod_exec \
    gitpod_upload \
    gitpod_download \
    gitpod_logs \
    gitpod_status \
    gitpod_capabilities
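
The issue does not define `register_sandbox_backend` itself. A minimal sketch of what it might look like, assuming the positional order used in the call above and a bash 4.2+ associative array, is shown here; `SANDBOX_REGISTRY`, `SANDBOX_REGISTER_ORDER`, and `sandbox_backend_fn` are hypothetical names.

```bash
#!/usr/bin/env bash
# Hypothetical registry that could live in lib/sandbox/registry.sh (bash 4.2+).

# Maps "<backend>:<operation>" to the name of the implementing function.
declare -gA SANDBOX_REGISTRY

# Positional order of the function arguments, taken from the call above.
SANDBOX_REGISTER_ORDER=(create destroy exec upload download logs status capabilities)

# register_sandbox_backend <name> <fn>... (one function per operation, in order)
register_sandbox_backend() {
    local name="$1"
    shift
    if (( $# != ${#SANDBOX_REGISTER_ORDER[@]} )); then
        echo "register: '$name' needs ${#SANDBOX_REGISTER_ORDER[@]} functions, got $#" >&2
        return 1
    fi
    local op
    for op in "${SANDBOX_REGISTER_ORDER[@]}"; do
        SANDBOX_REGISTRY["$name:$op"]="$1"
        shift
    done
}

# sandbox_backend_fn <backend> <operation>
# Prints the registered function name, or fails if none is registered.
sandbox_backend_fn() {
    local key="$1:$2"
    local fn="${SANDBOX_REGISTRY[$key]:-}"
    [[ -n "$fn" ]] || return 1
    echo "$fn"
}
```

A purely positional signature is easy to get wrong as the interface grows (the call above already omits `exec_async`, `list`, `attach`, and `validate_config`), so a variant that takes `operation=function` pairs may be worth considering.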

Key Design Questions

  1. Interface Completeness

    • What's the minimal viable interface?
    • What optional methods should exist?
    • How to handle unsupported operations?
  2. Configuration Validation

    • Validate before attempting creation?
    • Backend-specific validation rules?
    • User-friendly error messages?
  3. Error Handling

    • Standard error codes across backends?
    • Retry policies?
    • Fallback to local execution on failure?
  4. Monitoring Abstraction

    • Unified log streaming interface
    • Status polling vs. push notifications
    • How does ralph-monitor adapt?
  5. Testing Strategy

    • Mock backend for testing core logic
    • Integration tests per backend
    • Capability-based test selection
  6. Plugin Discovery

    • Auto-discover backends in lib/sandbox/?
    • User-provided plugins from ~/.ralph/plugins/?
    • Plugin versioning?

Acceptance Criteria

  • Define abstract sandbox interface in interface.sh
  • Migrate Docker implementation to use interface
  • Migrate E2B implementation to use interface
  • Implement capability detection system
  • Unified configuration schema
  • Backend registration mechanism
  • Plugin discovery and loading
  • Fallback handling for unsupported features
  • Mock backend for testing
  • Documentation for creating new backends
  • Tests for interface compliance
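
The interface compliance criterion could be checked mechanically by asserting that a loaded backend defines every required function. This sketch assumes the `<backend>_<operation>` naming used in the Gitpod example and treats `attach` as optional because the interface marks it "if supported"; `SANDBOX_REQUIRED_OPS` and `sandbox_missing_ops` are hypothetical names.

```bash
#!/usr/bin/env bash
# Hypothetical compliance check for a backend that has already been sourced.

# Required operations from the Proposed Interface list (attach left out as optional).
SANDBOX_REQUIRED_OPS=(create destroy status exec exec_async upload download
                      list logs validate_config capabilities)

# sandbox_missing_ops <backend>
# Prints each required operation the backend does not define, one per line,
# and fails if any are missing.
sandbox_missing_ops() {
    local backend="$1" op missing=0
    for op in "${SANDBOX_REQUIRED_OPS[@]}"; do
        if ! declare -F "${backend}_${op}" >/dev/null; then
            echo "$op"
            missing=1
        fi
    done
    (( missing == 0 ))
}
```

A test suite could run `sandbox_missing_ops docker` after sourcing `docker.sh` and fail with the printed list when the backend is incomplete.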

Future Backends to Consider

  • Daytona - Self-hosted dev environments
  • Gitpod - Cloud-based workspaces
  • GitHub Codespaces - GitHub-integrated environments
  • Replit - Browser-based development
  • Modal - Serverless compute
  • Fly.io - Edge compute
  • Railway - App hosting with ephemeral environments

Dependencies

Related
