[P4] Phase 6.4 Sandbox Security and Resource Policies #77

@frankbria

Summary

Implement fine-grained security and resource control policies for sandbox execution. These policies let users restrict what a sandboxed Ralph session can do, limiting the damage from malicious or runaway code.

Problem Statement

Autonomous code execution is inherently risky. Even in a sandbox, code could:

  • Exfiltrate sensitive data over the network
  • Consume excessive resources (CPU, memory, disk)
  • Make unauthorized API calls
  • Access credentials/secrets improperly
  • Run indefinitely, wasting resources

Users need controls to limit the blast radius and enforce security boundaries.

Security Policy Dimensions

1. Network Policies

Control what network access the sandbox has.

# No network access (highest security)
ralph --sandbox docker --network none

# Allow only essential services
ralph --sandbox docker --network restricted
# Allows: api.anthropic.com, the npm registry, PyPI (full default allowlist below)

# Custom allowlist
ralph --sandbox docker --network-allow "api.anthropic.com,github.com"

# Custom denylist
ralph --sandbox docker --network-deny "*.internal.company.com"

# Full access (lowest security)
ralph --sandbox docker --network open

Restricted mode default allowlist:

  • api.anthropic.com (Claude API)
  • registry.npmjs.org (npm packages)
  • pypi.org, files.pythonhosted.org (pip packages)
  • github.com (if git operations needed)
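
The allowlist and denylist semantics above could be sketched as a single host check. This is a minimal illustration, not Ralph's implementation: the function name `host_allowed`, the rule that deny patterns always win, and the choice that a custom allowlist extends (rather than replaces) the restricted defaults are all assumptions.

```python
from fnmatch import fnmatchcase

# Default allowlist for "restricted" mode, copied from the list above.
RESTRICTED_DEFAULTS = [
    "api.anthropic.com",
    "registry.npmjs.org",
    "pypi.org",
    "files.pythonhosted.org",
    "github.com",
]


def host_allowed(host, mode="restricted", allow=(), deny=()):
    """Return True if an outbound connection to `host` should be permitted."""
    host = host.lower().rstrip(".")
    # Assumption: deny patterns take precedence in every mode.
    if any(fnmatchcase(host, pattern.lower()) for pattern in deny):
        return False
    if mode == "none":
        return False
    if mode == "open":
        return True
    # Assumption: custom allow entries extend the restricted defaults.
    patterns = list(RESTRICTED_DEFAULTS) + list(allow)
    return any(fnmatchcase(host, pattern.lower()) for pattern in patterns)
```

Note that a glob such as `*.internal.company.com` matches subdomains but not the bare `internal.company.com`; a real policy would need to decide whether that is intended.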

2. Filesystem Policies

Control filesystem access within the sandbox.

# Read-only project, writable output directory
ralph --sandbox docker --fs-policy restricted

# Specify writable paths explicitly
ralph --sandbox docker --writable "/workspace/output,/workspace/logs"

# Read-only root filesystem
ralph --sandbox docker --read-only-root

# Prevent deletion of certain files
ralph --sandbox docker --protect "PROMPT.md,@fix_plan.md"
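
One way to reason about `--writable` is as a containment check on paths. The sketch below is hypothetical (the helper name `is_writable` is not part of Ralph) and only illustrates why the comparison should be made on path components rather than string prefixes.

```python
from pathlib import PurePosixPath


def is_writable(path, writable_roots):
    """Return True if `path` is one of the writable roots or lies beneath one."""
    target = PurePosixPath(path)
    if ".." in target.parts:
        return False  # refuse paths that could escape a root once resolved
    for root in map(PurePosixPath, writable_roots):
        # Component-wise comparison: /workspace/output2 is NOT under /workspace/output.
        if target == root or root in target.parents:
            return True
    return False
```

With the roots from the example above, `/workspace/output/a.txt` is writable while `/workspace/output2/a.txt` and `/workspace/src/main.py` are not. Actual enforcement would still have to happen at the mount or kernel level; this check only models the policy.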

3. Resource Limits

Prevent resource exhaustion.

# Memory limit
ralph --sandbox docker --memory 4g
ralph --sandbox e2b --memory 8g

# CPU limit
ralph --sandbox docker --cpus 2

# Disk space limit
ralph --sandbox docker --disk 10g

# Execution time limit (entire session)
ralph --sandbox docker --max-duration 2h
ralph --sandbox e2b --max-duration 30m

# Per-loop time limit
ralph --sandbox docker --loop-timeout 15m

# Cost limit (for paid sandboxes)
ralph --sandbox e2b --max-cost 10.00
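
The size and duration values above (`4g`, `2h`, `15m`) need to be parsed before they can be enforced. A minimal sketch follows, assuming binary size multiples and the unit letters `k`/`m`/`g` and `s`/`m`/`h`; the function names and the accepted grammar are illustrative assumptions, not a defined Ralph format.

```python
import re

_SIZE_UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3}
_TIME_UNITS = {"s": 1, "m": 60, "h": 3600}


def parse_size(text):
    """Convert a size such as '4g' or '512m' to bytes (binary multiples)."""
    match = re.fullmatch(r"(\d+)([kmg])", text.strip().lower())
    if not match:
        raise ValueError(f"invalid size: {text!r}")
    return int(match.group(1)) * _SIZE_UNITS[match.group(2)]


def parse_duration(text):
    """Convert a duration such as '2h', '30m' or '1h30m' to seconds."""
    text = text.strip().lower()
    if not re.fullmatch(r"(\d+[smh])+", text):
        raise ValueError(f"invalid duration: {text!r}")
    pairs = re.findall(r"(\d+)([smh])", text)
    return sum(int(number) * _TIME_UNITS[unit] for number, unit in pairs)
```

Rejecting malformed values up front means a typo such as `--memory 4gb` fails at startup instead of silently running without a limit.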

4. Secret Management

Secure handling of credentials.

# Inject secrets from a file (not via environment variables)
ralph --sandbox docker --secrets-file ~/.ralph/secrets

# Inject specific secrets
ralph --sandbox docker --secret ANTHROPIC_API_KEY

# Use secret manager
ralph --sandbox docker --secrets-from aws-secrets-manager
ralph --sandbox docker --secrets-from 1password

Secret injection methods:

  • Mounted file (preferred): /run/secrets/ANTHROPIC_API_KEY
  • Docker secrets (for Docker sandbox)
  • Secure environment (cleared after read)

Secret hygiene:

  • Never log secrets
  • Mask secrets in output
  • Clear secrets from memory after use
  • No secrets in synced files
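
Masking secrets in output could look roughly like the following. This is a sketch under the assumption that the set of injected secret values is known to the process writing logs; `mask_secrets` and the `***` placeholder are hypothetical names.

```python
def mask_secrets(text, secrets, placeholder="***"):
    """Replace every occurrence of each secret value in `text` with a placeholder."""
    # Longest first, so a secret that contains another is masked as a whole.
    for value in sorted(set(secrets), key=len, reverse=True):
        if value:  # skip empty strings; replacing "" would corrupt the text
            text = text.replace(value, placeholder)
    return text
```

Plain substring replacement does not catch encoded forms of a secret (base64, URL-encoded), so it complements rather than replaces the rule of never logging secrets in the first place.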

5. Capability Restrictions

Fine-grained syscall/capability control (Docker-specific).

# Drop all capabilities, then add back only what is needed
ralph --sandbox docker --cap-drop ALL --cap-add NET_BIND_SERVICE

# No privilege escalation
ralph --sandbox docker --no-new-privileges

# User namespace isolation (remap container root to an unprivileged host user)
ralph --sandbox docker --userns-remap default
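
If the Docker sandbox is started through the `docker run` CLI (an assumption; the issue does not say how containers are launched), the capability options could translate to real Docker flags as sketched below. `--cap-drop`, `--cap-add`, and `--security-opt no-new-privileges` are existing `docker run` options; the helper `docker_security_args` is hypothetical.

```python
def docker_security_args(cap_drop=("ALL",), cap_add=(), no_new_privileges=True):
    """Build the `docker run` arguments for a capability policy."""
    args = []
    for capability in cap_drop:
        args += ["--cap-drop", capability]
    for capability in cap_add:
        args += ["--cap-add", capability]
    if no_new_privileges:
        # Prevents processes from gaining privileges via setuid binaries.
        args += ["--security-opt", "no-new-privileges"]
    return args
```

Defaulting to dropping `ALL` and enabling `no-new-privileges` keeps the secure configuration the one that requires no extra flags.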

Policy Presets

Predefined security profiles for common use cases.

# Maximum security (paranoid)
ralph --sandbox docker --policy paranoid
# - No network
# - Read-only filesystem
# - 1GB memory, 1 CPU
# - 30 minute timeout

# Standard security (default)
ralph --sandbox docker --policy standard
# - Restricted network (allowlist)
# - Writable /workspace
# - 4GB memory, 2 CPUs
# - 2 hour timeout

# Permissive (development)
ralph --sandbox docker --policy permissive
# - Full network
# - Full filesystem access
# - No resource limits
# - No timeout
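
The presets could be modelled as plain data with explicit flags layered on top. The sketch below copies the values from the comments above; the key names, the `filesystem` labels, and the rule that explicit overrides beat the preset are assumptions made for illustration.

```python
PRESETS = {
    "paranoid": {"network": "none", "filesystem": "read-only",
                 "memory": "1g", "cpus": 1, "max_duration": "30m"},
    "standard": {"network": "restricted", "filesystem": "workspace-writable",
                 "memory": "4g", "cpus": 2, "max_duration": "2h"},
    # None means "no limit" in this sketch.
    "permissive": {"network": "open", "filesystem": "full",
                   "memory": None, "cpus": None, "max_duration": None},
}


def resolve_policy(preset="standard", **overrides):
    """Start from a preset and apply explicit overrides on top of it."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset!r}")
    unknown = set(overrides) - set(PRESETS[preset])
    if unknown:
        raise ValueError(f"unknown policy keys: {sorted(unknown)}")
    # Build a new dict so the preset table itself is never mutated.
    return {**PRESETS[preset], **overrides}
```

For example, `resolve_policy("standard", memory="8g")` keeps the restricted network and 2-hour timeout while raising only the memory limit.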

Configuration File

Policies can be defined in .ralphrc or ~/.ralph/config.yaml:

sandbox:
  security:
    network:
      mode: restricted  # none | restricted | open
      allow:
        - api.anthropic.com
        - "*.npmjs.org"
      deny:
        - "*.internal.corp"
        
    filesystem:
      readOnlyRoot: true
      writable:
        - /workspace/output
        - /workspace/logs
      protected:
        - PROMPT.md
        - "@fix_plan.md"
        
    resources:
      memory: 4g
      cpus: 2
      disk: 10g
      maxDuration: 2h
      loopTimeout: 15m
      
    secrets:
      source: file  # file | env | aws-secrets-manager | 1password
      path: ~/.ralph/secrets
      inject:
        - ANTHROPIC_API_KEY
        - GITHUB_TOKEN
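
Validation on startup, as listed in the acceptance criteria, could operate on the parsed `sandbox.security` mapping. The sketch below assumes the file has already been loaded into nested dicts; the function name, the specific checks, and the `restricted` default are illustrative assumptions rather than a specification.

```python
NETWORK_MODES = {"none", "restricted", "open"}


def validate_security(security):
    """Return (errors, warnings) for a parsed `sandbox.security` mapping."""
    errors, warnings = [], []

    network = security.get("network", {})
    mode = network.get("mode", "restricted")
    if mode not in NETWORK_MODES:
        errors.append(f"network.mode must be one of {sorted(NETWORK_MODES)}, got {mode!r}")
    if mode == "none" and network.get("allow"):
        warnings.append("network.allow is ignored when network.mode is 'none'")
    if mode == "open":
        warnings.append("network.mode 'open' allows unrestricted outbound traffic")

    filesystem = security.get("filesystem", {})
    for path in filesystem.get("writable", []):
        if not str(path).startswith("/"):
            errors.append(f"filesystem.writable entries must be absolute: {path!r}")

    cpus = security.get("resources", {}).get("cpus")
    if cpus is not None and (not isinstance(cpus, (int, float)) or cpus <= 0):
        errors.append(f"resources.cpus must be a positive number, got {cpus!r}")

    return errors, warnings
```

Separating errors from warnings lets the sandbox refuse to start on invalid input while still surfacing risky but legal choices such as `mode: open`.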

Key Design Questions

  1. Default Policy

    • Should the default be restrictive or permissive?
    • Should different sandbox types have different defaults?
  2. Policy Validation

    • Validate policies before starting sandbox?
    • Warn on potentially dangerous configurations?
  3. Policy Enforcement

    • How to enforce network policies? (iptables, network namespaces)
    • How to enforce filesystem policies? (mounts, AppArmor, SELinux)
    • E2B may have different enforcement mechanisms
  4. Secret Rotation

    • Handle secret expiration mid-session?
    • Re-inject rotated secrets?
  5. Audit Logging

    • Log policy violations?
    • Log resource usage?
    • Security audit trail?
  6. Policy Inheritance

    • Project-level policies override user defaults?
    • Or user defaults take precedence for security?

Acceptance Criteria

  • Network policy: none, restricted, open modes
  • Network allowlist/denylist configuration
  • Filesystem read-only and writable path controls
  • Resource limits: memory, CPU, disk, time
  • Cost limits for cloud sandboxes
  • Secret injection via mounted files
  • Secret masking in logs
  • Policy presets (paranoid, standard, permissive)
  • Configuration file support for policies
  • Policy validation on startup
  • Tests for policy enforcement

Dependencies

  • Network namespaces / iptables (Docker)
  • Filesystem mounts and permissions
  • Docker capabilities system
  • Secret management infrastructure
