Copilot AI commented Feb 8, 2026

Adds an Android toolkit for agent interaction with Android devices/emulators, as part of the broader effort to support OS-level environments for computer and mobile use. The originally proposed UbuntuToolkit was dropped since it duplicates the existing TerminalToolkit.

AndroidToolkit

ADB-based toolkit providing 14 tools for device interaction:

  • Device management: get_device_info, execute_adb_command, get_current_activity
  • App lifecycle: install_app, uninstall_app, list_installed_apps, launch_app
  • Input simulation: tap, swipe, long_press, input_text, press_key
  • Screen observation: take_screenshot, get_ui_hierarchy (via uiautomator dump)

Supports multi-device targeting via device_serial and configurable timeouts.
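As a rough illustration of how device targeting and timeouts could work on top of ADB, here is a minimal sketch. The helper names `build_adb_argv` and `run_adb`, the default timeout, and the error strings are assumptions made for this example and are not taken from the PR's code; only `adb -s <serial>` and `subprocess.run(..., timeout=...)` are standard behavior.

```python
import subprocess
from typing import List, Optional


def build_adb_argv(args: List[str], device_serial: Optional[str] = None) -> List[str]:
    """Prefix an ADB command with `-s <serial>` when a device is targeted."""
    argv = ["adb"]
    if device_serial:
        # `adb -s <serial>` routes the command to one specific device.
        argv += ["-s", device_serial]
    return argv + list(args)


def run_adb(
    args: List[str],
    device_serial: Optional[str] = None,
    timeout: float = 30.0,  # hypothetical default, not from the PR
) -> str:
    """Run an ADB command and return stdout, or a readable error string."""
    argv = build_adb_argv(args, device_serial)
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return f"Error: command timed out after {timeout} seconds"
    except FileNotFoundError:
        return "Error: adb executable not found on PATH"
    if result.returncode != 0:
        return f"Error: {result.stderr.strip()}"
    return result.stdout.strip()
```

Returning an error string instead of raising is one possible design for agent-facing tools, since the agent can read the message and retry; the actual toolkit may handle failures differently.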

from camel.toolkits import AndroidToolkit

android = AndroidToolkit(device_serial="emulator-5554")
android.tap(500, 800)
android.take_screenshot("home.png")
android.get_ui_hierarchy()
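
The UI hierarchy is most useful when it feeds back into input actions. The sketch below assumes that `get_ui_hierarchy()` returns the XML text produced by `uiautomator dump`, in which each `node` element carries a `bounds="[left,top][right,bottom]"` attribute; the return type is an assumption, and `find_center_by_text` and `SAMPLE_XML` are hypothetical names introduced for illustration.

```python
import re
import xml.etree.ElementTree as ET
from typing import Optional, Tuple

# Matches the uiautomator bounds format: [left,top][right,bottom]
_BOUNDS = re.compile(r"\[(\d+),(\d+)\]\[(\d+),(\d+)\]")

# Hand-written sample in the shape of a uiautomator dump (not real device output).
SAMPLE_XML = (
    '<hierarchy rotation="0">'
    '<node index="0" text="" class="android.widget.FrameLayout" '
    'bounds="[0,0][1080,1920]">'
    '<node index="0" text="Settings" class="android.widget.TextView" '
    'bounds="[400,760][600,840]" />'
    "</node>"
    "</hierarchy>"
)


def find_center_by_text(hierarchy_xml: str, text: str) -> Optional[Tuple[int, int]]:
    """Return the center (x, y) of the first node whose text matches exactly."""
    root = ET.fromstring(hierarchy_xml)
    for node in root.iter("node"):
        if node.get("text") != text:
            continue
        match = _BOUNDS.fullmatch(node.get("bounds", ""))
        if match:
            left, top, right, bottom = map(int, match.groups())
            return ((left + right) // 2, (top + bottom) // 2)
    return None
```

With the sample above, `find_center_by_text(SAMPLE_XML, "Settings")` yields `(500, 800)`, which could then be passed to `android.tap(...)`.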

Files changed

  • camel/toolkits/android_toolkit.py — new toolkit
  • camel/toolkits/__init__.py — export AndroidToolkit
  • test/toolkits/test_android_toolkit.py — 28 unit tests with mocked ADB subprocess calls
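
For readers unfamiliar with the approach, a test with a mocked ADB subprocess call might look roughly like the sketch below. It does not reproduce the PR's tests: the `tap` function is a hypothetical stand-in for a toolkit method, and its return strings are assumptions. The point is only that `subprocess.run` is patched, so no device or `adb` binary is needed.

```python
import subprocess
from unittest import mock


def tap(x: int, y: int, device_serial: str = "emulator-5554") -> str:
    """Hypothetical stand-in for a toolkit method that shells out to ADB."""
    result = subprocess.run(
        ["adb", "-s", device_serial, "shell", "input", "tap", str(x), str(y)],
        capture_output=True,
        text=True,
        timeout=30,
    )
    if result.returncode != 0:
        return f"Error: {result.stderr.strip()}"
    return "ok"


def test_tap_invokes_adb_input_tap():
    # Pretend ADB exited successfully without touching a real device.
    fake = subprocess.CompletedProcess(args=[], returncode=0, stdout="", stderr="")
    with mock.patch("subprocess.run", return_value=fake) as run:
        assert tap(500, 800) == "ok"
    # The first positional argument is the argv list passed to subprocess.run.
    argv = run.call_args.args[0]
    assert argv == [
        "adb", "-s", "emulator-5554", "shell", "input", "tap", "500", "800"
    ]


def test_tap_reports_failure():
    # Simulate a non-zero exit code with an error message on stderr.
    fake = subprocess.CompletedProcess(
        args=[], returncode=1, stdout="", stderr="device offline\n"
    )
    with mock.patch("subprocess.run", return_value=fake):
        assert tap(1, 2) == "Error: device offline"
```
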
Original prompt

This section details the original issue you should resolve

<issue_title>[Feature Request] Add ubuntu and android toolkits and environments for computer and mobile use</issue_title>
<issue_description>### Required prerequisites

Motivation

Refer to:

Summary

Add full Ubuntu and Android execution environments—including toolkits, VMs/emulators, sandboxes, action/observation spaces, and orchestration—so CAMEL agents can operate across desktop and mobile ecosystems through GUI, API, or hybrid interaction modes.

This upgrade will enable richer multi-device autonomy and real-world agent behaviors.

Notes: some of the features are supported in https://github.com/camel-ai/camel/blob/master/camel/runtimes/ubuntu_docker_runtime.py and https://github.com/camel-ai/crab


🎯 Motivation

To build general autonomous agents, CAMEL needs platform-level environments where agents can interact with real operating systems, software, GUIs, and devices.
Right now, CAMEL lacks:

  • OS-specific toolkits for Ubuntu and Android
  • Full execution sandboxes
  • Standardized action/observation spaces for GUI, API, or hybrid control
  • Runtime orchestration across heterogeneous platforms
  • VNC/noVNC-based graphical access for agent visualization and debugging

Adding Ubuntu and Android support unlocks significant new research and practical applications.


📦 Proposed Additions


1. Ubuntu Toolkit + Execution Environment

Ubuntu Toolkit Capabilities

  • Standardized command execution
  • File system operations (read/write/search)
  • GUI automation (via pyautogui, X11, Wayland, or browser-based toolkit)
  • Package management (APT)
  • Networking tools
  • Optional agent MCP integrations
  • Execution of preinstalled software

Ubuntu VM / Sandbox / Runtime

  • Based on Ubuntu 22.04 LTS

  • Runs either as:

    • VM
    • Docker sandbox
    • Agent-safe runtime
  • With optional GUI stack using:

    • VNC server
    • noVNC browser-based access

Preinstalled Ubuntu Software

Potential defaults:

  • Python, Node, Java toolchains
  • Browsers (Firefox / Chromium)
  • Developer tools (git, curl, build-essential)
  • Automation packages (xdotool, wmctrl)
  • Optional AI/ML toolchains
  • Any desired MCPs or agent toolkits

2. Android Toolkit + Execution Environment

Android Toolkit Capabilities

  • ADB commands

  • App installation/removal

  • Input simulation: tap, swipe, long press

  • Typing/text events

  • Screenshot + screen recording

  • UI hierarchy extraction

  • Intent launching and permission control

  • Optional UI automation via:

    • uiautomator2
    • Appium
    • espresso (advanced)

Android Execution Environment

  • Android Emulator (x86/ARM)

  • Sandbox with ADB bridge

  • Optional GUI access via:

    • VNC server in emulator
    • noVNC in browser
  • Configurable:

    • Android version
    • Screen resolution
    • Device profile
    • Preinstalled apps

3. Action and Observation Spaces (GUI / API / Hybrid)

Action Spaces

Agents should be able to choose actions across different modalities:

GUI-Based Actions

  • Mouse movement/click
  • Keyboard events
  • Touch gestures (Android)
  • Window focus / switching

API-Based Actions

  • System commands
  • API calls exposed by toolkits
  • ADB commands
  • High-level task actions (e.g., “open browser”, “install package”)

Hybrid Actions

  • GUI fallback when API fails
  • API introspection + GUI execution
  • Multi-step execution chains across OS boundaries

Observation Spaces

  • Full-screen screenshots
  • Bounding-box detected UI elements
  • OCR text extraction
  • System logs
  • Output of terminal/ADB commands
  • Telemetry (CPU, RAM, network)
  • File system state

4. Orchestration Layer for Runtimes and Emulators

Centralized orchestration for:

  • Managing Ubuntu VMs/sandboxes
  • Launching Android emulators
  • Starting/stopping runtimes
  • Maintaining lifecycle of multiple environments
  • Synchronizing agent interactions
  • Logging, replay, and deterministic stepping

Possible orchestrator modes:

  • Local multi-runtime
  • Cluster/distributed runtimes
  • Dockerized
  • CI-friendly headless mode

5. Integration of VNC / noVNC

Why

To give agents and developers GUI visibility.

What to integrate

  • VNC servers for Ubuntu GUI
  • VNC embedded in Android Emulator
  • noVNC to expose GUI in browser
  • Agent-accessible screenshot + OCR utilities
  • Ability to switch betw...

Copilot AI changed the title from "[WIP] Add ubuntu and android toolkits for environments" to "Add UbuntuToolkit and AndroidToolkit for computer and mobile use" on Feb 8, 2026
Copilot AI requested a review from lightaime February 8, 2026 02:41
Copilot AI changed the title from "Add UbuntuToolkit and AndroidToolkit for computer and mobile use" to "Add AndroidToolkit for mobile device automation via ADB" on Feb 8, 2026
@lightaime lightaime marked this pull request as ready for review February 8, 2026 03:07
Development

Successfully merging this pull request may close these issues.

[Feature Request] Add ubuntu and android toolkits and environments for computer and mobile use
