Skip to content

Latest commit

 

History

History
476 lines (334 loc) · 15.6 KB

File metadata and controls

476 lines (334 loc) · 15.6 KB

Development Guide

Guide for contributing to HiClaw, building images locally, and running tests.

Prerequisites

  • Docker (for building and testing)
  • Git
  • mc (MinIO Client) for running integration tests
  • jq for JSON processing in test scripts

Project Structure

See ../AGENTS.md for a comprehensive codebase navigation guide.

Building Images Locally

All builds go through the root Makefile:

# Build both Manager and Worker images (native arch, for local dev/test)
make build

# Build only Manager
make build-manager

# Build only Worker
make build-worker

# Build with a specific version tag
make build VERSION=0.1.0

# Build for a specific platform
make build DOCKER_PLATFORM=linux/amd64

See all available targets with make help.

Push Images (Multi-Architecture by Default)

make push always builds multi-arch manifests (amd64 + arm64) to avoid accidentally overwriting multi-arch images with a single-arch image. This uses docker buildx:

# Build amd64 + arm64 and push to registry
make push VERSION=0.1.0 REGISTRY=ghcr.io REPO=higress-group/hiclaw

# Push only Manager (multi-arch)
make push-manager VERSION=latest

# Customize platforms (default: linux/amd64,linux/arm64)
make push MULTIARCH_PLATFORMS=linux/amd64,linux/arm64,linux/arm/v7

Note: make push always pushes directly to the registry (required by buildx multi-platform builds — multi-arch images cannot be stored in the local Docker image store). You must docker login to the target registry first.

For local development and testing, use make build which creates native-arch images locally. If you absolutely need to push single-arch images (not recommended), use make push-native.

Regional Registry Mirrors

All base images are hosted on the Higress Alibaba Cloud registry with regional mirrors that auto-sync from the primary (cn-hangzhou). Choose the nearest region for faster pulls:

Region Registry Usage
China (default) higress-registry.cn-hangzhou.cr.aliyuncs.com Default, no extra args needed
North America higress-registry.us-west-1.cr.aliyuncs.com make build HIGRESS_REGISTRY=higress-registry.us-west-1.cr.aliyuncs.com
Southeast Asia higress-registry.ap-southeast-7.cr.aliyuncs.com make build HIGRESS_REGISTRY=higress-registry.ap-southeast-7.cr.aliyuncs.com

Or pass it directly to Docker:

docker build --build-arg HIGRESS_REGISTRY=higress-registry.us-west-1.cr.aliyuncs.com -t hiclaw/manager-agent:latest ./manager/

Install / Uninstall / Replay

Quick Install (Minimal)

Only HICLAW_LLM_API_KEY is required — everything else uses sensible defaults:

# Build images + install Manager with one command
HICLAW_LLM_API_KEY="sk-xxx" make install

This will:

  1. Build Manager and Worker images (make build)
  2. Run the install script (install/hiclaw-install.sh manager)
  3. Mount the container runtime socket for direct Worker creation
  4. Save configuration to ./hiclaw-manager.env

Custom Install

Override any configuration via environment variables:

HICLAW_LLM_API_KEY="sk-xxx" \
HICLAW_LLM_PROVIDER="openai" \
HICLAW_DEFAULT_MODEL="gpt-4o" \
HICLAW_ADMIN_USER="myadmin" \
HICLAW_ADMIN_PASSWORD="mypassword" \
make install

Uninstall

make uninstall   # Stops Manager, removes all Worker containers, volume, and env file

Replay (Send Task to Manager)

After installing, send tasks to the Manager via the Matrix protocol:

# CLI mode: pass task as argument
make replay TASK="Create a Worker named alice for frontend development"

# Interactive mode: prompts for input
make replay

# Pipe mode: read from stdin
echo "Create worker bob" | ./scripts/replay-task.sh

The replay script:

  • Reads credentials from ./hiclaw-manager.env
  • Logs into Matrix as the admin user
  • Finds (or auto-creates) the DM room with the Manager
  • Sends the task message
  • Waits for and prints the Manager's reply (configurable via REPLAY_WAIT=0 to skip)
  • Saves conversation logs to logs/replay/replay-{timestamp}.log (view with make replay-log)

Test Against Installed Manager

After make install, run the test suite against the running Manager without rebuilding or creating a new container:

make test-installed

# Or with a test filter
make test-installed TEST_FILTER="01 02"

This reads credentials from ./hiclaw-manager.env and skips the container lifecycle (start/stop).

Running Tests

Full Integration Test Suite

# Build images + run all 10 test cases
export HICLAW_LLM_API_KEY="your-api-key"
make test

Run Specific Tests

# Run only tests 01, 02, 03
make test TEST_FILTER="01 02 03"

Skip Image Build

# Use existing images (faster iteration)
make test SKIP_BUILD=1

Quick Smoke Test

# Run test-01 only (quick health check)
make test-quick

GitHub Tests (08-10)

Tests 08-10 require a GitHub Personal Access Token:

export HICLAW_GITHUB_TOKEN="ghp_..."
make test TEST_FILTER="08 09 10"

Making Changes

Modifying Agent Behavior

Agent behavior is defined by markdown files, not code:

  • Manager SOUL: manager/agent/SOUL.md
  • Manager Heartbeat: manager/agent/HEARTBEAT.md
  • Skills: manager/agent/skills/*/SKILL.md

Modifying Startup Sequence

Each component has its own startup script in manager/scripts/init/:

  • Modify the relevant start-*.sh script
  • Rebuild the Manager image
  • Run tests to verify

Modifying Higress Configuration

Route, consumer, and MCP server initialization is in manager/scripts/init/setup-higress.sh. This runs once during Manager startup.

Adding a New MCP Server

  1. Add the server configuration to setup-higress.sh
  2. Create a Worker skill SKILL.md documenting available tools
  3. Update manager/agent/worker-agent/skills/ with the new skill
  4. Add test coverage in tests/

CI/CD

GitHub Actions Workflows

Workflow Trigger Purpose Arch
build.yml PRs to main Build only (no push, fast feedback) amd64
build.yml Push to main Multi-arch build + push amd64 + arm64
integration-test.yml After build succeeds on main Run full test suite amd64 (runner native)
release.yml Version tags v* Multi-arch build + push release images amd64 + arm64

All CI multi-arch builds use docker/setup-qemu-action for cross-platform emulation and docker buildx via make push.

Required Secrets

Secret Purpose
HICLAW_LLM_API_KEY LLM access for Agent behavior tests
HICLAW_GITHUB_TOKEN GitHub operations in tests 08-10

Local CI Simulation

# Same flow as CI but local (single arch)
export HICLAW_LLM_API_KEY="your-key"
make test

# Multi-arch build like CI does on main branch
docker login ghcr.io
make push VERSION=latest REGISTRY=ghcr.io REPO=higress-group/hiclaw

Network Proxy Setup (China Mainland)

Building images requires accessing GitHub (for git clone) and npm registries. In China mainland environments, a proxy is typically needed.

Host Proxy

Enable proxy in your shell before running commands:

# Enable proxy (adjust ports to your proxy config)
export http_proxy="http://127.0.0.1:1087"
export https_proxy="http://127.0.0.1:1087"
export ALL_PROXY="socks5://127.0.0.1:1086"

# IMPORTANT: exclude localhost from proxy, otherwise test health checks fail
export no_proxy="localhost,127.0.0.1,::1,local,169.254/16"

Docker Build Proxy

Docker build runs in an isolated environment and does NOT inherit host proxy settings. Pass proxy via DOCKER_BUILD_ARGS:

make build-manager DOCKER_BUILD_ARGS="--build-arg http_proxy=http://host.docker.internal:1087 --build-arg https_proxy=http://host.docker.internal:1087"

Note: host.docker.internal resolves to the host machine from within Docker containers. If your proxy listens on 127.0.0.1:1087 on the host, use host.docker.internal:1087 in build args.

Running Tests with Proxy

When running tests, no_proxy is critical — without it, test health checks to 127.0.0.1 go through the proxy and return 503:

export no_proxy="localhost,127.0.0.1,::1,local,169.254/16"
HICLAW_LLM_API_KEY="your-key" make test SKIP_BUILD=1

Container Runtime Socket (Direct Worker Creation)

When the Manager container is started with the host's container runtime socket mounted, it can create Worker containers directly — no human intervention needed for local deployments.

How It Works

The Manager detects the socket at startup and sets HICLAW_CONTAINER_RUNTIME=socket. The container-api.sh script provides functions to create/start/stop Worker containers via the Docker-compatible REST API (works with both Docker and Podman).

Socket Paths

Runtime Socket path (host) Mount command
Docker /var/run/docker.sock -v /var/run/docker.sock:/var/run/docker.sock
Podman (rootful, Linux) /run/podman/podman.sock -v /run/podman/podman.sock:/var/run/docker.sock --security-opt label=disable
Podman (macOS machine) Inside VM: /run/podman/podman.sock Same as rootful (VM provides symlink at /var/run/docker.sock)

Example: Manual Start with Socket

# Docker
docker run -d --name hiclaw-manager \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -e HICLAW_WORKER_IMAGE=hiclaw/worker-agent:latest \
  ... \
  hiclaw/manager-agent:latest

# Podman
podman run -d --name hiclaw-manager \
  -v /run/podman/podman.sock:/var/run/docker.sock \
  --security-opt label=disable \
  -e HICLAW_WORKER_IMAGE=hiclaw/worker-agent:latest \
  ... \
  hiclaw/manager-agent:latest

Test Integration

The test orchestrator (tests/run-all-tests.sh) automatically detects the socket and mounts it when available.

Security Note

Mounting the container runtime socket gives the container full control over the host's container runtime (equivalent to root access). This is acceptable for local development. In production, consider using more restrictive approaches like Podman socket activation with limited API access.

Key Technical Notes

Node.js Version

OpenClaw requires Node.js >= 22 (the --disable-warning flag used internally requires Node.js 21.3+). The Manager image is built on openclaw-base which already includes Node.js 22. The Worker Dockerfile copies Node.js 22 from a build stage.

  • Manager: Node 22 is provided by openclaw-base (the base image already includes it).
  • Worker: Node 22 binary copied from build stage replaces Ubuntu 24.04 apt's Node.js 18.x (which lacks --disable-warning support).

Higress AI Provider API

When creating a qwen type provider via the Higress Console API, you must include rawConfigs with Qwen-specific fields — otherwise the API returns "Missing Qwen specific configurations". The correct body is:

{
  "type": "qwen",
  "name": "qwen",
  "tokens": ["your-api-key"],
  "protocol": "openai/v1",
  "rawConfigs": {
    "qwenEnableSearch": false,
    "qwenEnableCompatible": true,
    "qwenFileIds": []
  }
}

For OpenAI-compatible providers (DeepSeek, etc.), use type=openai with rawConfigs.apiUrl pointing to the provider's endpoint. This is all handled automatically in manager/scripts/init/setup-higress.sh.

OpenClaw Skills Format

SKILL.md files must include a YAML front matter block, otherwise OpenClaw will not discover them:

---
name: my-skill-name
description: What this skill does and when to use it
---

# Skill Title
...content...

Skills placed in <workspace>/skills/<name>/SKILL.md are auto-discovered (source: openclaw-workspace).

OpenClaw Gateway Config

The openclaw.json must include gateway configuration for headless operation:

{
  "gateway": {
    "mode": "local",
    "port": 18799,
    "auth": { "token": "<some-token>" }
  }
}

Without gateway.mode=local, OpenClaw refuses to start. Without gateway.auth.token, it also refuses.

Code Style

  • Shell scripts: use ${VAR} syntax, functions for reusable logic
  • Config templates: ${VAR} placeholders, comments explaining each field
  • Skills (SKILL.md): Must include YAML front matter (name + description), self-contained with full API reference and examples
  • Tests: One file per acceptance case, source shared helpers, use assertion functions

Debugging Tips

View Manager Logs

Container name is hiclaw-manager (via make install) or hiclaw-manager-test (via make test).

# Component logs are split by service
docker exec hiclaw-manager cat /var/log/hiclaw/manager-agent.log     # startup + setup-higress
docker exec hiclaw-manager cat /var/log/hiclaw/manager-agent-error.log  # OpenClaw gateway stderr
docker exec hiclaw-manager cat /var/log/hiclaw/higress-console.log
docker exec hiclaw-manager cat /var/log/hiclaw/tuwunel.log

# OpenClaw runtime log (agent events, tool calls, LLM interactions)
docker exec hiclaw-manager bash -c 'cat /tmp/openclaw/openclaw-*.log' | jq .

View Replay Conversation Logs

# After running `make replay`, logs are saved automatically
make replay-log

# Logs directory: logs/replay/replay-{timestamp}.log

Check OpenClaw Skills Loading

docker exec hiclaw-manager bash -c \
  'OPENCLAW_CONFIG_PATH=/root/manager-workspace/openclaw.json openclaw skills list --json' \
  | jq '.skills[] | select(.source == "openclaw-workspace") | {name, eligible, description}'

Interactive Shell in Container

docker exec -it hiclaw-manager bash

Check Higress State

# Login (init uses "name", login uses "username")
curl -X POST http://localhost:18001/session/login \
  -H 'Content-Type: application/json' \
  -c /tmp/cookie \
  -d '{"username": "admin", "password": "your-password"}'

# List consumers
curl -s http://localhost:18001/v1/consumers -b /tmp/cookie | jq

# List routes
curl -s http://localhost:18001/v1/routes -b /tmp/cookie | jq

# List AI providers
curl -s http://localhost:18001/v1/ai/providers -b /tmp/cookie | jq

# List AI routes
curl -s http://localhost:18001/v1/ai/routes -b /tmp/cookie | jq

Check MinIO State

mc alias set test http://localhost:9000 <user> <password>
mc ls test/hiclaw-storage/ --recursive

Common Issues

Symptom Cause Fix
git clone hangs during docker build No proxy in build env Pass --build-arg http_proxy=... via DOCKER_BUILD_ARGS
Health checks return 503 http_proxy capturing localhost requests Set no_proxy=localhost,127.0.0.1,::1
OpenClaw: SyntaxError: Unexpected reserved word Node.js too old Ensure Manager uses openclaw-base image; Worker uses Node.js 22 from build stage
OpenClaw: requires Node >=22.0.0 Same as above Same as above
--disable-warning= is not allowed in NODE_OPTIONS Node.js < 21.3 (e.g., Ubuntu apt's v18) Ensure Worker uses Node.js 22 from build stage, not apt
OpenClaw: gateway.mode=local required Missing gateway config in openclaw.json Add "gateway": {"mode": "local", ...}
OpenClaw: no token is configured Missing gateway auth token Add "gateway": {"auth": {"token": "..."}}
Higress: Missing Qwen specific configurations type=qwen requires rawConfigs fields Include rawConfigs: {qwenEnableCompatible: true, ...} — see setup-higress.sh
Skills not loaded by OpenClaw Missing YAML front matter in SKILL.md Add ---\nname: ...\ndescription: ...\n---
setup-higress.sh crashes on restart set -e + curl -sf on "already exists" errors Use higress_api() helper which handles this gracefully
Higress setup runs again on restart, resets consumers Missing setup marker Check /data/.higress-setup-done; delete it only to force re-setup