Commit 768e724

Reconcilation, blueprint and node support (#160)

* support for reconcil
* improvements
* improved guide
* improvements
* improvements
* renamed to bleprint
* added support for nodes
* removed node.test
* fix to make test run
* improved doc
* typo in blueprint
1 parent 0991b54 commit 768e724

104 files changed

Lines changed: 11236 additions & 4921 deletions

Makefile

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@ all: build
 .PHONY: all build
 
 BUILD_IMAGE ?= colonyos/colonies
-PUSH_IMAGE ?= colonyos/colonies:v1.9.0
+PUSH_IMAGE ?= colonyos/colonies:v1.9.1
 
 VERSION := $(shell git rev-parse --short HEAD)
 BUILDTIME := $(shell date -u '+%Y-%m-%dT%H:%M:%SZ')

README.md

Lines changed: 4 additions & 4 deletions
@@ -19,7 +19,7 @@ Example of use cases:
 - **Data Processing**: ETL pipelines, batch processing, real-time stream processing with ColonyFS integration
 - **Industrial IoT**: Coordinate computations across factory floor devices, edge gateways, and cloud
 - **Earth Observation**: Automated satellite image processing and analysis workflows
-- **Infrastructure as Code**: Declaratively manage infrastructure across computing continuums - define resources spanning cloud, edge, HPC, and IoT with GitOps workflows, automatic drift detection, and self-healing reconciliation
+- **Infrastructure as Code**: Declaratively manage infrastructure across computing continuums - define services spanning cloud, edge, HPC, and IoT with GitOps workflows, automatic drift detection, and self-healing reconciliation
 
 ### The Core Idea
 
@@ -46,7 +46,7 @@ Instead of writing platform-specific code, you declare **WHAT** you want to comp
 - **Event-Driven**: Real-time WebSocket subscriptions for process state changes
 - **Scheduled Execution**: Cron-based and interval-based job scheduling
 - **Dynamic Batching**: Generators that pack arguments and trigger workflows based on counter or timeout conditions
-- **Resource Reconciliation**: Kubernetes-style declarative resource management with automatic drift detection and correction
+- **Service Reconciliation**: Kubernetes-style declarative service management with automatic drift detection and correction
 - **Full Audit Trail**: Complete execution history stored as an immutable ledger
 - **High Availability**: Etcd-based clustering with automatic failover
 - **Multi-Language SDKs**: Go, Rust, Python, Julia, JavaScript, Haskell
@@ -60,8 +60,8 @@ Instead of writing platform-specific code, you declare **WHAT** you want to comp
 - **Process**: Computational workload with states: WAITING → RUNNING → SUCCESS/FAILED
 - **FunctionSpec**: Specification defining what computation to run and execution conditions
 - **ProcessGraph**: Workflow represented as a Directed Acyclic Graph (DAG)
-- **Resource**: Declarative infrastructure specification with desired state management
-- **Reconciliation**: Automatic drift detection and correction that maintains resources in their desired state
+- **Service**: Declarative infrastructure specification with desired state management
+- **Reconciliation**: Automatic drift detection and correction that maintains services in their desired state
 
 ### How It Works
 

deployment/catalog/README.md

Lines changed: 191 additions & 0 deletions
@@ -0,0 +1,191 @@
+# Blueprint Catalog
+
+This directory contains reusable blueprint specifications for common deployments.
+
+## Files
+
+### executor-deployment-definition.json
+The **BlueprintDefinition** for ExecutorDeployment kind. This must be registered once (by colony owner) before creating ExecutorDeployment blueprints.
+
+**Register once:**
+```bash
+export COLONIES_PRVKEY=${COLONIES_COLONY_PRVKEY}
+colonies blueprint definition add --spec executor-deployment-definition.json
+```
+
+### docker-executor-deployment.json
+Deploys a docker executor specifically on the **local/main node**.
+
+**Key Settings:**
+- `executorType`: `docker-reconciler` - Requires a docker-reconciler
+- `executorName`: `local-docker-node-reconciler` - Targets the main node
+- `replicas`: 1 - Single executor instance
+
+**Deploy:**
+```bash
+colonies blueprint add --spec docker-executor-deployment.json
+```
+
+**Result:** The deployment will run specifically on the `local-docker-node-reconciler` (main node from colonies docker-compose).
+
31+
## Executor Targeting Examples
32+
33+
### Example 1: Target Specific Node (Pinned - Current)
34+
```json
35+
{
36+
"kind": "ExecutorDeployment",
37+
"metadata": {
38+
"name": "docker-executor"
39+
},
40+
"spec": {
41+
"executorType": "docker-reconciler",
42+
"executorName": "local-docker-node-reconciler" // Main node
43+
}
44+
}
45+
```
46+
✅ Guaranteed deployment on specific node
47+
⚠️ Fails if that reconciler is down
48+
49+
### Example 2: Any Node (Load Balanced)
50+
```json
51+
{
52+
"kind": "ExecutorDeployment",
53+
"metadata": {
54+
"name": "docker-executor-any"
55+
},
56+
"spec": {
57+
"executorType": "docker-reconciler"
58+
// No executorName - any reconciler can handle it
59+
}
60+
}
61+
```
62+
✅ High availability - survives individual node failures
63+
✅ Automatic load distribution
64+
⚠️ You don't control which node runs it
65+
66+
### Example 3: Target Local Node Alternative
67+
```json
68+
{
69+
"kind": "ExecutorDeployment",
70+
"metadata": {
71+
"name": "docker-executor-local"
72+
},
73+
"spec": {
74+
"executorType": "docker-reconciler",
75+
"executorName": "local-docker-node-reconciler" // Specific node
76+
}
77+
}
78+
```
79+
✅ Guaranteed deployment on specific node
80+
⚠️ Fails if that reconciler is down
81+
82+
**Available reconcilers in default setup:**
83+
- `local-docker-node-reconciler` - Main node (in colonies docker-compose)
84+
- `docker-reconciler-edge` - Edge node (in docker-reconciler docker-compose)
85+
86+
### Example 3: Target Edge Node
87+
```json
88+
{
89+
"kind": "ExecutorDeployment",
90+
"metadata": {
91+
"name": "docker-executor-edge"
92+
},
93+
"spec": {
94+
"executorType": "docker-reconciler",
95+
"executorName": "docker-reconciler-edge" // Edge datacenter
96+
}
97+
}
98+
```
99+
100+
See [../../executors/docker-reconciler/examples/](../../executors/docker-reconciler/examples/) for more examples.
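
The targeting rules in the examples above can be modeled as a small filter. The sketch below is an illustration only, not how ColonyOS itself assigns work: the `eligible_reconcilers` function and its `name:type` input format are hypothetical, and it assumes every listed reconciler is currently up.

```shell
#!/bin/bash
# Hypothetical model of the targeting rules described above (not ColonyOS code).
# Usage: eligible_reconcilers <executorType> <executorName-or-empty> <name:type>...
# Prints the names of reconcilers that could handle the blueprint, one per line.
eligible_reconcilers() {
    local want_type="$1" want_name="$2"
    shift 2
    local entry name type
    for entry in "$@"; do
        name="${entry%%:*}"   # text before the first ':'
        type="${entry#*:}"    # text after the first ':'
        # The reconciler must be of the requested executorType.
        [ "$type" = "$want_type" ] || continue
        # If executorName is set, only that exact reconciler qualifies (pinned).
        if [ -n "$want_name" ] && [ "$name" != "$want_name" ]; then
            continue
        fi
        printf '%s\n' "$name"
    done
}

# The two reconcilers listed for the default setup.
reconcilers=(
    "local-docker-node-reconciler:docker-reconciler"
    "docker-reconciler-edge:docker-reconciler"
)

# No executorName: prints both reconciler names.
eligible_reconcilers docker-reconciler "" "${reconcilers[@]}"

# Pinned to the edge node: prints only docker-reconciler-edge.
eligible_reconcilers docker-reconciler docker-reconciler-edge "${reconcilers[@]}"
```

An empty result for a pinned spec corresponds to the "fails if that reconciler is down" case: once the named reconciler drops out of the list, nothing else is eligible.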
101+
102+
## Usage Workflow
103+
104+
### 1. Register Blueprint Definition (One-time)
105+
```bash
106+
# As colony owner
107+
export COLONIES_PRVKEY=${COLONIES_COLONY_PRVKEY}
108+
colonies blueprint definition add --spec executor-deployment-definition.json
109+
```
110+
111+
### 2. Deploy Executor
112+
```bash
113+
# Deploy to any available node
114+
colonies blueprint add --spec docker-executor-deployment.json
115+
116+
# Check status
117+
colonies blueprint get --name docker-executor
118+
119+
# Watch reconciliation
120+
colonies process ps
121+
```
122+
123+
### 3. Scale Deployment
124+
```bash
125+
# Scale to 3 replicas
126+
colonies blueprint set --name docker-executor --key spec.replicas --value 3
127+
128+
# Scale down to 1
129+
colonies blueprint set --name docker-executor --key spec.replicas --value 1
130+
```
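
The scaling commands above pass the value straight to the CLI, so a typo goes through unchecked. A minimal wrapper sketch follows; `scale_blueprint` is a hypothetical helper, it only issues the `colonies blueprint set` invocation shown above, and it assumes that `spec.replicas` must be a non-negative integer (the README does not state the valid range).

```shell
#!/bin/bash
# Hypothetical wrapper around the `colonies blueprint set` command shown above.
# Assumption: spec.replicas must be a non-negative integer.
scale_blueprint() {
    local name="$1" replicas="$2"
    if [ -z "$name" ]; then
        echo "usage: scale_blueprint <blueprint-name> <replicas>" >&2
        return 2
    fi
    case "$replicas" in
        ''|*[!0-9]*)
            # Empty, or contains a character that is not a digit.
            echo "replicas must be a non-negative integer, got '$replicas'" >&2
            return 2
            ;;
    esac
    colonies blueprint set --name "$name" --key spec.replicas --value "$replicas"
}
```

For example, `scale_blueprint docker-executor 3` runs the first command from the section above, while `scale_blueprint docker-executor three` is rejected before the CLI is called.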
131+
132+
### 4. Update Image
133+
```bash
134+
colonies blueprint set --name docker-executor \
135+
--key spec.image --value colonyos/dockerexecutor:v1.0.8
136+
```
137+
138+
### 5. Monitor
139+
```bash
140+
# List all blueprints
141+
colonies blueprint ls
142+
143+
# View history
144+
colonies blueprint history --name docker-executor
145+
146+
# Check running executors
147+
colonies executor ls
148+
149+
# Check running containers (on reconciler node)
150+
docker ps --filter label=colonies.blueprint=docker-executor
151+
```
152+
153+
## Environment Configuration
154+
155+
The example includes complete environment configuration for:
156+
- **ColonyOS Connection**: Server host, port, security
157+
- **Colony Credentials**: Name and private key
158+
- **S3/MinIO Storage**: For file operations
159+
- **Executor Metadata**: Type, capabilities, location
160+
161+
All environment variables can be customized via blueprint updates:
162+
```bash
163+
colonies blueprint set --name docker-executor \
164+
--key spec.env.EXECUTOR_GPU --value 1
165+
```
166+
167+
## Network Configuration
168+
169+
**Important:** The examples use `COLONIES_SERVER_HOST=colonies-server` which works when:
170+
- Both reconciler and colonies-server are on the same Docker network
171+
- The network has the service name `colonies-server` defined
172+
173+
If running reconcilers outside Docker or on different networks, use:
174+
- `host.docker.internal` (Docker Desktop on Mac/Windows)
175+
- Host IP address (e.g., `192.168.1.100`)
176+
- Never use `localhost` inside containers
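
The "never use `localhost`" rule above can be checked before a spec is submitted. The sketch below is a hypothetical pre-flight helper, not part of the colonies CLI; the name `check_server_host` and the set of loopback spellings it rejects (`localhost`, `127.*`, `::1`) are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical pre-flight check for the value of COLONIES_SERVER_HOST.
# Returns non-zero for loopback names/addresses, which inside a container
# refer to the container itself rather than the Docker host.
check_server_host() {
    local host="$1"
    case "$host" in
        '')
            echo "server host is empty" >&2
            return 1
            ;;
        localhost|127.*|::1)
            echo "'$host' is a loopback address; use a service name, host.docker.internal, or the host IP" >&2
            return 1
            ;;
    esac
    return 0
}
```

A call such as `check_server_host "$COLONIES_SERVER_HOST" || exit 1` at the top of a deploy script would stop early on a loopback value. The check is purely textual and does not verify that the host actually resolves.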
177+
178+
## Volumes
179+
180+
The examples mount two volumes:
181+
1. `/var/run/docker.sock` - Required for Docker API access (Docker-in-Docker)
182+
2. `/tmp/colonies` - Shared filesystem for data exchange
183+
184+
**Security Note:** Mounting Docker socket gives container full Docker API access. Use `privileged: true` only when necessary.
185+
186+
## See Also
187+
188+
- [../../docs/Blueprints.md](../../docs/Blueprints.md) - Complete blueprint documentation
189+
- [../../docs/Reconciliation.md](../../docs/Reconciliation.md) - How reconciliation works
190+
- [../../executors/docker-reconciler/README.md](../../executors/docker-reconciler/README.md) - Reconciler documentation
191+
- [../../executors/docker-reconciler/examples/](../../executors/docker-reconciler/examples/) - More examples
Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
+{
+  "kind": "ExecutorDeployment",
+  "metadata": {
+    "name": "docker-executor"
+  },
+  "spec": {
+    "image": "colonyos/dockerexecutor:v1.0.7",
+    "replicas": 1,
+    "executorType": "docker-reconciler",
+    "executorName": "local-docker-node-reconciler",
+    "env": {
+      "LANG": "en_US.UTF-8",
+      "LANGUAGE": "en_US.UTF-8",
+      "LC_ALL": "en_US.UTF-8",
+      "LC_CTYPE": "UTF-8",
+      "TZ": "Europe/Stockholm",
+      "COLONIES_CLIENT_BACKENDS": "http",
+      "COLONIES_CLIENT_HTTP_HOST": "colonies-server",
+      "COLONIES_CLIENT_HTTP_PORT": "50080",
+      "COLONIES_CLIENT_HTTP_INSECURE": "true",
+      "COLONIES_SERVER_HOST": "colonies-server",
+      "COLONIES_SERVER_PORT": "50080",
+      "COLONIES_SERVER_TLS": "false",
+      "COLONIES_TLS": "false",
+      "COLONIES_COLONY_NAME": "dev",
+      "COLONIES_COLONY_PRVKEY": "ba949fa134981372d6da62b6a56f336ab4d843b22c02a4257dcf7d0d73097514",
+      "AWS_S3_TLS": "false",
+      "AWS_S3_SKIPVERIFY": "false",
+      "AWS_S3_ENDPOINT": "minio:9000",
+      "AWS_S3_ACCESSKEY": "RrXN2vcLeHjBptG8a3Ay",
+      "AWS_S3_SECRETKEY": "ivwLB0Luqomq65nNVmoo8fTBgxXgNvqYGC50VQN6",
+      "AWS_S3_REGION_KEY": "",
+      "AWS_S3_BUCKET": "colonies-prod",
+      "EXECUTOR_ADD_DEBUG_LOGS": "false",
+      "EXECUTOR_TYPE": "container-executor",
+      "EXECUTOR_GPU": "0",
+      "EXECUTOR_SW_NAME": "colonyos/dockerexecutor:v1.0.5",
+      "EXECUTOR_SW_TYPE": "docker",
+      "EXECUTOR_SW_VERSION": "colonyos/dockerexecutor:v1.0.5",
+      "EXECUTOR_HW_CPU": "",
+      "EXECUTOR_HW_MODEL": "n/a",
+      "EXECUTOR_HW_NODES": "1",
+      "EXECUTOR_HW_MEM": "",
+      "EXECUTOR_HW_STORAGE": "",
+      "EXECUTOR_HW_GPU_COUNT": "0",
+      "EXECUTOR_HW_GPU_MEM": "",
+      "EXECUTOR_HW_GPU_NODES_COUNT": "0",
+      "EXECUTOR_HW_GPU_NAME": "",
+      "EXECUTOR_LOCATION_LONG": "",
+      "EXECUTOR_LOCATION_LAT": "",
+      "EXECUTOR_LOCATION_DESC": "n/a",
+      "EXECUTOR_FS_DIR": "/tmp/colonies"
+    },
+    "volumes": [
+      {
+        "host": "/var/run/docker.sock",
+        "container": "/var/run/docker.sock"
+      },
+      {
+        "host": "/tmp/colonies",
+        "container": "/tmp/colonies"
+      }
+    ],
+    "privileged": true
+  }
+}

examples/resources/executor-deployment-definition.json renamed to deployment/catalog/executor-deployment-definition.json

Lines changed: 1 addition & 1 deletion
@@ -63,7 +63,7 @@
     "required": ["image", "executorType"]
   },
   "handler": {
-    "executorType": "deployment-controller",
+    "executorType": "docker-reconciler",
     "functionName": "reconcile"
   }
 }

deployment/cleanup.sh

Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
+#!/bin/bash
+
+# Cleanup script to remove all Docker containers, networks, and volumes
+# Use with caution - this will remove ALL Docker resources on your system
+
+echo "========================================="
+echo "Docker Cleanup Script"
+echo "========================================="
+echo ""
+
+# Stop all running containers
+echo "Stopping all running containers..."
+docker stop $(docker ps -aq) 2>/dev/null
+if [ $? -eq 0 ]; then
+    echo "✓ All containers stopped"
+else
+    echo "✓ No running containers found"
+fi
+echo ""
+
+# Remove all containers
+echo "Removing all containers..."
+docker rm -f $(docker ps -aq) 2>/dev/null
+if [ $? -eq 0 ]; then
+    echo "✓ All containers removed"
+else
+    echo "✓ No containers to remove"
+fi
+echo ""
+
+# Remove all networks (except default ones)
+echo "Removing all custom networks..."
+docker network prune -f 2>/dev/null
+if [ $? -eq 0 ]; then
+    echo "✓ All custom networks removed"
+else
+    echo "✓ No custom networks to remove"
+fi
+echo ""
+
+# Remove all volumes
+echo "Removing all volumes..."
+docker volume prune -f 2>/dev/null
+if [ $? -eq 0 ]; then
+    echo "✓ All volumes removed"
+else
+    echo "✓ No volumes to remove"
+fi
+echo ""
+
+# Remove all images (optional - uncomment if you want to remove images too)
+# echo "Removing all images..."
+# docker image prune -a -f 2>/dev/null
+# if [ $? -eq 0 ]; then
+# echo "✓ All images removed"
+# else
+# echo "✓ No images to remove"
+# fi
+# echo ""
+
+# System-wide cleanup
+echo "Running system-wide cleanup..."
+docker system prune -a -f --volumes 2>/dev/null
+echo "✓ System cleanup complete"
+echo ""
+
+echo "========================================="
+echo "Cleanup Complete!"
+echo "========================================="
+echo ""
+echo "Summary:"
+docker ps -a
+echo ""
+docker network ls
+echo ""
+docker volume ls
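
In the script above, `docker stop $(docker ps -aq)` is invoked even when no containers exist, and every non-zero exit status is reported as "No running containers found", so a real failure (for example, an unreachable Docker daemon) looks like success. The sketch below shows one way to separate those cases. `stop_all_containers` is a hypothetical helper that is not part of this commit; it relies only on the `docker ps -aq` and `docker stop` calls the script already uses.

```shell
#!/bin/bash
# Hypothetical helper: stop every container, distinguishing "nothing to do"
# from a real failure. Uses only `docker ps -aq` and `docker stop`.
stop_all_containers() {
    local ids
    # Capture the IDs first; bail out if the Docker daemon cannot be queried.
    if ! ids="$(docker ps -aq)"; then
        echo "✗ Could not list containers (is Docker running?)" >&2
        return 1
    fi
    if [ -z "$ids" ]; then
        echo "✓ No containers found"
        return 0
    fi
    # $ids is intentionally unquoted: one argument per container ID.
    # shellcheck disable=SC2086
    if docker stop $ids >/dev/null; then
        echo "✓ All containers stopped"
    else
        echo "✗ Failed to stop some containers" >&2
        return 1
    fi
}
```

The block only defines the function. Calling `stop_all_containers` in place of the first step would keep the script's output format while surfacing genuine errors on stderr with a non-zero status.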

deployment/docker/start_server.sh

Lines changed: 0 additions & 14 deletions
This file was deleted.
