When adding a new service to AIXCL, you must update two files, `services/docker-compose.yml` and `lib/cli/profile.sh`, so that the service is properly managed and started by the stack.
Define the service configuration:

    new-service:
      image: org/image:tag
      container_name: new-service
      pull_policy: if_not_present  # REQUIRED - aligns with other services
      volumes:
        - new-service-data:/data
      network_mode: host
      restart: unless-stopped
      healthcheck:
        test: ["CMD", "curl", "-f", "http://localhost:port/health"]
        interval: 30s
        timeout: 10s
        retries: 3

Security hardening (when applicable):

    cap_drop:
      - ALL
    cap_add:  # Only if entrypoint needs privilege dropping
      - SETUID
      - SETGID
    security_opt:
      - no-new-privileges:true
    read_only: true  # Only if service supports it
    tmpfs:
      - /tmp:noexec,nosuid,size=50m

Add the service to all applicable profile mappings in get_profile_services_for_profile():

    get_profile_services_for_profile() {
        local profile="$1"
        local engine="${INFERENCE_ENGINE:-ollama}"
        case "$profile" in
            usr)
                echo "$engine postgres"
                ;;
            dev)
                echo "$engine open-webui postgres pgadmin"
                ;;
            ops)
                echo "$engine postgres prometheus grafana loki cadvisor node-exporter postgres-exporter nvidia-gpu-exporter NEW-SERVICE"
                ;;
            sys)
                echo "$engine open-webui postgres pgadmin prometheus alertmanager grafana loki cadvisor node-exporter postgres-exporter nvidia-gpu-exporter NEW-SERVICE"
                ;;
        esac
    }

Also update the deprecated PROFILE_SERVICES array for backward compatibility:

    declare -A PROFILE_SERVICES=(
        [usr]="INFERENCE_ENGINE_PLACEHOLDER postgres"
        [dev]="INFERENCE_ENGINE_PLACEHOLDER open-webui postgres pgadmin"
        [ops]="INFERENCE_ENGINE_PLACEHOLDER postgres prometheus grafana loki cadvisor node-exporter postgres-exporter nvidia-gpu-exporter NEW-SERVICE"
        [sys]="INFERENCE_ENGINE_PLACEHOLDER open-webui postgres pgadmin prometheus alertmanager grafana loki cadvisor node-exporter postgres-exporter nvidia-gpu-exporter NEW-SERVICE"
    )

| Profile | Purpose | Include Service If... |
|---|---|---|
| usr | Minimal footprint | Required for basic runtime (e.g., postgres for persistence) |
| dev | Developer workstation | Developer/admin tool (e.g., pgadmin, Open WebUI) |
| ops | Observability | Monitoring/logging service (e.g., prometheus, grafana, cadvisor) |
| sys | Complete stack | All services except excluded (privileged or incompatible) |
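
To see which profiles a service ended up in, the profile function can be queried directly. The following is a minimal sketch, not part of AIXCL: `profile_has_service` and `profiles_with_service` are hypothetical helpers, and the copy of `get_profile_services_for_profile()` is abbreviated (shortened service lists, an added `*)` fallback) so that the snippet runs on its own.

```shell
#!/usr/bin/env bash
# Abbreviated stand-in for the real function in lib/cli/profile.sh.
# Assumption: service lists are trimmed for illustration only.
get_profile_services_for_profile() {
    local profile="$1"
    local engine="${INFERENCE_ENGINE:-ollama}"
    case "$profile" in
        usr) echo "$engine postgres" ;;
        dev) echo "$engine open-webui postgres pgadmin" ;;
        ops) echo "$engine postgres prometheus grafana new-service" ;;
        sys) echo "$engine open-webui postgres pgadmin prometheus grafana new-service" ;;
        *)   return 1 ;;  # unknown profile (fallback added for this sketch)
    esac
}

# Hypothetical helper: succeeds (exit 0) if SERVICE is listed for PROFILE.
profile_has_service() {
    local profile="$1" service="$2" s
    local services
    services="$(get_profile_services_for_profile "$profile")" || return 1
    for s in $services; do
        [ "$s" = "$service" ] && return 0
    done
    return 1
}

# Hypothetical helper: print every profile that includes SERVICE.
profiles_with_service() {
    local service="$1" p out=""
    for p in usr dev ops sys; do
        if profile_has_service "$p" "$service"; then
            out="${out:+$out }$p"
        fi
    done
    echo "$out"
}
```

With the trimmed lists above, `profiles_with_service new-service` reports the `ops` and `sys` profiles, which matches the table: a monitoring-style service belongs in `ops` and in the complete `sys` stack, but not in `usr` or `dev`.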

Before submitting a PR, verify:
- Service defined in `services/docker-compose.yml`
- `pull_policy: if_not_present` included
- Health check configured
- Security hardening applied (if compatible)
- Service added to `get_profile_services_for_profile()` for each applicable profile
- Service added to `PROFILE_SERVICES` array (backward compatibility)
- Volume created in `volumes:` section (if needed)
- Test with `./aixcl stack start --profile sys` - service starts
- Test with `./aixcl stack status` - service shows as healthy
- Test with `/platform` or `./aixcl stack status` - health endpoint responds

Common mistakes to avoid:
- Forgetting profile mappings - Service defined in docker-compose but never started
- Inconsistent profiles - Added to some profiles but not all applicable ones
- Missing volumes - Service expects volume not defined in volumes section
- No health check - Service runs but cannot be verified healthy
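
The first mistake in the list, a service that is defined but mapped to no profile, can be caught with a small check. This is a rough sketch under stated assumptions: `unmapped_services` is a hypothetical helper that does not exist in AIXCL, and the copy of `get_profile_services_for_profile()` is an abbreviated stand-in with trimmed service lists so the snippet is self-contained.

```shell
#!/usr/bin/env bash
# Abbreviated stand-in for the real function in lib/cli/profile.sh.
# Assumption: service lists are trimmed for illustration only.
get_profile_services_for_profile() {
    local profile="$1"
    local engine="${INFERENCE_ENGINE:-ollama}"
    case "$profile" in
        usr) echo "$engine postgres" ;;
        dev) echo "$engine open-webui postgres pgadmin" ;;
        ops) echo "$engine postgres prometheus grafana" ;;
        sys) echo "$engine open-webui postgres pgadmin prometheus grafana" ;;
        *)   return 1 ;;  # unknown profile (fallback added for this sketch)
    esac
}

# Hypothetical helper: read service names (one per line) on stdin and
# print each name that does not appear in any profile.
unmapped_services() {
    local all="" p service s found
    # Collect the services of every profile into one space-separated list.
    for p in usr dev ops sys; do
        all="$all $(get_profile_services_for_profile "$p")"
    done
    while IFS= read -r service; do
        [ -n "$service" ] || continue  # skip blank lines
        found=0
        for s in $all; do
            if [ "$s" = "$service" ]; then
                found=1
                break
            fi
        done
        [ "$found" -eq 1 ] || echo "$service"
    done
}
```

Assuming Docker Compose v2 is installed, the input could come from `docker compose -f services/docker-compose.yml config --services`, piped into `unmapped_services`. Services that are excluded from every profile on purpose, such as an inference engine that is not currently selected, would also be reported, so the output is a prompt for review rather than a hard failure.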

docker-compose.yml:

    alertmanager:
      image: prom/alertmanager:v0.28.0
      container_name: alertmanager
      pull_policy: if_not_present
      cap_drop:
        - ALL
      security_opt:
        - no-new-privileges:true
      read_only: true
      tmpfs:
        - /tmp:noexec,nosuid,size=100m
      volumes:
        - ../prometheus/alertmanager.yml:/etc/alertmanager/alertmanager.yml:ro
        - alertmanager-data:/alertmanager
      command:
        - '--config.file=/etc/alertmanager/alertmanager.yml'
        - '--storage.path=/alertmanager'
        - '--web.listen-address=127.0.0.1:9093'
      network_mode: host
      restart: unless-stopped
      healthcheck:
        test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://127.0.0.1:9093/-/healthy"]
        interval: 30s
        timeout: 10s
        retries: 3

lib/cli/profile.sh (sys profile):

    sys)
        echo "$engine open-webui postgres pgadmin prometheus alertmanager grafana loki cadvisor node-exporter postgres-exporter nvidia-gpu-exporter"
        ;;

volumes section in docker-compose.yml:

    volumes:
      # ... other volumes ...
      alertmanager-data:

- Profiles Documentation - Profile definitions and service composition
- Service Contracts - Service dependency rules
- Security Hardening - Container security controls