This document describes the REST API for the GCO Manifest Processor service.
- Base URL
- Authentication
- CLI Quick Reference
- API Endpoints
- Detailed Endpoint Documentation
- Job Queue (DynamoDB-backed)
- Job Templates
- Webhooks
- Error Responses
- Examples
The API is available at the API Gateway endpoint configured during deployment:
https://<api-gateway-endpoint>/api/v1All API requests are authenticated using AWS IAM Signature Version 4 (SigV4) at the API Gateway level. The API Gateway validates your AWS credentials and forwards authenticated requests to the backend services.
Important: Never manually set the X-GCO-Auth-Token header. This internal header is automatically injected by the API Gateway proxy after successful SigV4 authentication.
# Using awscurl (recommended)
pip install awscurl
awscurl --service execute-api \
--region us-east-1 \
"https://<api-gateway-endpoint>/api/v1/jobs"
# Or using curl with AWS credentials
curl "https://<api-gateway-endpoint>/api/v1/jobs" \
--aws-sigv4 "aws:amz:us-east-1:execute-api" \
--user "$AWS_ACCESS_KEY_ID:$AWS_SECRET_ACCESS_KEY"import requests
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest
import boto3
session = boto3.Session()
credentials = session.get_credentials()
request = AWSRequest(method='GET', url='https://<api-gateway-endpoint>/api/v1/jobs')
SigV4Auth(credentials, 'execute-api', 'us-east-1').add_auth(request)
response = requests.get(request.url, headers=dict(request.headers))See Client Examples for more detailed examples.
The gco CLI is the recommended way to interact with the API. Install it with:
pip install -e .Common commands:
# Job management
gco jobs submit job.yaml --region us-east-1
gco jobs list --region us-east-1
gco jobs list --all-regions
gco jobs get my-job --region us-east-1
gco jobs logs my-job --region us-east-1
gco jobs delete my-job --region us-east-1
# Global job queue (DynamoDB-backed)
gco queue submit job.yaml --region us-east-1
gco queue list --status queued
gco queue get <job-id>
gco queue cancel <job-id>
gco queue stats
# Templates
gco templates list
gco templates create job.yaml --name my-template
gco templates run my-template --name my-job --region us-east-1
# Webhooks
gco webhooks list
gco webhooks create --url https://example.com/hook -e job.completed
gco webhooks delete <webhook-id>| Method | Endpoint | Description |
|---|---|---|
| GET | /healthz |
Kubernetes liveness probe |
| GET | /readyz |
Kubernetes readiness probe |
| GET | /api/v1/health |
Detailed health check |
| GET | /api/v1/status |
Service status and configuration |
These endpoints query all regional clusters in parallel and return aggregated results.
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/v1/global/jobs |
List jobs across all regions |
| DELETE | /api/v1/global/jobs |
Bulk delete jobs across all regions |
| GET | /api/v1/global/health |
Health status across all regions |
| GET | /api/v1/global/status |
Cluster status across all regions |
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/manifests |
Submit manifests for processing |
| POST | /api/v1/manifests/validate |
Validate manifests without applying |
| GET | /api/v1/manifests/{ns}/{name} |
Get resource status |
| DELETE | /api/v1/manifests/{ns}/{name} |
Delete a resource |
| Method | Endpoint | Description | CLI Command |
|---|---|---|---|
| GET | /api/v1/jobs |
List jobs with pagination | gco jobs list -r REGION |
| GET | /api/v1/jobs/{ns}/{name} |
Get job details | gco jobs get NAME -r REGION |
| GET | /api/v1/jobs/{ns}/{name}/logs |
Get job logs | gco jobs logs NAME -r REGION |
| GET | /api/v1/jobs/{ns}/{name}/events |
Get job events | gco jobs events NAME -r REGION |
| GET | /api/v1/jobs/{ns}/{name}/pods |
Get job pods | gco jobs pods NAME -r REGION |
| GET | /api/v1/jobs/{ns}/{name}/pods/{pod}/logs |
Get specific pod logs | gco jobs pod-logs NAME POD -r REGION |
| GET | /api/v1/jobs/{ns}/{name}/metrics |
Get job resource metrics | gco jobs metrics NAME -r REGION |
| DELETE | /api/v1/jobs/{ns}/{name} |
Delete a job | gco jobs delete NAME -r REGION |
| DELETE | /api/v1/jobs |
Bulk delete jobs | gco jobs bulk-delete -r REGION |
| POST | /api/v1/jobs/{ns}/{name}/retry |
Retry a failed job | gco jobs retry NAME -r REGION |
The job queue provides centralized job submission with region targeting via DynamoDB.
| Method | Endpoint | Description | CLI Command |
|---|---|---|---|
| POST | /api/v1/queue/jobs |
Submit job to global queue | gco queue submit FILE -r REGION |
| GET | /api/v1/queue/jobs |
List queued jobs | gco queue list |
| GET | /api/v1/queue/jobs/{id} |
Get queued job details | gco queue get ID |
| DELETE | /api/v1/queue/jobs/{id} |
Cancel a queued job | gco queue cancel ID |
| GET | /api/v1/queue/stats |
Queue statistics | gco queue stats |
| POST | /api/v1/queue/poll |
Poll and process jobs (internal) | - |
| Method | Endpoint | Description | CLI Command |
|---|---|---|---|
| GET | /api/v1/templates |
List job templates | gco templates list |
| POST | /api/v1/templates |
Create a job template | gco templates create FILE -n NAME |
| GET | /api/v1/templates/{name} |
Get a template | gco templates get NAME |
| DELETE | /api/v1/templates/{name} |
Delete a template | gco templates delete NAME |
| POST | /api/v1/jobs/from-template/{name} |
Create job from template | gco templates run NAME -n JOB -r REGION |
| Method | Endpoint | Description | CLI Command |
|---|---|---|---|
| GET | /api/v1/webhooks |
List webhooks | gco webhooks list |
| POST | /api/v1/webhooks |
Register a webhook | gco webhooks create -u URL -e EVENT |
| DELETE | /api/v1/webhooks/{id} |
Delete a webhook | gco webhooks delete ID |
GET /api/v1/global/jobsList jobs across ALL regional clusters. This endpoint queries all regional ALBs in parallel and aggregates the results.
CLI:
gco jobs list --all-regions
gco jobs list -a --namespace gco-jobs --status runningQuery Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
namespace |
string | - | Filter by namespace |
status |
string | - | Filter by status |
limit |
int | 50 | Maximum jobs to return |
Response:
{
"total": 150,
"count": 50,
"limit": 50,
"regions_queried": 3,
"regions_successful": 3,
"region_summaries": [
{"region": "us-east-1", "count": 25, "total": 80},
{"region": "us-west-2", "count": 15, "total": 45},
{"region": "eu-west-1", "count": 10, "total": 25}
],
"jobs": [
{
"metadata": {"name": "job-1", "namespace": "gco-jobs"},
"_source_region": "us-east-1",
"computed_status": "running"
}
],
"errors": null
}GET /api/v1/global/healthGet health status across all regional clusters.
CLI:
gco jobs health --all-regionsResponse:
{
"overall_status": "healthy",
"healthy_regions": 3,
"total_regions": 3,
"regions": [
{
"region": "us-east-1",
"status": "healthy",
"cluster_id": "gco-us-east-1"
},
{
"region": "us-west-2",
"status": "healthy",
"cluster_id": "gco-us-west-2"
}
]
}DELETE /api/v1/global/jobsBulk delete jobs across all regional clusters.
CLI:
gco jobs bulk-delete --all-regions --status failed --older-than-days 30 --executeRequest Body:
{
"namespace": "gco-jobs",
"status": "completed",
"older_than_days": 7,
"dry_run": true
}Response:
{
"dry_run": false,
"total_matched": 25,
"total_deleted": 25,
"regions_queried": 3,
"region_results": [
{"region": "us-east-1", "matched": 15, "deleted": 15, "failed": 0},
{"region": "us-west-2", "matched": 10, "deleted": 10, "failed": 0}
],
"errors": null
}GET /api/v1/jobsList Kubernetes Jobs with pagination and filtering.
CLI:
# List jobs in a specific region
gco jobs list --region us-east-1
# List jobs across all regions
gco jobs list --all-regions
# Filter by namespace and status
gco jobs list -r us-west-2 -n gco-jobs --status running
# Limit results
gco jobs list -r us-east-1 --limit 10Query Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
namespace |
string | - | Filter by namespace |
status |
string | - | Filter by status (pending, running, completed, succeeded, failed) |
limit |
int | 50 | Maximum jobs to return (1-1000) |
offset |
int | 0 | Number of jobs to skip |
sort |
string | createdAt:desc | Sort field and order (field:asc|desc) |
label_selector |
string | - | Kubernetes label selector (e.g., app=test) |
Response:
{
"cluster_id": "gco-cluster",
"region": "us-east-1",
"timestamp": "2024-01-15T10:30:00Z",
"total": 100,
"limit": 50,
"offset": 0,
"has_more": true,
"count": 50,
"jobs": [
{
"metadata": {
"name": "training-job-001",
"namespace": "gco-jobs",
"creationTimestamp": "2024-01-15T10:00:00Z",
"labels": {"app": "ml-training"},
"uid": "abc123"
},
"spec": {
"parallelism": 1,
"completions": 1,
"backoffLimit": 6
},
"status": {
"active": 1,
"succeeded": 0,
"failed": 0,
"startTime": "2024-01-15T10:00:05Z",
"completionTime": null,
"conditions": []
},
"computed_status": "running"
}
]
}GET /api/v1/jobs/{namespace}/{name}/logsGet logs from a Job's pods with multi-container support.
CLI:
# Get logs from a job
gco jobs logs my-job --region us-east-1
# Get more lines
gco jobs logs training-job -r us-west-2 -n ml-jobs --tail 500
# Get logs from specific container
gco jobs logs multi-container-job -r us-east-1 --container sidecarQuery Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
container |
string | - | Container name (for multi-container pods) |
tail |
int | 100 | Number of lines from the end (1-10000) |
previous |
bool | false | Get logs from previous terminated container |
since_seconds |
int | - | Only return logs newer than N seconds |
timestamps |
bool | false | Include timestamps in log lines |
Response:
{
"cluster_id": "gco-cluster",
"region": "us-east-1",
"timestamp": "2024-01-15T10:30:00Z",
"job_name": "training-job-001",
"namespace": "gco-jobs",
"pod_name": "training-job-001-abc123",
"container": "main",
"available_containers": ["main", "sidecar"],
"init_containers": ["init-data"],
"previous": false,
"tail_lines": 100,
"logs": "2024-01-15 10:00:05 Starting training...\n2024-01-15 10:00:10 Epoch 1/10..."
}GET /api/v1/jobs/{namespace}/{name}/eventsGet Kubernetes events related to a Job and its pods.
CLI:
gco jobs events my-job --region us-east-1
gco jobs events training-job -n ml-jobs -r us-west-2Response:
{
"cluster_id": "gco-cluster",
"region": "us-east-1",
"timestamp": "2024-01-15T10:30:00Z",
"job_name": "training-job-001",
"namespace": "gco-jobs",
"count": 3,
"events": [
{
"type": "Normal",
"reason": "SuccessfulCreate",
"message": "Created pod: training-job-001-abc123",
"count": 1,
"firstTimestamp": "2024-01-15T10:00:00Z",
"lastTimestamp": "2024-01-15T10:00:00Z",
"source": {
"component": "job-controller",
"host": null
},
"involvedObject": {
"kind": "Job",
"name": "training-job-001",
"namespace": "gco-jobs"
}
}
]
}GET /api/v1/jobs/{namespace}/{name}/podsGet detailed information about all pods created by a Job.
CLI:
gco jobs pods my-job -r us-east-1
gco jobs pods training-job -n ml-jobs -r us-west-2Response:
{
"cluster_id": "gco-cluster",
"region": "us-east-1",
"timestamp": "2024-01-15T10:30:00Z",
"job_name": "training-job-001",
"namespace": "gco-jobs",
"count": 1,
"pods": [
{
"metadata": {
"name": "training-job-001-abc123",
"namespace": "gco-jobs",
"creationTimestamp": "2024-01-15T10:00:00Z",
"labels": {"job-name": "training-job-001"},
"uid": "pod-uid-123"
},
"spec": {
"nodeName": "ip-10-0-1-100.ec2.internal",
"containers": [{"name": "main", "image": "pytorch:latest"}],
"initContainers": []
},
"status": {
"phase": "Running",
"hostIP": "10.0.1.100",
"podIP": "10.0.2.50",
"startTime": "2024-01-15T10:00:05Z",
"containerStatuses": [
{
"name": "main",
"ready": true,
"restartCount": 0,
"image": "pytorch:latest",
"state": "running",
"startedAt": "2024-01-15T10:00:10Z"
}
],
"initContainerStatuses": []
}
}
]
}GET /api/v1/jobs/{namespace}/{name}/metricsGet resource usage metrics for a Job's pods (requires metrics-server).
CLI:
gco jobs metrics my-job --region us-east-1
gco jobs metrics training-job -n ml-jobs -r us-west-2Response:
{
"cluster_id": "gco-cluster",
"region": "us-east-1",
"timestamp": "2024-01-15T10:30:00Z",
"job_name": "training-job-001",
"namespace": "gco-jobs",
"summary": {
"total_cpu_millicores": 2500,
"total_memory_bytes": 4294967296,
"total_memory_mib": 4096.0,
"pod_count": 1
},
"pods": [
{
"pod_name": "training-job-001-abc123",
"containers": [
{
"name": "main",
"cpu_millicores": 2500,
"memory_bytes": 4294967296,
"memory_mib": 4096.0
}
]
}
]
}DELETE /api/v1/jobsBulk delete jobs based on filters.
CLI:
# Dry run (preview what would be deleted)
gco jobs bulk-delete --region us-east-1 --status completed --older-than-days 7
# Actually delete (use --execute)
gco jobs bulk-delete -r us-west-2 -n gco-jobs -s failed --execute -y
# Delete across all regions
gco jobs bulk-delete --all-regions --status failed --older-than-days 30 --executeRequest Body:
{
"namespace": "gco-jobs",
"status": "completed",
"older_than_days": 7,
"label_selector": "app=test",
"dry_run": true
}| Field | Type | Description |
|---|---|---|
namespace |
string | Filter by namespace |
status |
string | Filter by status |
older_than_days |
int | Delete jobs older than N days (1-365) |
label_selector |
string | Kubernetes label selector |
dry_run |
bool | If true, only return what would be deleted |
Response:
{
"cluster_id": "gco-cluster",
"region": "us-east-1",
"timestamp": "2024-01-15T10:30:00Z",
"dry_run": false,
"total_matched": 5,
"deleted_count": 5,
"failed_count": 0,
"jobs": [
{"name": "old-job-1", "namespace": "gco-jobs"},
{"name": "old-job-2", "namespace": "gco-jobs"}
],
"failed": null
}POST /api/v1/jobs/{namespace}/{name}/retryRetry a failed job by creating a new job from its spec.
CLI:
gco jobs retry failed-job --region us-east-1
gco jobs retry training-job -n ml-jobs -r us-west-2 -yResponse:
{
"cluster_id": "gco-cluster",
"region": "us-east-1",
"timestamp": "2024-01-15T10:30:00Z",
"original_job": "failed-job-001",
"new_job": "failed-job-001-retry-20240115103000",
"namespace": "gco-jobs",
"success": true,
"message": "Job retry created successfully",
"errors": []
}The job queue provides centralized job submission with region targeting. Jobs are stored in DynamoDB and picked up by regional manifest processors, enabling:
- Global job submission from any region
- Centralized job tracking and status updates
- Priority-based job scheduling
- Full job history and audit trail
POST /api/v1/queue/jobsSubmit a job to the global queue for regional pickup.
CLI:
# Submit job targeting us-east-1
gco queue submit job.yaml --region us-east-1
# Submit with priority
gco queue submit job.yaml -r us-west-2 --priority 50
# Submit with labels
gco queue submit job.yaml -r us-east-1 -l team=ml -l project=trainingRequest Body:
{
"manifest": {
"apiVersion": "batch/v1",
"kind": "Job",
"metadata": {"name": "my-training-job"},
"spec": {
"template": {
"spec": {
"containers": [{"name": "train", "image": "pytorch:latest"}],
"restartPolicy": "Never"
}
}
}
},
"target_region": "us-east-1",
"namespace": "gco-jobs",
"priority": 10,
"labels": {"team": "ml"}
}Response:
{
"timestamp": "2024-01-15T10:30:00Z",
"message": "Job queued successfully",
"job": {
"job_id": "abc123-def456-ghi789",
"job_name": "my-training-job",
"target_region": "us-east-1",
"namespace": "gco-jobs",
"status": "queued",
"priority": 10,
"submitted_at": "2024-01-15T10:30:00Z"
}
}GET /api/v1/queue/jobsList jobs in the global queue with optional filters.
CLI:
# List all queued jobs
gco queue list
# Filter by region and status
gco queue list --region us-east-1 --status queued
# Filter by status only
gco queue list -s runningQuery Parameters:
| Parameter | Type | Description |
|---|---|---|
target_region |
string | Filter by target region |
status |
string | Filter by status (queued, claimed, running, succeeded, failed, cancelled) |
namespace |
string | Filter by namespace |
limit |
int | Maximum results (default: 100) |
Response:
{
"timestamp": "2024-01-15T10:30:00Z",
"count": 5,
"jobs": [
{
"job_id": "abc123-def456",
"job_name": "training-job-1",
"target_region": "us-east-1",
"namespace": "gco-jobs",
"status": "queued",
"priority": 10,
"submitted_at": "2024-01-15T10:00:00Z"
}
]
}GET /api/v1/queue/jobs/{job_id}Get details of a specific queued job including full status history.
CLI:
gco queue get abc123-def456
gco queue get abc123-def456 --region us-east-1Response:
{
"timestamp": "2024-01-15T10:30:00Z",
"job": {
"job_id": "abc123-def456",
"job_name": "training-job-1",
"target_region": "us-east-1",
"namespace": "gco-jobs",
"status": "running",
"priority": 10,
"manifest": {"apiVersion": "batch/v1", "...": "..."},
"labels": {"team": "ml"},
"submitted_at": "2024-01-15T10:00:00Z",
"claimed_by": "us-east-1",
"claimed_at": "2024-01-15T10:00:05Z",
"k8s_job_uid": "k8s-uid-123",
"status_history": [
{"status": "queued", "timestamp": "2024-01-15T10:00:00Z", "message": "Job submitted"},
{"status": "claimed", "timestamp": "2024-01-15T10:00:05Z"},
{"status": "applying", "timestamp": "2024-01-15T10:00:06Z"},
{"status": "running", "timestamp": "2024-01-15T10:00:10Z"}
]
}
}DELETE /api/v1/queue/jobs/{job_id}Cancel a queued job. Only works for jobs in queued or claimed status.
CLI:
gco queue cancel abc123-def456
gco queue cancel abc123-def456 --reason "No longer needed"Query Parameters:
| Parameter | Type | Description |
|---|---|---|
reason |
string | Optional cancellation reason |
Response:
{
"timestamp": "2024-01-15T10:30:00Z",
"message": "Job 'abc123-def456' cancelled successfully"
}GET /api/v1/queue/statsGet job queue statistics grouped by region and status.
CLI:
gco queue statsResponse:
{
"timestamp": "2024-01-15T10:30:00Z",
"summary": {
"total_jobs": 150,
"total_queued": 10,
"total_running": 25
},
"by_region": {
"us-east-1": {
"queued": 5,
"running": 15,
"succeeded": 50,
"failed": 3
},
"us-west-2": {
"queued": 5,
"running": 10,
"succeeded": 40,
"failed": 2
}
}
}Templates allow you to define reusable job configurations with parameter substitution.
GET /api/v1/templatesCLI:
gco templates listPOST /api/v1/templatesCLI:
# Create from manifest file
gco templates create job.yaml --name gpu-training-template -d "GPU training template"
# With default parameters
gco templates create job.yaml -n my-template -p image=pytorch:latest -p gpus=4Request Body:
{
"name": "gpu-training-template",
"description": "Template for GPU training jobs",
"manifest": {
"apiVersion": "batch/v1",
"kind": "Job",
"metadata": {"name": "{{name}}"},
"spec": {
"template": {
"spec": {
"containers": [{
"name": "train",
"image": "{{image}}",
"resources": {
"limits": {"nvidia.com/gpu": "{{gpu_count}}"}
}
}],
"restartPolicy": "Never"
}
}
}
},
"parameters": {
"image": "pytorch/pytorch:latest",
"gpu_count": "1"
}
}POST /api/v1/jobs/from-template/{name}CLI:
# Create job from template
gco templates run gpu-training-template --name my-job --region us-east-1
# With parameter overrides
gco templates run gpu-template -n my-job -r us-east-1 -p image=custom:v1 -p gpus=8Request Body:
{
"name": "my-training-job",
"namespace": "gco-jobs",
"parameters": {
"image": "my-custom-image:v1",
"gpu_count": "4"
}
}Response:
{
"cluster_id": "gco-cluster",
"region": "us-east-1",
"timestamp": "2024-01-15T10:30:00Z",
"template": "gpu-training-template",
"job_name": "my-training-job",
"namespace": "gco-jobs",
"success": true,
"parameters_applied": {
"name": "my-training-job",
"image": "my-custom-image:v1",
"gpu_count": "4"
},
"errors": []
}Webhooks allow you to receive notifications when job events occur. The webhook dispatcher monitors Kubernetes jobs for status changes and sends HTTP POST requests to registered webhook endpoints.
When a job event occurs (started, completed, or failed), the webhook dispatcher:
- Detects the job status transition
- Queries matching webhooks from DynamoDB based on event type and namespace
- Sends HTTP POST requests to all matching webhook URLs
- Retries failed deliveries with exponential backoff (up to 3 attempts)
All webhook deliveries use the following JSON payload format:
{
"event": "job.completed",
"timestamp": "2024-01-15T12:00:00Z",
"cluster_id": "gco-cluster-us-east-1",
"region": "us-east-1",
"job": {
"name": "my-training-job",
"namespace": "gco-jobs",
"uid": "abc-123-def-456",
"labels": {"app": "ml-training", "team": "data-science"},
"status": "succeeded",
"start_time": "2024-01-15T11:55:00Z",
"completion_time": "2024-01-15T12:00:00Z",
"active": 0,
"succeeded": 1,
"failed": 0
}
}Each webhook request includes the following headers:
| Header | Description |
|---|---|
Content-Type |
application/json |
User-Agent |
GCO-Webhook/<cluster-id> |
X-GCO-Event |
Event type (e.g., job.completed) |
X-GCO-Cluster |
Cluster ID |
X-GCO-Region |
AWS region |
X-GCO-Signature |
HMAC-SHA256 signature (if secret configured) |
When a webhook has a secret configured, the payload is signed using HMAC-SHA256. The signature is included in the X-GCO-Signature header as sha256=<hex_digest>.
To verify the signature in your webhook handler:
import hmac
import hashlib
def verify_signature(payload: bytes, signature: str, secret: str) -> bool:
expected = hmac.new(
secret.encode('utf-8'),
payload,
hashlib.sha256
).hexdigest()
return hmac.compare_digest(f"sha256={expected}", signature)- 5xx errors: Retried up to 3 times with exponential backoff (5s, 10s, 20s)
- 4xx errors: Not retried (client error)
- Timeouts: Retried up to 3 times (default timeout: 30 seconds)
- Connection errors: Retried up to 3 times
GET /api/v1/webhooksCLI:
gco webhooks list
gco webhooks list --namespace gco-jobsPOST /api/v1/webhooksCLI:
# Register webhook for job events
gco webhooks create --url https://example.com/webhook -e job.completed -e job.failed
# Filter by namespace
gco webhooks create -u https://slack.com/webhook -e job.failed -n gco-jobs
# With HMAC secret for signature verification
gco webhooks create -u https://example.com/webhook -e job.completed --secret my-secret-keyRequest Body:
{
"url": "https://example.com/webhook",
"events": ["job.completed", "job.failed", "job.started"],
"namespace": "gco-jobs",
"secret": "optional-hmac-secret"
}Available Events:
job.started- Job started running (transitioned from pending to running)job.completed- Job completed successfullyjob.failed- Job failed
Response:
{
"timestamp": "2024-01-15T10:30:00Z",
"message": "Webhook registered successfully",
"webhook": {
"id": "abc12345",
"url": "https://example.com/webhook",
"events": ["job.completed", "job.failed"],
"namespace": "gco-jobs"
}
}DELETE /api/v1/webhooks/{id}CLI:
gco webhooks delete abc12345
gco webhooks delete abc12345 -y # Skip confirmationAll error responses follow this format:
{
"error": "Error type",
"detail": "Detailed error message",
"timestamp": "2024-01-15T10:30:00Z"
}Common HTTP Status Codes:
| Code | Description |
|---|---|
| 200 | Success |
| 201 | Created |
| 400 | Bad Request - Invalid input |
| 403 | Forbidden - Namespace not allowed |
| 404 | Not Found - Resource doesn't exist |
| 409 | Conflict - Resource already exists |
| 500 | Internal Server Error |
| 503 | Service Unavailable - Processor not ready |
The gco CLI handles authentication automatically using your AWS credentials.
# Submit a job
gco jobs submit job.yaml --region us-east-1
# Submit to global queue (DynamoDB-backed)
gco queue submit job.yaml --region us-east-1 --priority 10
# List jobs across all regions
gco jobs list --all-regions
# Get job logs
gco jobs logs my-job --region us-east-1 --tail 500
# Create and use a template
gco templates create job.yaml --name gpu-template -d "GPU training"
gco templates run gpu-template --name my-job --region us-east-1
# Register a webhook
gco webhooks create --url https://example.com/hook -e job.completed -e job.failed
# Check queue statistics
gco queue stats
# Bulk delete old completed jobs
gco jobs bulk-delete --all-regions --status completed --older-than-days 7 --executeAll examples below use awscurl for SigV4 authentication. Install with pip install awscurl.
awscurl --service execute-api --region us-east-1 \
-X POST "https://<api-gateway-endpoint>/api/v1/manifests" \
-H "Content-Type: application/json" \
-d '{
"manifests": [{
"apiVersion": "batch/v1",
"kind": "Job",
"metadata": {
"name": "my-job",
"namespace": "gco-jobs"
},
"spec": {
"template": {
"spec": {
"containers": [{
"name": "main",
"image": "python:3.11",
"command": ["python", "-c", "print(\"Hello World\")"]
}],
"restartPolicy": "Never"
}
}
}
}]
}'awscurl --service execute-api --region us-east-1 \
"https://<api-gateway-endpoint>/api/v1/jobs?namespace=gco-jobs&limit=10&offset=0&status=running"awscurl --service execute-api --region us-east-1 \
"https://<api-gateway-endpoint>/api/v1/jobs/gco-jobs/my-job/logs?container=main&tail=500×tamps=true"awscurl --service execute-api --region us-east-1 \
-X DELETE "https://<api-gateway-endpoint>/api/v1/jobs" \
-H "Content-Type: application/json" \
-d '{
"namespace": "gco-jobs",
"status": "completed",
"older_than_days": 7,
"dry_run": false
}'# Create template
awscurl --service execute-api --region us-east-1 \
-X POST "https://<api-gateway-endpoint>/api/v1/templates" \
-H "Content-Type: application/json" \
-d '{
"name": "python-job",
"manifest": {
"apiVersion": "batch/v1",
"kind": "Job",
"metadata": {"name": "{{name}}"},
"spec": {
"template": {
"spec": {
"containers": [{"name": "main", "image": "{{image}}", "command": {{command}}}],
"restartPolicy": "Never"
}
}
}
},
"parameters": {"image": "python:3.11"}
}'
# Create job from template
awscurl --service execute-api --region us-east-1 \
-X POST "https://<api-gateway-endpoint>/api/v1/jobs/from-template/python-job" \
-H "Content-Type: application/json" \
-d '{
"name": "my-python-job",
"namespace": "gco-jobs",
"parameters": {"command": "[\"python\", \"-c\", \"print(1+1)\"]"}
}'