Durable Skies

Durable Skies is a multi-agent drone delivery demo built on Google ADK and Temporal. A fleet of four autonomous drones executes delivery missions under the supervision of LLM-powered agents (Anthropic Claude). Every LLM call and every tool invocation runs as a durable Temporal Activity, so crashes, restarts, and deploys never lose state mid-mission.


Features

Durable agents — every ADK LLM call and tool call is recorded as a Temporal Activity, so agent reasoning is replayed deterministically after any crash.
Agents at decision points — a dispatcher agent picks the best drone for each incoming order, and an anomaly handler agent chooses a recovery action when an in-flight incident occurs. The mission itself is a deterministic activity loop.
Entity-per-drone orchestration — a FleetWorkflow supervisor routes orders to long-lived per-drone DroneWorkflow entities, each spawning a DeliveryWorkflow child per order. A per-order OrderWorkflow makes every order individually queryable in the Temporal UI.
Claude via LiteLLM — the Google ADK talks to Anthropic's Claude models through the LiteLLM adapter: Sonnet for decision-makers, Haiku for the dispatcher's analyst sub-agents.
Live operations frontend — a Nuxt 4 dashboard with the fleet map, a per-drone agent panel, and a streaming event log.
One-command local stack — a Compose file brings up a Temporal dev-server container (serving both the gRPC frontend and the built-in Web UI) alongside a Redis container used for live drone telemetry, the fleet event log, and the drone availability registry; make targets start the worker, the API, and the frontend.

Prerequisites

Docker (or a Compose-compatible runtime such as Podman) for the local stack
An ANTHROPIC_API_KEY for Claude

Getting Started

Clone the repo, set your API key, and launch the full stack with Compose:

git clone https://github.com/alexandreroman/durable-skies.git
cd durable-skies

cp .env.example .env
# Edit .env and set ANTHROPIC_API_KEY=sk-ant-...

make run   # or: docker-compose up

This brings up Temporal, Redis, the worker, the API, and the Nuxt frontend in one shot.

Open http://localhost:3000 and click Submit Orders to see a drone mission run end-to-end.

Usage

Submit an order programmatically. Valid pickup bases are base-north, base-south, and base-east; valid delivery points are dp-1 through dp-8:

curl -X POST http://localhost:8000/orders \
  -H 'Content-Type: application/json' \
  -d '{
    "id": "order-001",
    "pickup_base_id": "base-north",
    "dropoff_point_id": "dp-1",
    "payload_kg": 1.2,
    "created_at": "2026-04-22T10:00:00Z",
    "status": "pending"
  }'
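
The same request can be sent from Python. The following is a minimal sketch using only the standard library; the helper names `build_order` and `submit_order` are hypothetical, and the location check simply mirrors the identifiers listed above rather than the API's actual validation.

```python
import json
import urllib.request

# Identifiers listed in this README; the API itself remains the source of truth.
VALID_BASES = {"base-north", "base-south", "base-east"}
VALID_DROPOFFS = {f"dp-{i}" for i in range(1, 9)}


def build_order(order_id, pickup_base_id, dropoff_point_id, payload_kg, created_at):
    """Build the JSON body for POST /orders, rejecting unknown locations early."""
    if pickup_base_id not in VALID_BASES:
        raise ValueError(f"unknown pickup base: {pickup_base_id}")
    if dropoff_point_id not in VALID_DROPOFFS:
        raise ValueError(f"unknown delivery point: {dropoff_point_id}")
    return {
        "id": order_id,
        "pickup_base_id": pickup_base_id,
        "dropoff_point_id": dropoff_point_id,
        "payload_kg": payload_kg,
        "created_at": created_at,
        "status": "pending",
    }


def submit_order(order, api_base="http://localhost:8000"):
    """POST the order to the API and return the HTTP status code."""
    request = urllib.request.Request(
        f"{api_base}/orders",
        data=json.dumps(order).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

With the stack running, `submit_order(build_order("order-001", "base-north", "dp-1", 1.2, "2026-04-22T10:00:00Z"))` sends the same payload as the curl command.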

Inspect workflows and activities in the Temporal UI at http://localhost:8233 — each agent step shows up as a workflow or activity you can replay.

Configuration

Settings are read from environment variables or from a .env file at the project root. Only ANTHROPIC_API_KEY is required; every other setting has a default.

Variable	Description	Default
`ANTHROPIC_API_KEY`	Anthropic API key (required)	—
`TEMPORAL_ADDRESS`	Temporal frontend host:port	`localhost:7233`
`TEMPORAL_NAMESPACE`	Temporal namespace	`default`
`REDIS_URL`	Redis URL for telemetry, events, availability	`redis://localhost:6379/0`
`ANTHROPIC_MODEL`	Claude model for decision-making agents	`anthropic/claude-sonnet-4-6`
`ANTHROPIC_FAST_MODEL`	Claude model for the dispatcher's analyst sub-agents	`anthropic/claude-haiku-4-5`
`API_HOST`	FastAPI bind address	`0.0.0.0`
`API_PORT`	FastAPI listen port	`8000`

The Nuxt frontend reads NUXT_PUBLIC_API_BASE (default http://localhost:8000); set it if you serve the API on a different host.
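
To make the defaulting behaviour of the backend variables concrete, here is a minimal sketch that resolves them from an environment mapping. It is an illustration only: the `Settings` class and `load_settings` function are hypothetical names, the sketch does not parse a .env file, and the project's real settings loader may work differently.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container mirroring the variables in the table above."""
    anthropic_api_key: str
    temporal_address: str
    temporal_namespace: str
    redis_url: str
    anthropic_model: str
    anthropic_fast_model: str
    api_host: str
    api_port: int


def load_settings(env=None):
    """Resolve settings from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    api_key = env.get("ANTHROPIC_API_KEY")
    if not api_key:
        # The only variable without a default.
        raise RuntimeError("ANTHROPIC_API_KEY is required")
    return Settings(
        anthropic_api_key=api_key,
        temporal_address=env.get("TEMPORAL_ADDRESS", "localhost:7233"),
        temporal_namespace=env.get("TEMPORAL_NAMESPACE", "default"),
        redis_url=env.get("REDIS_URL", "redis://localhost:6379/0"),
        anthropic_model=env.get("ANTHROPIC_MODEL", "anthropic/claude-sonnet-4-6"),
        anthropic_fast_model=env.get("ANTHROPIC_FAST_MODEL", "anthropic/claude-haiku-4-5"),
        api_host=env.get("API_HOST", "0.0.0.0"),
        api_port=int(env.get("API_PORT", "8000")),
    )
```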

Development

For iterative work with hot-reload, run the backend and frontend directly on your host against the Compose-managed Temporal and Redis:

make -C backend install   # install Python deps
make infra-up             # start Temporal + Redis only
make dev                  # worker + API + frontend

make dev runs the worker, the API, and the frontend with hot-reload in one shot. You can also run make worker, make api, and make ui in separate terminals if you prefer.

This flow additionally requires Python 3.12+, uv, Node.js 20+, and pnpm on your host.

Architecture

graph TD
    FE[Frontend<br/>map · agent panel · event log]
    API[Backend]
    ORDER[OrderWorkflow<br/>per-order]
    FLEET[FleetWorkflow<br/>dispatcher]
    DRONE[DroneWorkflow<br/>per-drone entity]
    DELIV[DeliveryWorkflow<br/>per-order child]
    DISP[ADK Dispatcher Agent]
    ANOM[ADK Anomaly Agent]
    ACTS[Drone + world<br/>activities]
    TEMPORAL[(Temporal Service)]
    REDIS[(Redis)]
    CLAUDE[(Anthropic API)]

    FE <--> API
    API -->|signal / query| TEMPORAL
    API -->|read telemetry| REDIS
    TEMPORAL --> ORDER
    TEMPORAL --> FLEET
    TEMPORAL --> DRONE
    ORDER -->|signal order| FLEET
    FLEET --> DISP
    FLEET -->|signal| DRONE
    DRONE -->|child workflow| DELIV
    DELIV --> ANOM
    DELIV --> ACTS
    DISP -->|TemporalModel| CLAUDE
    ANOM -->|TemporalModel| CLAUDE
    DRONE -->|availability| REDIS
    ACTS -->|telemetry + events| REDIS

Module	Description
`backend`	Python package with the FastAPI HTTP API, Temporal workflows, activities, and ADK dispatcher + anomaly agents.
`frontend`	Nuxt 4 + Vue 3 + Tailwind 4 dashboard for monitoring the fleet.

Agents

Two ADK agents sit at the decision points of the fleet. Everything else — takeoff, navigation, pickup, dropoff, landing — runs as a deterministic Temporal activity loop with no LLM in the critical path.

Both agents run through the ADK × Temporal integration: every LLM call goes through TemporalModel, so each model invocation is recorded as a Temporal Activity and replayed deterministically after a crash. Each activity carries a human-readable summary (for example Dispatcher · Fleet analyst) so agent steps show up labelled in the Temporal UI.

The tools the agents expose — submit_dispatch and submit_recovery — are pure in-memory writes to ADK session state; they are not wrapped as activities because they carry no side effects. The workflow reads the decision back from session state after the agent run and branches on a validated string.
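
A tool of this kind can be sketched as a plain function that writes into a state mapping. The example below is an assumption-laden illustration: `StubToolContext` is a hypothetical stand-in for the context object ADK passes to tools, and the value of `DISPATCH_DECISION_KEY` is invented, since this README only names the constant.

```python
from dataclasses import dataclass, field

# Assumed value; the README only names the constant.
DISPATCH_DECISION_KEY = "dispatch_decision"


@dataclass
class StubToolContext:
    """Hypothetical stand-in for the tool context that carries session state."""
    state: dict = field(default_factory=dict)


def submit_dispatch(drone_id: str, reasoning: str, tool_context: StubToolContext) -> dict:
    """Record the agent's choice in session state: no I/O, no external side effects."""
    tool_context.state[DISPATCH_DECISION_KEY] = {
        "drone_id": drone_id,
        "reasoning": reasoning,
    }
    return {"status": "recorded"}
```

Because the function only mutates in-memory state, there is nothing for Temporal to retry or record, which is the rationale given above for not wrapping it as an activity.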

Dispatcher

Picks the best idle drone for each pending order. Invoked from FleetWorkflow whenever at least one drone is dispatchable (IDLE with battery > 40%). Source: agents/dispatcher.py.
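
The dispatchable rule can be written as a one-line filter. This sketch assumes a hypothetical `DroneState` record with a string status and a battery percentage; the real workflow's data model may differ.

```python
from dataclasses import dataclass

MIN_DISPATCH_BATTERY_PCT = 40.0  # threshold stated in this README


@dataclass(frozen=True)
class DroneState:
    """Hypothetical snapshot of one drone as seen by the fleet supervisor."""
    drone_id: str
    status: str
    battery_pct: float


def dispatchable(drones):
    """Keep drones that are IDLE with battery strictly above the threshold."""
    return [
        d for d in drones
        if d.status == "IDLE" and d.battery_pct > MIN_DISPATCH_BATTERY_PCT
    ]
```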

The dispatcher is a SequentialAgent with two stages:

1. Analysts: a ParallelAgent running two fast-model sub-agents (Haiku by default):
   - fleet_analyst summarizes the pool of idle drones (id, name, home base, battery).
   - order_analyst summarizes the pending order (pickup base, dropoff point, payload weight).
2. Picker: dispatcher_picker on the main model (Sonnet by default). It receives both analyses as template variables and picks one drone by calling submit_dispatch(drone_id, reasoning).
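
The shape of this pipeline, a parallel fan-out followed by a single picker, can be illustrated without ADK. The sketch below uses plain asyncio with stub functions in place of the LLM-backed sub-agents; the prompt text, the dictionary fields, and `build_picker_prompt` are all hypothetical and are not the project's ADK code.

```python
import asyncio

# Hypothetical picker prompt; both analyses arrive as template variables.
PICKER_PROMPT = (
    "Fleet analysis:\n{fleet_analysis}\n\n"
    "Order analysis:\n{order_analysis}\n\n"
    "Pick one drone."
)


async def fleet_analyst(drones):
    """Stub for the fast-model analyst that summarizes idle drones."""
    return "; ".join(
        f"{d['id']} at {d['home_base']} ({d['battery']}%)" for d in drones
    )


async def order_analyst(order):
    """Stub for the fast-model analyst that summarizes the pending order."""
    return f"{order['pickup_base_id']} -> {order['dropoff_point_id']}, {order['payload_kg']} kg"


async def build_picker_prompt(drones, order):
    # Stage 1: both analysts run concurrently (the role of the ParallelAgent).
    fleet_analysis, order_analysis = await asyncio.gather(
        fleet_analyst(drones), order_analyst(order)
    )
    # Stage 2: their outputs are substituted into the picker's prompt.
    return PICKER_PROMPT.format(
        fleet_analysis=fleet_analysis, order_analysis=order_analysis
    )
```

Running the analysts concurrently keeps the latency of stage 1 close to that of the slower analyst, while the picker still sees both summaries before deciding.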

The workflow reads the choice back from session state under DISPATCH_DECISION_KEY, validates the drone id against the current dispatchable list, and signals DroneWorkflow.assign_order. Any failure — LLM error, invalid id, session hiccup — falls back to a deterministic round-robin picker so orders keep flowing.
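
The read-back, validation, and fallback steps might look like the following sketch. The key value, `RoundRobinPicker`, and `resolve_dispatch` are hypothetical names chosen for illustration; in a real Temporal workflow the rotation counter would need to live in workflow state so that replay stays deterministic.

```python
import itertools

# Assumed value; the README only names the constant.
DISPATCH_DECISION_KEY = "dispatch_decision"


class RoundRobinPicker:
    """Deterministic fallback: hand out dispatchable drones in rotation."""

    def __init__(self):
        self._counter = itertools.count()

    def pick(self, dispatchable_ids):
        return dispatchable_ids[next(self._counter) % len(dispatchable_ids)]


def resolve_dispatch(state, dispatchable_ids, fallback):
    """Return the agent's pick if it is valid, otherwise the round-robin pick."""
    decision = state.get(DISPATCH_DECISION_KEY)
    drone_id = decision.get("drone_id") if isinstance(decision, dict) else None
    if drone_id in dispatchable_ids:
        return drone_id
    # Missing, malformed, or stale decision: keep orders flowing.
    return fallback.pick(dispatchable_ids)
```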

Anomaly handler

Picks a recovery action after an in-flight incident. Invoked from DeliveryWorkflow's exception handler when any activity in the mission loop raises (typically battery_critical during flight). Source: agents/anomaly.py.

The anomaly handler is a single main-model Agent (Sonnet by default). Its prompt describes the incident and includes live telemetry — current position, home-base distance, and nearest-base distance — read from Redis through the read_drone_telemetry activity. The agent picks one of three recovery actions by calling submit_recovery(action, reasoning):

Action	Behaviour
`abort_return_home`	Fly straight back to the drone's home base. Order fails.
`emergency_land_nearest_base`	Land at the closest base, which may not be home. Order fails.
`divert_to_recharge`	Fly to the nearest base, recharge, then fly home. Order fails.

submit_recovery coerces any unknown action to abort_return_home. If the agent run itself fails, the workflow also defaults to abort_return_home as a safety net, so the drone always has a defined recovery path. Recovery flights are executed through the fly_drone_to_base activity so the drone streams live telemetry on the way rather than teleporting.
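
The two safety nets can be sketched as follows. This is an illustration under assumptions: the state key, the function signatures, and `recovery_after_agent_run` are hypothetical, and only the three action names and the abort_return_home default come from this README.

```python
RECOVERY_ACTIONS = (
    "abort_return_home",
    "emergency_land_nearest_base",
    "divert_to_recharge",
)
DEFAULT_RECOVERY = "abort_return_home"
RECOVERY_DECISION_KEY = "recovery_decision"  # hypothetical key name


def submit_recovery(action, reasoning, state):
    """First safety net: coerce any unknown action to the default."""
    chosen = action if action in RECOVERY_ACTIONS else DEFAULT_RECOVERY
    state[RECOVERY_DECISION_KEY] = {"action": chosen, "reasoning": reasoning}
    return chosen


def recovery_after_agent_run(run_agent, state):
    """Second safety net: default if the agent run raises or records nothing."""
    try:
        run_agent(state)
    except Exception:
        return DEFAULT_RECOVERY
    decision = state.get(RECOVERY_DECISION_KEY)
    return decision["action"] if decision else DEFAULT_RECOVERY
```

Either way the caller ends up with one of the three known action strings, so the branch that follows never has to handle an unexpected value.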

License

This project is licensed under the Apache-2.0 License — see LICENSE for details.
