This document describes the TaskState enum, the state machine governing task
lifecycle, retry semantics, and how states appear in the dashboard.
A task is the unit of execution in Iris. Each job expands into one or more
tasks (controlled by replicas). Tasks are independently scheduled, retried,
and tracked. Job state is derived from task state counts -- there is no
independent job state machine.

    +-----------+
    |  PENDING  |<----------------------------+
    +-----+-----+                             |
          | dispatch to worker                |
          v                                   |
    +-----------+                             |
    | ASSIGNED  |                             |
    +-----+-----+                             |
          | worker starts task                |
          v                                   |
    +-----------+                             |
    | BUILDING  |                             |
    +-----+-----+                             |
          | build completes                   |
          v                                   |
    +-----------+                             |
    |  RUNNING  |                             |
    +-----+-----+                             |
          |                                   |
          +---------------+                   |
          |               |                   |
          v               v                   |
    +-----------+   +-----------+       +-----+-----+
    | SUCCEEDED |   |  FAILED   |------>|   retry   |
    +-----------+   +-----+-----+       +-----------+
                          |                   ^
                          | exhausted         |
                          v                   |
                      (terminal)              |
                                              |
                    +-----------+             |
                    |WORKER_FAIL|-------------+
                    +-----+-----+             |
                          | exhausted         |
                          v                   |
                      (terminal)              |
                                              |
                    +-----------+             |
                    |  preempt  |-------------+
                    |  (ctrl)   |
                    +-----+-----+
                          | exhausted
                          v
                    +-----------+
                    | PREEMPTED |
                    +-----------+

    Other terminal states: KILLED, UNSCHEDULABLE (never retried)

| State | Proto Value | Terminal | Retriable | Set By | Dashboard Display |
|---|---|---|---|---|---|
| `UNSPECIFIED` | 0 | -- | -- | Default zero value; never used in practice | `unspecified` (grey) |
| `PENDING` | 1 | No | -- | Job submission (`_on_job_submitted`), retry requeue (`_requeue_task`) | `pending` (amber) |
| `ASSIGNED` | 9 | No | -- | Scheduler dispatch (`_on_task_assigned` / `create_attempt`) | `assigned` (orange) |
| `BUILDING` | 2 | No | -- | Worker heartbeat report; worker sets this during bundle download and dependency sync | `building` (purple) |
| `RUNNING` | 3 | No | -- | Worker heartbeat report; worker sets this when user command starts | `running` (blue) |
| `SUCCEEDED` | 4 | Yes | No | Worker heartbeat report; task exited with code 0 | `succeeded` (green) |
| `FAILED` | 5 | Yes | Yes | Worker heartbeat report; task exited with non-zero code | `failed` (red) |
| `KILLED` | 6 | Yes | No | Controller: job cancellation (`_on_job_cancelled`), job failure cascade (`_mark_remaining_tasks_killed`), per-task timeout | `killed` (grey) |
| `WORKER_FAILED` | 7 | Yes | Yes | Controller: worker death cascade (`_on_worker_failed`), coscheduled sibling kill | `worker_failed` (purple) |
| `UNSCHEDULABLE` | 8 | Yes | No | Controller: scheduling timeout expired (`_mark_task_unschedulable`) | `unschedulable` (red) |
| `PREEMPTED` | 10 | Yes | Yes | Controller: priority preemption with budget exhausted (`preempt_task`) | `preempted` (orange) |

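
The table can be mirrored as a small Python enum. This is a minimal sketch, not the Iris definition: the name `TaskState` and the numeric values come from this document, while the `IntEnum` representation and the set names `TERMINAL_STATES` and `RETRIABLE_STATES` are assumptions made for illustration.

```python
from enum import IntEnum


class TaskState(IntEnum):
    """Illustrative mirror of the proto values listed in the table."""

    UNSPECIFIED = 0
    PENDING = 1
    BUILDING = 2
    RUNNING = 3
    SUCCEEDED = 4
    FAILED = 5
    KILLED = 6
    WORKER_FAILED = 7
    UNSCHEDULABLE = 8
    ASSIGNED = 9
    PREEMPTED = 10


# States marked "Yes" in the Terminal column (hypothetical set name).
TERMINAL_STATES = frozenset({
    TaskState.SUCCEEDED,
    TaskState.FAILED,
    TaskState.KILLED,
    TaskState.WORKER_FAILED,
    TaskState.UNSCHEDULABLE,
    TaskState.PREEMPTED,
})

# States marked "Yes" in the Retriable column (hypothetical set name).
RETRIABLE_STATES = frozenset({
    TaskState.FAILED,
    TaskState.WORKER_FAILED,
    TaskState.PREEMPTED,
})
```

Proto values do not follow lifecycle order: `ASSIGNED` is 9 although it precedes `BUILDING` (2), so states should not be compared numerically to decide how far a task has progressed.
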
The initial state for every task. Set in two contexts:

- **Job submission:** `_on_job_submitted` calls `expand_job_to_tasks`, which creates `ControllerTask` objects with `state=TASK_STATE_PENDING`. Tasks are enqueued into the priority-sorted scheduling queue.
- **Retry requeue:** `_requeue_task` resets `task.state` to `TASK_STATE_PENDING` and re-inserts the task into the scheduling queue. This happens after a retriable `FAILED` or `WORKER_FAILED` when retry budget remains.

Set by _on_task_assigned after the scheduler selects a worker and commits
resources. create_attempt creates a new ControllerTaskAttempt in
TASK_STATE_ASSIGNED state. The task is now bound to a specific worker and
consuming its resources.
The worker has not yet acknowledged the task -- it will receive the dispatch in the next heartbeat cycle.
Reported by the worker via heartbeat. The worker transitions internally:

- `PENDING -> BUILDING` when bundle download starts (`task_attempt.py:433`)
- Later, `BUILDING` again when dependency sync starts (`task_attempt.py:549`)

The controller processes this transition in complete_heartbeat. Note: if the
worker reports PENDING, the controller ignores it to prevent regressing an
ASSIGNED task and confusing the building-count backpressure window.
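
The regression guard described above can be sketched in a few lines. This is only an illustration of the rule, assuming states are passed as plain strings; `apply_reported_state` is a hypothetical helper, not the body of `complete_heartbeat`.

```python
def apply_reported_state(current: str, reported: str) -> str:
    """Return the state the controller keeps after a worker heartbeat report."""
    # A worker-side PENDING report is dropped so that a task the controller
    # already considers ASSIGNED never moves backwards.
    if reported == "PENDING":
        return current
    return reported
```
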
Reported by the worker via heartbeat after the user command starts executing
(task_attempt.py:570). The controller records started_at on the attempt.
Reported by the worker via heartbeat when the task process exits with code 0.
The controller sets exit_code=0, finished_at, and marks the task terminal.
No retry logic applies.
Reported by the worker via heartbeat when the task process exits with a non-zero code. Triggers retry evaluation:

- `handle_attempt_result` calls `_handle_failure`, which increments `failure_count` and compares against `max_retries_failure`.
- If `failure_count <= max_retries_failure`: returns `SHOULD_RETRY`. The caller (`_on_task_state_changed`) calls `_requeue_task`, which resets state to `PENDING` and re-enqueues the task. Resources are released from the current worker.
- If `failure_count > max_retries_failure`: returns `EXCEEDED_RETRY_LIMIT`. The task remains in `FAILED` state and is terminal. `error` and `exit_code` are recorded.

Set by the controller in three scenarios:

- **User cancellation:** `_on_job_cancelled` iterates non-terminal tasks and transitions each to `KILLED`. Tasks with workers assigned are queued for kill RPCs.
- **Job failure cascade:** When a job exceeds `max_task_failures`, `_finalize_job_state` calls `_mark_remaining_tasks_killed` to terminate all surviving tasks.
- **Parent job termination:** `_cancel_child_jobs` recursively cancels child jobs when a parent reaches a terminal state (except `SUCCEEDED`).

KILLED is always terminal and never retried.
Set by the controller when a worker dies. _on_worker_failed iterates all
tasks on the dead worker and emits TaskStateChangedEvent with
TASK_STATE_WORKER_FAILED for each non-terminal task.
Retry evaluation uses the preemption budget:

- `_handle_failure` increments `preemption_count` and compares against `max_retries_preemption` (default: 100).
- If budget remains: `SHOULD_RETRY` -- task is requeued to `PENDING`.
- If exhausted: `EXCEEDED_RETRY_LIMIT` -- task stays in `WORKER_FAILED` and is terminal.

Coscheduled jobs: When a task in a coscheduled (gang-scheduled) job fails
terminally, _cascade_coscheduled_failure exhausts the preemption budget of
all running siblings and transitions them to WORKER_FAILED (terminal). This
prevents other hosts from hanging on collective operations.
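
A rough sketch of the sibling cascade follows. It makes several assumptions: the `Task` dataclass and the function name are hypothetical, "running" is read as the `RUNNING` state only, and the budget is exhausted by pushing the counter past its limit. How `_cascade_coscheduled_failure` actually does this is not shown in this document.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical stand-in for a controller task record."""

    name: str
    state: str
    preemption_count: int = 0
    max_retries_preemption: int = 100


def cascade_coscheduled_failure(failed: Task, siblings: list) -> list:
    """Mark running siblings terminally WORKER_FAILED; return the affected tasks."""
    affected = []
    for task in siblings:
        if task is failed or task.state != "RUNNING":
            continue
        # Assumption: exhausting the budget means moving the counter past the
        # limit, so the WORKER_FAILED state can no longer be retried.
        task.preemption_count = task.max_retries_preemption + 1
        task.state = "WORKER_FAILED"
        affected.append(task)
    return affected
```
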
Set by the controller when a higher-priority task evicts a lower-priority
running task via preempt_task. The preemption loop (_run_preemption_pass)
selects victims from lower priority bands and calls preempt_task for each.
Retry evaluation uses `_resolve_task_failure_state` with the preemption budget:

- ASSIGNED tasks: always retry to PENDING regardless of budget (the task never started executing, so preemption is free).
- BUILDING or RUNNING tasks: `preemption_count` is incremented and compared against `max_retries_preemption`.
  - If `preemption_count <= max_retries_preemption`: task is requeued to `PENDING` for retry. The current attempt is marked `PREEMPTED`.
  - If `preemption_count > max_retries_preemption`: task state is set to `PREEMPTED` (terminal). Both the attempt and the task are `PREEMPTED`.

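
The preemption rules can be expressed as a small pure function. This is a sketch under stated assumptions: `resolve_preemption` is a hypothetical name, states are plain strings, and only the task-level state is returned (the attempt is marked `PREEMPTED` in every case and is omitted here).

```python
def resolve_preemption(state: str, preemption_count: int,
                       max_retries_preemption: int = 100):
    """Return (new_task_state, new_preemption_count) after a preemption."""
    if state == "ASSIGNED":
        # The task never started executing, so no budget is consumed.
        return "PENDING", preemption_count
    if state in ("BUILDING", "RUNNING"):
        preemption_count += 1
        if preemption_count <= max_retries_preemption:
            return "PENDING", preemption_count
        return "PREEMPTED", preemption_count
    raise ValueError(f"state {state!r} is not handled by this sketch")
```

With the default limit of 100, the hundredth preemption of a running task still requeues it, and the next one makes it terminal.
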
PREEMPTED is in both TERMINAL_TASK_STATES and FAILURE_TASK_STATES.
When a coscheduled task becomes terminally PREEMPTED, the job state is
recomputed. If all tasks in the job are terminal, _finalize_terminal_job
kills any remaining non-terminal tasks and cascades to child jobs. Note that
unlike WORKER_FAILED reported via heartbeat, preempt_task does not
directly cascade coscheduled siblings — the cascade only occurs through job
finalization.
Set by the controller's scheduling loop when a task's scheduling deadline
expires (_mark_task_unschedulable in controller.py). The deadline is
derived from the job's scheduling_timeout field.
UNSCHEDULABLE is always terminal. If any task becomes unschedulable, the
entire job transitions to JOB_STATE_UNSCHEDULABLE and all remaining tasks
are killed.
Iris maintains two independent retry budgets per task:
| Budget | Counter | Limit Field | Default | Trigger States |
|---|---|---|---|---|
| Failure | `failure_count` | `max_retries_failure` | 0 (no retries) | FAILED |
| Preemption | `preemption_count` | `max_retries_preemption` | 100 | WORKER_FAILED, PREEMPTED |

- Worker reports terminal state via heartbeat.
- `handle_attempt_result` delegates to `_handle_failure`.
- The appropriate counter is incremented.
- If `counter <= limit`: `TaskTransitionResult.SHOULD_RETRY`.
  - `_on_task_state_changed` calls `_requeue_task`.
  - Task state is reset to `PENDING`. A new attempt will be created when the scheduler re-dispatches.
  - Worker resources are released via `_cleanup_task_resources`.
- If `counter > limit`: `TaskTransitionResult.EXCEEDED_RETRY_LIMIT`.
  - Task remains in its failure state and is terminal.
  - `is_finished()` returns `True`.
  - The job's `_compute_job_state` may trigger a job-level state change (e.g., `JOB_STATE_FAILED` if `max_task_failures` is exceeded).

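
The two-budget decision can be sketched as follows. The names `TaskTransitionResult`, `SHOULD_RETRY`, `EXCEEDED_RETRY_LIMIT`, and the counter and limit fields come from this document; the `RetryBudgets` dataclass, the enum values, and the free function `handle_failure` are assumptions made to keep the example self-contained.

```python
from dataclasses import dataclass
from enum import Enum


class TaskTransitionResult(Enum):
    SHOULD_RETRY = "should_retry"
    EXCEEDED_RETRY_LIMIT = "exceeded_retry_limit"


@dataclass
class RetryBudgets:
    """Hypothetical container for the per-task counters and limits."""

    failure_count: int = 0
    preemption_count: int = 0
    max_retries_failure: int = 0
    max_retries_preemption: int = 100


def handle_failure(budgets: RetryBudgets, state: str) -> TaskTransitionResult:
    """Increment the counter matching `state` and compare it to its limit."""
    if state == "FAILED":
        budgets.failure_count += 1
        within = budgets.failure_count <= budgets.max_retries_failure
    elif state in ("WORKER_FAILED", "PREEMPTED"):
        budgets.preemption_count += 1
        within = budgets.preemption_count <= budgets.max_retries_preemption
    else:
        raise ValueError(f"{state!r} does not consume a retry budget")
    if within:
        return TaskTransitionResult.SHOULD_RETRY
    return TaskTransitionResult.EXCEEDED_RETRY_LIMIT
```

Because the counter is incremented before the comparison, a limit of `N` allows `N` retries: with the default `max_retries_failure` of 0, the first `FAILED` report is already terminal.
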
Only TASK_STATE_FAILED counts toward the job's max_task_failures threshold.
Worker failures and preemptions do not count. This means a job can survive
unlimited preemptions as long as the per-task preemption budget is not
exhausted. TASK_STATE_PREEMPTED and TASK_STATE_WORKER_FAILED are grouped
together for job state derivation: if all tasks are terminal and any are in
one of these states, the job becomes JOB_STATE_WORKER_FAILED.

- `SUCCEEDED`: task completed successfully
- `KILLED`: explicit termination by user or cascade
- `UNSCHEDULABLE`: scheduling timeout expired
- `PREEMPTED`: only when preemption budget is exhausted (otherwise retried as `PENDING`)

A task is considered finished (is_finished() == True) when:
| State | Condition |
|---|---|
| `SUCCEEDED` | Always finished |
| `KILLED` | Always finished |
| `UNSCHEDULABLE` | Always finished |
| `FAILED` | Finished when `failure_count > max_retries_failure` |
| `WORKER_FAILED` | Finished when `preemption_count > max_retries_preemption` |
| `PREEMPTED` | Finished when `preemption_count > max_retries_preemption` |

The distinction matters: a task in FAILED state with retry budget remaining
is in a terminal state at the attempt level but is not finished at the task
level. can_be_scheduled() returns True for such tasks.
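
The finished-state table translates directly into a predicate. The sketch below assumes a hypothetical `TaskView` dataclass with string states; only the `is_finished()` name and the counter and limit fields are taken from this document.

```python
from dataclasses import dataclass


@dataclass
class TaskView:
    """Hypothetical snapshot of the fields that decide whether a task is finished."""

    state: str
    failure_count: int = 0
    preemption_count: int = 0
    max_retries_failure: int = 0
    max_retries_preemption: int = 100

    def is_finished(self) -> bool:
        if self.state in ("SUCCEEDED", "KILLED", "UNSCHEDULABLE"):
            return True
        if self.state == "FAILED":
            return self.failure_count > self.max_retries_failure
        if self.state in ("WORKER_FAILED", "PREEMPTED"):
            return self.preemption_count > self.max_retries_preemption
        # PENDING, ASSIGNED, BUILDING and RUNNING are never finished.
        return False
```

For example, `TaskView("FAILED", failure_count=1, max_retries_failure=3)` is in a failure state but not finished, which is the case the paragraph above describes.
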
The dashboard uses stateToName() from shared/utils.js to convert proto enum
strings (e.g., TASK_STATE_RUNNING) to lowercase display names by stripping the
TASK_STATE_ prefix. Each name maps to a CSS class status-{name}:
| Display Name | CSS Class | Color |
|---|---|---|
| `pending` | `.status-pending` | Amber (#9a6700) |
| `assigned` | `.status-assigned` | Orange (#bc4c00) |
| `building` | `.status-building` | Purple (#8250df) |
| `running` | `.status-running` | Blue (#0969da) |
| `succeeded` | `.status-succeeded` | Green (#1a7f37) |
| `failed` | `.status-failed` | Red (#cf222e) |
| `killed` | `.status-killed` | Grey (#57606a) |
| `worker_failed` | `.status-worker_failed` | Purple (#8250df) |
| `unschedulable` | `.status-unschedulable` | Red (#cf222e) |
| `preempted` | `.status-preempted` | Orange (#bc4c00) |

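
The name mapping is simple enough to restate in Python for illustration. The real `stateToName()` lives in `shared/utils.js`; the functions below are a hypothetical port of the behaviour described here, and passing through strings that lack the `TASK_STATE_` prefix is an assumption.

```python
def state_to_name(proto_state: str) -> str:
    """Strip the TASK_STATE_ prefix and lowercase the remainder."""
    prefix = "TASK_STATE_"
    if proto_state.startswith(prefix):
        proto_state = proto_state[len(prefix):]
    return proto_state.lower()


def status_css_class(proto_state: str) -> str:
    """Build the status-{name} class used for the state badge."""
    return f"status-{state_to_name(proto_state)}"
```
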
The job detail page shows per-task attempt history. Each attempt has its own state badge, and worker failures are annotated with "(worker failure)" in the attempt rows.
Pending tasks display a pending_reason diagnostic below the state badge when
the controller can identify why the task cannot be scheduled (e.g., no workers
match constraints).
Job state is computed from task state counts in `_compute_job_state()`:

1. SUCCEEDED: All tasks are in `TASK_STATE_SUCCEEDED`.
2. FAILED: Count of `TASK_STATE_FAILED` tasks exceeds `max_task_failures`.
3. UNSCHEDULABLE: Any task is `TASK_STATE_UNSCHEDULABLE`.
4. KILLED: Any task is `TASK_STATE_KILLED` (and job is not already terminal).
5. RUNNING: Any task is `ASSIGNED`, `BUILDING`, or `RUNNING`.
6. PENDING: Default (no tasks have started).

The ordering matters -- earlier rules take priority. A job with one succeeded
task and one failed task (beyond tolerance) is FAILED, not RUNNING.
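
The rule ordering can be sketched as a chain of early returns. This is a simplified illustration, not `_compute_job_state()` itself: states are plain strings, the function name and signature are assumptions, and the sketch covers only the six rules listed above, so the "job is not already terminal" check and the `JOB_STATE_WORKER_FAILED` derivation are left out.

```python
from collections import Counter


def compute_job_state(task_states: list, max_task_failures: int) -> str:
    """Derive a job state from task states; earlier rules take priority."""
    counts = Counter(task_states)
    if task_states and counts["SUCCEEDED"] == len(task_states):
        return "SUCCEEDED"
    if counts["FAILED"] > max_task_failures:
        return "FAILED"
    if counts["UNSCHEDULABLE"] > 0:
        return "UNSCHEDULABLE"
    if counts["KILLED"] > 0:
        return "KILLED"
    if counts["ASSIGNED"] + counts["BUILDING"] + counts["RUNNING"] > 0:
        return "RUNNING"
    return "PENDING"
```

With `max_task_failures=0`, one succeeded and one failed task yields `FAILED`, while raising the tolerance to 1 lets a job with one failed and one running task remain `RUNNING`.
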