fix(api-v1): return real task.status in CreateTaskResponse, not hardcoded "pending" (#426)

AlexLiu190625 · web-flow · commit fc738eceaa3c · 2026-06-02T19:16:58.000+08:00
* fix(v1): return real task.status in CreateTaskResponse, not hardcoded "pending"

POST /v1/chat/tasks returned ``status="pending"`` in the response
body even though ``begin_turn`` had already atomically claimed the
row as RUNNING inside the same handler before the response was
sent. An SDK client doing POST followed by an immediate GET on the
same task would see two contradictory status values from
back-to-back calls (``pending`` then ``running``).

Read ``task.status.value`` after ``begin_turn`` returns instead --
``begin_turn`` refreshes the in-memory ``task`` after committing
the atomic claim, so the post-handler view reflects the row's real
state. ``AppendMessageResponse`` already followed this contract;
the create path now matches it.

Schema docstrings for CreateTaskResponse and AppendMessageResponse
updated to describe the post-claim ``running`` semantics; the
``test_create_task_happy_path`` assertion was checking for the old
``pending`` value and has been updated.

No behavior change in the orchestrator or scheduler -- this is a
response-payload correctness fix only. Twenty-eight v1 task tests
pass; pre-commit (ruff / mypy / isort / codespell) green.
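
As an illustration of the status read described above, here is a minimal
sketch. It assumes stand-in types: ``Task``, ``TaskStatus``, and this
``begin_turn`` are simplified placeholders, not the project's real ORM
model or helper, and ``create_task_response`` is a hypothetical name.

```python
from dataclasses import dataclass
from enum import Enum


class TaskStatus(Enum):
    PENDING = "pending"
    RUNNING = "running"


@dataclass
class Task:
    # Simplified stand-in for the ORM row (assumption, not the real model).
    id: int
    status: TaskStatus = TaskStatus.PENDING


def begin_turn(task: Task) -> None:
    """Stand-in for the atomic claim: flip PENDING -> RUNNING in place."""
    if task.status is not TaskStatus.PENDING:
        raise RuntimeError("task busy")
    task.status = TaskStatus.RUNNING


def create_task_response(task: Task) -> dict:
    begin_turn(task)
    # Read the status after the claim instead of asserting a fixed string,
    # so the response agrees with what an immediate GET would return.
    return {"task_id": task.id, "status": task.status.value}
```

With this shape, a later change to the state the claim produces is
reflected in the response without touching the handler.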

* fix(v1): update create_chat_task docstring + unify AppendMessageResponse status read

Follow-up to the CreateTaskResponse status fix in the previous
commit. Two consistency cleanups within the same handler module:

1. ``create_chat_task`` function docstring still described the
   return value as ``status='pending'`` -- inconsistent with the
   schema docstring and the implementation that the previous commit
   already moved to ``task.status.value`` (i.e. 'running'). Updated
   the Returns section to match.

2. ``append_message_to_task`` returned a hardcoded
   ``status="running"`` while ``create_chat_task`` reads
   ``task.status.value``. Unified both endpoints on the same
   pattern -- reading from the refreshed in-memory row is defensive
   (any future ``begin_turn`` status-machine change is picked up
   automatically) and removes the asymmetry where one endpoint
   reflected the DB and the other asserted a fixed string.

No behavior change for the current contract -- the previous commit
already returned 'running' from CREATE; this commit makes APPEND
share the same expression form and fixes the docstring drift.
Twenty-eight v1 task tests pass; pre-commit green.

* fix(web): persist real exception text to task.error_message on bg failure

``execute_task_background`` previously logged and broadcast the
exception text on failure but never wrote it to ``task.error_message``.
The row's status stayed RUNNING, and ``finish_turn``'s
RUNNING-fallback branch then unconditionally set
``error_message`` to a generic placeholder ("Task execution failed
without status update; see /steps."), forcing SDK and web clients
to fetch ``/steps`` to discover what actually went wrong.

The fix writes the real exception text to ``task.error_message`` and
flips ``status`` to ``FAILED`` in the exception handler, using a
fresh session because the original may be in a failed-transaction
state. ``finish_turn``'s FAILED branch then only fills in a
placeholder when ``error_message`` is empty, so the real message is
preserved through to the final row state.

Same family of fix as the ``status="pending"`` correction above:
make the persisted task row reflect the real outcome rather than a
generic placeholder. Affects both SDK consumers (GET /v1/chat/tasks/
{id}) and web/WebSocket consumers reading the same row.

Twenty-eight v1 task tests pass; pre-commit (ruff / mypy / isort /
codespell) green.
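
The interaction between the exception handler and ``finish_turn`` can be
sketched roughly as follows. ``TaskRow``, ``record_failure``, and this
``finish_turn`` are simplified, hypothetical stand-ins that only model
the two branches described above; the real code works on a database row
through a fresh session.

```python
from dataclasses import dataclass
from typing import Optional

PLACEHOLDER = "Task execution failed without status update; see /steps."


@dataclass
class TaskRow:
    # Simplified stand-in for the persisted task row (assumption).
    status: str = "running"
    error_message: Optional[str] = None


def record_failure(row: TaskRow, exc: Exception) -> None:
    """What the exception handler now does: persist the real cause."""
    row.status = "failed"
    row.error_message = str(exc)


def finish_turn(row: TaskRow) -> None:
    """Reconcile terminal fields at the end of a turn."""
    if row.status == "running":
        # RUNNING fallback: nobody recorded an outcome, use the placeholder.
        row.status = "failed"
        row.error_message = PLACEHOLDER
    elif row.status == "failed" and not row.error_message:
        # FAILED branch: fill the placeholder only when no message exists.
        row.error_message = PLACEHOLDER
```

Calling ``record_failure`` before ``finish_turn`` keeps the real message;
skipping it reproduces the old behavior, where only the placeholder
reached the row.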

* fix(web): scope bg-task failure handling by status, not by exception site

execute_task_background's outer except spans post-terminal steps
(assistant-message persistence and the completion/paused broadcasts)
that run after the task status was already committed COMPLETED. The
recently added FAILED-persistence wrote the row unconditionally, so a
failure in one of those best-effort steps -- e.g. the completion
broadcast losing its websocket -- rewrote an already-completed task as
FAILED and stored the broadcast error in error_message.

Branch the handler on the task's current status instead. Only a task
still RUNNING is a genuine execution failure: record the real exception
text, flip to FAILED, and emit task_error. A task already in a terminal
state tripped here in a best-effort post-completion step, so observe it
without touching the row or emitting a contradictory task_error;
finish_turn still reconciles the terminal fields afterward.
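
A minimal sketch of the status gate, assuming a simplified in-memory
row. ``TaskRow`` and ``handle_background_exception`` are hypothetical
names for illustration; the real handler re-reads the row through a
fresh session and broadcasts over the websocket manager.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskRow:
    # Simplified stand-in for the persisted task row (assumption).
    status: str
    error_message: Optional[str] = None


def handle_background_exception(row: TaskRow, exc: Exception) -> Optional[str]:
    """Return the event type to emit, or None when nothing should be sent."""
    if row.status != "running":
        # Already terminal: the exception came from a best-effort
        # post-completion step, so leave the row untouched.
        return None
    # Still RUNNING: a genuine execution failure.
    row.status = "failed"
    row.error_message = str(exc)
    return "task_error"
```

Gating on the row's status rather than on where the exception was raised
means new post-completion steps get the safe behavior by default.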

* fix(web): commit terminal task status only once the turn is durable

The success path committed COMPLETED/FAILED before persisting the
assistant message, which is a separate durable write. If that write
failed, the row was left COMPLETED with no message and no error_message
-- the status-gated failure handler treated it as a best-effort
post-completion step and left it untouched.

Leave the terminal status uncommitted in the session and let
persist_assistant_message's commit land it atomically with the message.
A failure in that durable write now leaves the committed status RUNNING,
so the outer handler surfaces a real task failure instead of a
contradictory empty COMPLETED row. Only notification broadcasts remain
best-effort.

* fix(web): land terminal status on empty-reply turns too

The previous change rode the terminal-status commit on
persist_assistant_message's internal commit. But that helper
early-returns without committing when the assistant content is empty (a
valid empty-reply turn). The terminal status was then never committed,
the row stayed RUNNING, and finish_turn flipped a successful empty turn
to FAILED.

Add an explicit commit after persistence. It lands the terminal status
whether or not a message row was written, while still surfacing a real
failure when persistence raises (control never reaches the explicit
commit, so the status stays uncommitted and the outer except fails it).
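
The commit ordering can be sketched with a toy session that separates
staged from committed state. ``FakeSession``, ``complete_turn``, and the
``fail`` flag are hypothetical, and this ``persist_assistant_message``
only imitates the early-return and internal-commit behavior described
above; it is not the project's helper.

```python
from typing import Optional


class FakeSession:
    """Toy session: a status is visible to readers only after commit()."""

    def __init__(self) -> None:
        self.committed_status = "running"
        self.staged_status: Optional[str] = None

    def commit(self) -> None:
        if self.staged_status is not None:
            self.committed_status = self.staged_status
            self.staged_status = None


def persist_assistant_message(
    session: FakeSession, content: str, fail: bool = False
) -> None:
    if not content:
        return  # empty reply: early return WITHOUT committing
    if fail:
        raise RuntimeError("message write failed")
    session.commit()  # commits the message and any staged status together


def complete_turn(session: FakeSession, content: str, fail: bool = False) -> None:
    session.staged_status = "completed"  # staged, not yet committed
    persist_assistant_message(session, content, fail)
    # Explicit commit: lands the status even when no message row was written.
    # Not reached if persistence raised, so the committed status stays RUNNING.
    session.commit()
```

In this sketch the explicit commit is a no-op when the helper already
committed, which is why it can run unconditionally after persistence.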
diff --git a/src/xagent/web/api/v1/tasks.py b/src/xagent/web/api/v1/tasks.py
@@ -95,8 +95,10 @@ async def create_chat_task(
 
     Returns:
         :class:`CreateTaskResponse` with the new ``task_id``,
-        ``agent_id``, ``status='pending'``, and ``created_at`` for the
-        caller to start polling from.
+        ``agent_id``, ``status='running'`` (the atomic claim inside
+        the handler flips the row from PENDING to RUNNING before the
+        response is sent), and ``created_at`` for the caller to
+        start polling from.
 
     Raises:
         V1ApiError 401: missing/invalid/revoked key (raised inside
@@ -160,10 +162,18 @@ async def create_chat_task(
     except TaskTurnError:
         raise V1ApiError(V1ErrorCode.TASK_BUSY, 409)
 
+    # ``status=task.status.value`` (i.e. 'running'), not 'pending':
+    # ``begin_turn`` ran an atomic UPDATE that flipped the row to
+    # RUNNING and ``db.refresh(task)``'d the in-memory object before
+    # returning. Returning 'pending' would lie to the SDK client --
+    # an immediately-following GET would see 'running' and the caller
+    # would have to reconcile two contradictory values from
+    # back-to-back calls. This matches the AppendMessageResponse
+    # contract below.
     return CreateTaskResponse(
         task_id=int(task.id),
         agent_id=int(agent.id),
-        status="pending",
+        status=task.status.value,
         created_at=task.created_at,
     )
 
@@ -311,16 +321,18 @@ async def append_message_to_task(
     # see a value that matches what they'd read from the DB directly
     # via GET /v1/chat/tasks/{id}, with no clock-skew between the two.
     #
-    # ``status='running'`` (not 'pending') because the atomic UPDATE
-    # above already flipped the row to RUNNING in the same transaction.
-    # Returning 'pending' here would lie to the SDK client: an
-    # immediately-following GET would see 'running' and the client
-    # would have to reconcile two contradictory values from
-    # back-to-back calls.
+    # ``status=task.status.value`` (i.e. 'running'), read from the
+    # refreshed in-memory row rather than hardcoded, mirrors the
+    # CreateTaskResponse contract above: the atomic UPDATE inside
+    # ``begin_turn`` flipped the row to RUNNING in the same
+    # transaction. Returning 'pending' here would lie to the SDK
+    # client -- an immediately-following GET would see 'running' and
+    # the caller would have to reconcile two contradictory values
+    # from back-to-back calls.
     return AppendMessageResponse(
         task_id=int(task.id),
         agent_id=int(agent.id),
-        status="running",
+        status=task.status.value,
         accepted_at=task.updated_at,
     )
 
diff --git a/src/xagent/web/api/websocket.py b/src/xagent/web/api/websocket.py
@@ -1442,9 +1442,18 @@ async def execute_task_background(
                     else:
                         task_updated.status = TaskStatus.FAILED
                     sync_workforce_run_status(db_new, task_updated, task_updated.status)
-                    db_new.commit()
+                    # Do NOT commit the terminal status here. Leave it
+                    # pending so the assistant-message persistence below
+                    # commits it atomically: the task is marked terminal
+                    # only once the turn is durably complete. If that write
+                    # fails, the status stays RUNNING and the outer except
+                    # surfaces a real failure -- instead of leaving a
+                    # COMPLETED row with no assistant message. Control
+                    # statuses (PAUSED / WAITING_FOR_USER) above commit
+                    # themselves; they have no assistant message to persist.
                     logger.info(
-                        f"Updated task {task_id} status to {task_updated.status.value}"
+                        f"Task {task_id} marked {task_updated.status.value} "
+                        "(pending commit with assistant message)"
                     )
                 else:
                     logger.info(
@@ -1482,6 +1491,17 @@ async def execute_task_background(
                         if isinstance(chat_response, dict)
                         else None,
                     )
+                    # Commit the pending terminal status. ``persist_assistant_message``
+                    # commits internally when it writes a row, but it
+                    # early-returns WITHOUT committing when the assistant
+                    # content is empty (a valid empty-reply turn). This
+                    # explicit commit lands the terminal status in that
+                    # case too, so an empty successful turn stays COMPLETED
+                    # rather than being left RUNNING (and later flipped to
+                    # FAILED by finish_turn). If persistence raised, control
+                    # never reaches here -- the status stays uncommitted and
+                    # the outer except surfaces a real failure.
+                    db_new.commit()
 
             # Materialize broadcast metadata into primitives BEFORE the
             # ``finally`` block closes ``db_new``. ``task_updated`` is
@@ -1596,25 +1616,57 @@ async def execute_task_background(
         logger.info(f"Background task {task_id} execution completed")
 
     except Exception as e:
-        logger.error(f"Background task {task_id} execution failed: {e}", exc_info=True)
-        # Send error event
+        # The outer try also spans the post-terminal steps -- assistant
+        # message persistence and the completion / paused broadcasts --
+        # that run *after* the task status was already committed terminal
+        # (COMPLETED above). ``_terminal_task_error_payload`` writes FAILED
+        # + the real error_message unconditionally, so gate it on the
+        # task's current status: only a task still RUNNING is a genuine
+        # execution failure. Otherwise a failed post-completion broadcast
+        # would rewrite an already-COMPLETED task as FAILED and store the
+        # broadcast error as the task's failure cause.
+        status_db = get_session_local()()
         try:
-            message = str(e)
-            await manager.broadcast_to_task(
-                {
-                    **_terminal_task_error_payload(
-                        task_id,
-                        message,
-                        event_type="task_error",
-                    ),
-                    "task_id": task_id,
-                    "error": message,
-                    "timestamp": datetime.now(timezone.utc).timestamp(),
-                },
-                task_id,
+            current = status_db.query(Task).filter(Task.id == task_id).first()
+            still_running = current is not None and (
+                current.status == TaskStatus.RUNNING
+            )
+        finally:
+            status_db.close()
+
+        if not still_running:
+            # Terminal state already committed; the exception came from a
+            # best-effort post-completion step. Observe it without touching
+            # the row or emitting a contradictory task_error. ``finish_turn``
+            # still reconciles the terminal fields afterward.
+            logger.warning(
+                f"Background task {task_id} post-terminal step failed; "
+                f"task state left unchanged: {e}",
+                exc_info=True,
+            )
+        else:
+            logger.error(
+                f"Background task {task_id} execution failed: {e}", exc_info=True
             )
-        except Exception as broadcast_error:
-            logger.error(f"Failed to send error notification: {broadcast_error}")
+            # Genuine failure: _terminal_task_error_payload persists FAILED
+            # + the real error_message and builds the notification payload.
+            try:
+                message = str(e)
+                await manager.broadcast_to_task(
+                    {
+                        **_terminal_task_error_payload(
+                            task_id,
+                            message,
+                            event_type="task_error",
+                        ),
+                        "task_id": task_id,
+                        "error": message,
+                        "timestamp": datetime.now(timezone.utc).timestamp(),
+                    },
+                    task_id,
+                )
+            except Exception as broadcast_error:
+                logger.error(f"Failed to send error notification: {broadcast_error}")
     except asyncio.CancelledError:
         logger.info(f"Background task {task_id} cancelled")
         raise
diff --git a/src/xagent/web/schemas/v1.py b/src/xagent/web/schemas/v1.py
@@ -177,18 +177,21 @@ class V1TemplateDetail(V1TemplateSummary):
 class CreateTaskResponse(BaseModel):
     """``POST /v1/chat/tasks`` -> 202 Accepted response.
 
-    The task has been persisted and queued for background execution;
-    callers poll ``GET /v1/chat/tasks/{task_id}`` to observe the
-    transition pending -> running -> completed/failed.
+    The task has been persisted, claimed as RUNNING in the same
+    transaction, and queued for background execution; callers poll
+    ``GET /v1/chat/tasks/{task_id}`` to observe the transition
+    running -> completed/failed.
     """
 
     task_id: int = Field(..., description="Newly created task primary key.")
     agent_id: int = Field(..., description="Agent the task is bound to.")
     status: str = Field(
         ...,
         description=(
-            "Initial status, always 'pending' in the 202 response. "
-            "Use GET /v1/chat/tasks/{task_id} to observe later transitions."
+            "Initial status, 'running' in the 202 response (the atomic "
+            "claim inside POST commits the status flip before the "
+            "response is sent). Use GET /v1/chat/tasks/{task_id} to "
+            "observe later transitions."
         ),
     )
     created_at: datetime = Field(..., description="UTC creation timestamp.")
@@ -230,7 +233,11 @@ class AppendMessageResponse(BaseModel):
     agent_id: int = Field(..., description="Agent the task is bound to.")
     status: str = Field(
         ...,
-        description="Initial status of the new turn, always 'pending'.",
+        description=(
+            "Initial status of the new turn, 'running' in the 202 "
+            "response (the atomic claim inside POST commits the status "
+            "flip before the response is sent)."
+        ),
     )
     accepted_at: datetime = Field(
         ...,
diff --git a/tests/web/api/v1/test_tasks.py b/tests/web/api/v1/test_tasks.py
@@ -113,7 +113,9 @@ def test_create_task_happy_path(mock_start_task):
     assert resp.status_code == 202, resp.text
     body = resp.json()
     assert body["agent_id"] == agent_id
-    assert body["status"] == "pending"
+    # POST atomically claims RUNNING before returning 202, so the
+    # response body reports the post-claim state, not 'pending'.
+    assert body["status"] == "running"
     assert "task_id" in body
     assert "created_at" in body
     task_id = body["task_id"]
diff --git a/tests/web/test_websocket_uploaded_files_context.py b/tests/web/test_websocket_uploaded_files_context.py