Timed-out tasks do not show correct error messages #1469

Open
@shrinandj

Description

Steps that led to this issue:

  • Create a task with a timeout decorator. In this case, the timeout was 60 seconds.
  • Run the task on Kubernetes. This sets the Kubernetes job's activeDeadlineSeconds to 60 seconds.
  • If the task does not complete within the timeout, the error shown to the user is:
[KILLED BY ORCHESTRATOR]
    Kubernetes error:
    Completed. This could be a transient error. Use @retry to retry.

Only after inspecting the pod in Kubernetes did it become clear that the pod had timed out. The pod status showed:

...
  containerStatuses:
  - containerID: containerd://99e036b3ba08df84f126ea41d05bb0842ea06540de49359d8c7d83e8aecc790b
...
    state:
      terminated:
        containerID: containerd://99e036b3ba08df84f126ea41d05bb0842ea06540de49359d8c7d83e8aecc790b
        exitCode: 137
        reason: Error
...
  message: Pod was active on the node longer than the specified deadline
  phase: Failed
...
  qosClass: Burstable
  reason: DeadlineExceeded

It would be good to surface the actual error on the user's console, so that the user can see that the task timed out instead of a generic "transient error" message.
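As a rough illustration of the requested behavior, here is a minimal sketch of how a pod status like the one above could be mapped to a clearer message. The function name `describe_pod_failure`, its `timeout_seconds` parameter, and the message wording are all hypothetical; this is not how Metaflow currently handles the status. It only assumes the status is available as a plain dict with the fields shown in the pod dump (`phase`, `reason`, `message`, `containerStatuses`).

```python
from typing import Any, Dict, List, Optional


def describe_pod_failure(
    status: Dict[str, Any], timeout_seconds: Optional[int] = None
) -> str:
    """Hypothetical helper: turn a pod status dict into a user-facing message."""
    phase = status.get("phase")
    reason = status.get("reason")
    message = status.get("message", "")

    # Collect exit codes of any terminated containers reported in the status.
    exit_codes: List[int] = []
    for container in status.get("containerStatuses") or []:
        terminated = (container.get("state") or {}).get("terminated")
        if terminated and "exitCode" in terminated:
            exit_codes.append(terminated["exitCode"])

    # A pod that ran past activeDeadlineSeconds is reported as
    # phase=Failed with reason=DeadlineExceeded, as in the dump above.
    if phase == "Failed" and reason == "DeadlineExceeded":
        limit = f" of {timeout_seconds} seconds" if timeout_seconds is not None else ""
        return (
            f"Task timed out: the pod exceeded its deadline{limit}. "
            f"Kubernetes reported: {message}"
        )

    # Fallback: report whatever Kubernetes said rather than a generic message.
    return f"Kubernetes error: phase={phase}, reason={reason}, exit codes={exit_codes}"


# Status fields copied from the pod shown in this issue (elided parts omitted).
example_status = {
    "containerStatuses": [
        {"state": {"terminated": {"exitCode": 137, "reason": "Error"}}}
    ],
    "message": "Pod was active on the node longer than the specified deadline",
    "phase": "Failed",
    "reason": "DeadlineExceeded",
}

print(describe_pod_failure(example_status, timeout_seconds=60))
```

The check keys off the pod-level `reason` rather than the container's exit code, since exit code 137 only indicates that the container was killed and does not by itself say why.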
