
ShadowPod Deletion Changes Pods in Succeeded or Failed Phase to a Pending Phase #3096

@josephsirak

Description


Is there an existing issue for this?

  • I have searched the existing issues

Version

0.10.3

What happened?

When a ShadowPod in the provider cluster is deleted, either directly or because the offloaded namespace was recreated, Liqo moves local pods that are already in a terminal phase (Succeeded or Failed) back to Pending. We have seen pods that ran more than 40 days ago return to Pending and run again because of this bug. This is a major issue for batch workloads and one-shot pods: they either run a second time or clog the queue until they do. The main cause appears to be that the workload namespace mapper does not check whether the local pod is in a terminal phase when it detects that a ShadowPod was deleted. Interestingly, the fallback handler does correctly ignore pods in the Succeeded phase, but not pods in the Failed phase.
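To make the missing check concrete, here is a minimal sketch of the guard I would expect when a ShadowPod deletion is detected. It is not Liqo's actual code: `PodPhase` is a local stand-in that mirrors the Kubernetes phase strings, and `isTerminal` and `shouldResetOnShadowPodDeletion` are hypothetical names used only for illustration.

```go
package main

import "fmt"

// PodPhase is a local stand-in mirroring the Kubernetes pod phase strings.
// Assumption: real code would use the phase type from the Kubernetes API.
type PodPhase string

const (
	PodPending   PodPhase = "Pending"
	PodRunning   PodPhase = "Running"
	PodSucceeded PodPhase = "Succeeded"
	PodFailed    PodPhase = "Failed"
	PodUnknown   PodPhase = "Unknown"
)

// isTerminal reports whether a pod in this phase has finished for good.
// Both Succeeded and Failed are terminal; the issue is that Failed
// (and, in one code path, Succeeded too) is not treated that way.
func isTerminal(phase PodPhase) bool {
	return phase == PodSucceeded || phase == PodFailed
}

// shouldResetOnShadowPodDeletion is a hypothetical helper: it decides
// whether the local pod may be moved back to Pending after its ShadowPod
// disappeared. Pods in a terminal phase must be left untouched.
func shouldResetOnShadowPodDeletion(localPhase PodPhase) bool {
	return !isTerminal(localPhase)
}

func main() {
	phases := []PodPhase{PodPending, PodRunning, PodSucceeded, PodFailed, PodUnknown}
	for _, phase := range phases {
		fmt.Printf("%-9s reset=%t\n", phase, shouldResetOnShadowPodDeletion(phase))
	}
}
```

With a guard like this applied in both the ShadowPod deletion path and the fallback handler, Succeeded and Failed pods would keep their status, while pods that are still Pending or Running would be handled as they are today.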

Relevant log output

How can we reproduce the issue?

  1. Using a Kubernetes batch Job definition, launch a pod that gets reflected from the consumer cluster to the provider cluster, and wait for it to reach a completed state.
  2. Find the corresponding ShadowPod and delete it.
    -> The source pod should immediately go into a Pending or Running state, and the ShadowPod gets recreated.

You can also trigger the same behavior for failed pods: create a Job pod that exits with a non-zero exit code, wait for it to reach the Failed phase, and then delete the corresponding ShadowPod in the provider cluster.

Provider or distribution

EKS, K3s

CNI version

No response

Kernel Version

No response

Kubernetes Version

1.31

Code of Conduct

  • I agree to follow this project's Code of Conduct


Labels

bug
