Problem debugging with EphemeralContainers #885
-
Hi, thanks for your work, k3d is great! I have a problem enabling or using EphemeralContainers.
I think the feature gate is enabled (ps -auxfww):
Then I followed the steps in https://kubernetes.io/docs/tasks/debug-application-cluster/debug-running-pod/#ephemeral-container and I get this error at debug time:
What am I missing? Any help is appreciated. k3d/k3s versions: k3d version v4.4.6
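A minimal sketch of how the "is the gate enabled?" check could be scripted, rather than read off the ps output by eye. The helper name has_feature_gate and the sample command line are hypothetical and not taken from this thread; the check only tells you what was passed to the one process you inspect, not whether every component received the flag.

```shell
#!/bin/sh
# has_feature_gate CMDLINE GATE
# Succeeds (exit 0) if CMDLINE contains a feature-gates=... argument
# that sets GATE=true, alone or inside a comma-separated list.
has_feature_gate() {
  printf '%s\n' "$1" | grep -Eq -- "feature-gates=([^ ]*,)?$2=true(,| |\$)"
}

# Hypothetical sample standing in for one line of `ps -auxfww` output.
sample='k3s server --kube-apiserver-arg=feature-gates=EphemeralContainers=true'

if has_feature_gate "$sample" EphemeralContainers; then
  echo "EphemeralContainers is enabled on this command line"
else
  echo "EphemeralContainers is NOT enabled on this command line"
fi
```

To use it against a live node, the same function could be fed each line of the ps output from inside the server or agent container.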
Replies: 9 comments
-
I was trying the same thing tonight and managed to get it working. The command for creating the cluster is the following:
-
According to the K8s docs, you need a v1.22 cluster.
-
Hi @sgandon, thanks a lot. I updated my environment, with no luck: k3d v5.1.0
Now the schema error is gone, but "kubectl debug" hangs indefinitely waiting for a reply. The issue is described here: https://stackoverflow.com/questions/65306212/kubectl-debug-hangs-on-1-20-with-feature-gate-enabled
As far as I understand, the container is created as a resource in the control plane (the text below comes from a describe of the target pod):
ephemeralContainers:
However, it is not instantiated by the k3s agent. Maybe the EphemeralContainers=true flag must be propagated to the k3s agent too? Any help is appreciated.
-
Can you share the commands you are using? The command I pasted above works for me, and I then used the k8s example:
-
Hi @sgandon, thanks again. You are right, the command you pasted works as expected with the latest k3d/k3s. It must be something related to the way I use k3d with a configuration file: k3d cluster create --config cluster.yaml
cluster.yaml:
I thought the extra-args statement was wrong, but the resulting command that gets executed is correct (?)
Any idea how to debug this? Thanks a lot.
-
Hi @sgandon, all, I think I figured out at least the apparent cause. If the ephemeral-demo pod is scheduled on anything other than the server node, the ephemeral container does not start. Could you please try your command with -a 4? Make sure ephemeral-demo is scheduled on agent-N (via nodeSelector, for example). It might also land on agent-N if you run the cluster creation and the pod creation in quick succession, like "k3d cluster create ... && kubectl run ephemeral-demo ...". Under this condition, the subsequent debug command fails in my environment. Hope this helps.
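One way to force the pod onto an agent, sketched under assumptions: the helper name node_selector_override and the node name k3d-mycluster-agent-0 are hypothetical (k3d names agent nodes k3d-<cluster>-agent-<N>, so substitute your own). The script only prints the kubectl commands as a dry run and does not execute them.

```shell
#!/bin/sh
# node_selector_override NODE
# Prints the JSON for kubectl's --overrides flag that pins a pod to NODE
# via the kubernetes.io/hostname node label.
node_selector_override() {
  printf '{ "spec": { "nodeSelector": { "kubernetes.io/hostname": "%s" } } }\n' "$1"
}

# Assumption: an agent node with this name exists in your cluster.
node="k3d-mycluster-agent-0"
overrides=$(node_selector_override "$node")

# Dry run: print the commands instead of running them.
echo "kubectl run ephemeral-demo --image=k8s.gcr.io/pause:3.1 --restart=Never --overrides='$overrides'"
echo "kubectl debug -it ephemeral-demo --image=busybox --target=ephemeral-demo"
```

Pinning the pod this way should make the failure independent of where the scheduler happens to place it.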
-
I found a way to deterministically reproduce the problem:
-
Hi @ffatghub, thanks for starting this discussion and sorry for my late reply!
@sgandon, thanks for chiming in here 🙏
So far, you've enabled the feature gate on the API server; you may have to enable it on the kube-scheduler and the kubelets as well:
k3d cluster create k3d-122 -i rancher/k3s:v1.22.3-k3s1 -a 4 --k3s-arg '--kube-apiserver-arg=feature-gates=EphemeralContainers=true@server:*' --k3s-arg='--kube-scheduler-arg=feature-gates=EphemeralContainers=true@server:*' --k3s-arg='--kubelet-arg=feature-gates=EphemeralContainers=true@agent:*'
At least this worked for me with the following commands:
$ kubectl run ephemeral-demo --image=k8s.gcr.io/pause:3.1 --restart=Never --overrides='{ "spec": { "nodeSelector": { "kubernetes.io/hostname": "k3d-k3d-122-agent-1" } } }'
pod/ephemeral-demo created
$ kubectl debug -it ephemeral-demo --image=busybox --target=ephemeral-demo
Targeting container "ephemeral-demo". If you don't see processes from this container it may be because the container runtime doesn't support this feature.
Defaulting debug container name to debugger-vn2zg.
If you don't see a command prompt, try pressing enter.
/ # ps aux
PID   USER     TIME  COMMAND
    1 root      0:00 /pause
    8 root      0:00 sh
   14 root      0:00 ps aux
/ # hostname
ephemeral-demo
/ #
# in a different shell (kdp is a shell alias for kubectl describe pod)
$ kdp ephemeral-demo
Name:         ephemeral-demo
Namespace:    default
Priority:     0
Node:         k3d-k3d-122-agent-1/172.21.0.6
Start Time:   Fri, 10 Dec 2021 14:00:24 +0100
# ... truncated ...
Containers:
  ephemeral-demo:
    Container ID:   containerd://bf9ea56190c16edf397cc01cffd566fcf3caa2c17e1bb44c8898a40be4885d73
    Image:          k8s.gcr.io/pause:3.1
    Image ID:       docker.io/rancher/mirrored-pause@sha256:82f4f58a1877a3a819de055a111b33fc63127b896b928b0bd7419da549f88d65
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Fri, 10 Dec 2021 14:00:25 +0100
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-fzcg4 (ro)
Ephemeral Containers:
  debugger-vn2zg:
    Container ID:   containerd://a68b4bde3ef8319e53538f1a3c177a9e02647238370e26ced3296a05aa5dbeb8
    Image:          busybox
    Image ID:       docker.io/library/busybox@sha256:b5cfd4befc119a590ca1a81d6bb0fa1fb19f1fbebd0397f25fae164abe1e8a6a
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Fri, 10 Dec 2021 14:00:38 +0100
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:         <none>
# ... truncated ...
Node-Selectors:              kubernetes.io/hostname=k3d-k3d-122-agent-1
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age    From               Message
  ----    ------     ----   ----               -------
  Normal  Scheduled  2m40s  default-scheduler  Successfully assigned default/ephemeral-demo to k3d-k3d-122-agent-1
  Normal  Pulling    2m41s  kubelet            Pulling image "k8s.gcr.io/pause:3.1"
  Normal  Pulled     2m40s  kubelet            Successfully pulled image "k8s.gcr.io/pause:3.1" in 702.025356ms
  Normal  Created    2m40s  kubelet            Created container ephemeral-demo
  Normal  Started    2m40s  kubelet            Started container ephemeral-demo
  Normal  Pulling    2m30s  kubelet            Pulling image "busybox"
  Normal  Pulled     2m27s  kubelet            Successfully pulled image "busybox" in 3.054293788s
  Normal  Created    2m27s  kubelet            Created container debugger-vn2zg
  Normal  Started    2m27s  kubelet            Started container debugger-vn2zg
Update: I also updated the FAQ entry containing this use case as an example to reflect all three args: https://k3d.io/v5.2.1/faq/faq/#passing-additional-argumentsflags-to-k3s-and-on-to-eg-the-kube-apiserver
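With three nearly identical --k3s-arg flags, the shell quoting is easy to get wrong by hand. A rough sketch of generating them instead, assuming the same component/node-filter pairs as in this comment; the helper name k3s_feature_gate_args is hypothetical, and the final line is a dry run that only prints the k3d command.

```shell
#!/bin/sh
# k3s_feature_gate_args GATE
# Prints one --k3s-arg flag per line, enabling GATE on the API server and
# scheduler (server nodes) and on the kubelets (agent nodes).
k3s_feature_gate_args() {
  gate="$1"
  # Each entry is "<k3s component flag>@<k3d node filter>".
  for pair in "kube-apiserver-arg@server:*" "kube-scheduler-arg@server:*" "kubelet-arg@agent:*"; do
    component=${pair%%@*}   # text before the @
    filter=${pair#*@}       # text after the @
    printf '%s\n' "--k3s-arg=--${component}=feature-gates=${gate}@${filter}"
  done
}

# Collect the flags as positional parameters so that the '*' in the node
# filters is never glob-expanded by the shell.
set --
while IFS= read -r flag; do
  set -- "$@" "$flag"
done <<EOF
$(k3s_feature_gate_args "EphemeralContainers=true")
EOF

# Dry run: drop the leading echo to actually create the cluster.
echo k3d cluster create k3d-122 -i rancher/k3s:v1.22.3-k3s1 -a 4 "$@"
```

Keeping the pairs in one list means a further component only needs one more entry, not another hand-quoted flag.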
-
Hi @iwilltry42, nice, thank you! Enabling the feature gate in all CO components (API server, kube-scheduler and kubelets) gets the ephemeral container up and running :-) Below is the config-file version of your CLI-args example. Ciao
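The config file from this comment did not survive in the thread as captured here, so what follows is not the poster's file. It is only a sketch of what an equivalent config might look like, assuming the k3d.io/v1alpha3 "Simple" schema used by k3d v5.0 to v5.2 (field names may differ in other k3d versions); the output path is hypothetical.

```shell
#!/bin/sh
# Sketch only: writes a k3d config that mirrors the three --k3s-arg flags
# from the CLI example. Schema version and field names are assumptions.
config_file="${TMPDIR:-/tmp}/k3d-ephemeral.yaml"   # hypothetical path

cat > "$config_file" <<'EOF'
apiVersion: k3d.io/v1alpha3
kind: Simple
name: k3d-122
servers: 1
agents: 4
image: rancher/k3s:v1.22.3-k3s1
options:
  k3s:
    extraArgs:
      - arg: --kube-apiserver-arg=feature-gates=EphemeralContainers=true
        nodeFilters:
          - server:*
      - arg: --kube-scheduler-arg=feature-gates=EphemeralContainers=true
        nodeFilters:
          - server:*
      - arg: --kubelet-arg=feature-gates=EphemeralContainers=true
        nodeFilters:
          - agent:*
EOF

echo "wrote $config_file"
echo "create the cluster with: k3d cluster create --config $config_file"
```

Each extraArgs entry pairs one k3s argument with the node filter that was appended after the @ in the CLI form.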