Description
Describe the bug
When I plug in an iPhone, I see the Instance resource created for it. The first thing my controller, which watches Instance resources, does is send a magic byte to enable the CDC NCM network interface. This is a USB config change that results in an unplug/plug of the device, so the dev path changes.
After that process, I see that the same Instance resource now has a different dev path, e.g. /dev/bus/usb/001/007 becomes /dev/bus/usb/001/009. However, the Pod that has the Instance name under its requests still sees the old path as the only USB device, so reading that old file results in "no such file" errors.
When I restart all the Akri pods, the Pod sees the new path and works. My guess is that kubelet is somehow not updated with the new path of the Instance, and since a restart results in re-registration of the plugin, it goes through all the devices and ends up registering the new path.
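To make the suspected failure mode concrete, here is a toy model in Python. It is not Akri's code: the class names `CachingPlugin` and `RefreshingPlugin` and their methods are hypothetical, and they only mimic the shape of a device plugin whose allocate answer is either frozen at registration time or refreshed when discovery reports a new dev node.

```python
from dataclasses import dataclass


@dataclass
class CachingPlugin:
    """Hypothetical plugin that snapshots the dev node at registration."""

    registered_path: str = ""

    def register(self, devnode: str) -> None:
        self.registered_path = devnode

    def on_discovery_update(self, devnode: str) -> None:
        # Suspected behavior: the update is seen, but the snapshot is kept.
        pass

    def allocate(self) -> str:
        # What kubelet would be told to mount into the container.
        return self.registered_path


@dataclass
class RefreshingPlugin(CachingPlugin):
    """Hypothetical plugin that follows discovery updates."""

    def on_discovery_update(self, devnode: str) -> None:
        self.registered_path = devnode


def simulate(plugin: CachingPlugin) -> str:
    """Register with the old path, report the new one, then allocate."""
    plugin.register("/dev/bus/usb/001/019")
    plugin.on_discovery_update("/dev/bus/usb/001/020")
    return plugin.allocate()
```

In this model, `simulate(CachingPlugin())` keeps handing out the stale path, while creating a fresh plugin and registering it with the current path (the equivalent of restarting the agent) returns the new one. Whether Akri actually behaves like the caching variant is my assumption.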
Output of kubectl get pods,akrii,akric -o wide
> kubectl get pods,akrii,akric -o wide -n akri-system
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod/akri-agent-daemonset-jc596 1/1 Running 1 (5d1h ago) 9d 10.244.0.127 talos-aqs-r1t <none> <none>
pod/akri-controller-deployment-745f4bfc4c-b2fnk 1/1 Running 1 (5d1h ago) 9d 10.244.0.124 talos-aqs-r1t <none> <none>
pod/akri-udev-discovery-daemonset-g4q5f 1/1 Running 1 (5d1h ago) 9d 10.244.0.129 talos-aqs-r1t <none> <none>
pod/akri-webhook-configuration-78666f968d-pfvpk 1/1 Running 1 (5d1h ago) 9d 10.244.0.128 talos-aqs-r1t <none> <none>
Kubernetes Version: 1.32.0 via Talos 1.9
To Reproduce
I will try to get a script going to be able to cleanly reproduce this but it'll have to require iOS hardware.
Expected behavior
When the dev path changes in the Instance as a result of a USB config change, the change should also be reflected in the pod(s) that use the Instance, and the processes trying to access the USB path should see the new path and be able to read/write.
Logs (please share snips of applicable logs)
Here is the initial Instance resource:
apiVersion: akri.sh/v0
kind: Instance
metadata:
  creationTimestamp: "2025-01-08T10:43:37Z"
  generation: 1
  name: device-pod-ios-agent-1afb6f
  namespace: mobile-device-system
  ownerReferences:
  - apiVersion: akri.sh/v0
    controller: true
    kind: Configuration
    name: device-pod-ios-agent
    uid: d68db8d0-064f-4514-b805-8888241be5d5
  resourceVersion: "2124065"
  uid: 68808b9f-1c85-451e-a8a2-03eff26b1d06
spec:
  brokerProperties:
    UDEV_DEVNODE_0: /dev/bus/usb/001/019
    UDEV_DEVPATH: /devices/pci0000:00/0000:00:14.0/usb1/1-9
  capacity: 1
  cdiName: akri.sh/device-pod-ios-agent=1afb6f
  configurationName: device-pod-ios-agent
  deviceUsage: {}
  nodes:
  - talos-aqs-r1t
  shared: false
After sending the USB config bytes, here is how it's changed by Akri:
apiVersion: akri.sh/v0
kind: Instance
metadata:
  creationTimestamp: "2025-01-08T10:43:37Z"
  finalizers:
  - talos-aqs-r1t
  generation: 3
  labels:
    attributes.platform.qawolf.com/usb-network-enabled: "true"
  name: device-pod-ios-agent-1afb6f
  namespace: mobile-device-system
  ownerReferences:
  - apiVersion: akri.sh/v0
    controller: true
    kind: Configuration
    name: device-pod-ios-agent
    uid: d68db8d0-064f-4514-b805-8888241be5d5
  resourceVersion: "2124125"
  uid: 68808b9f-1c85-451e-a8a2-03eff26b1d06
spec:
  brokerProperties:
    UDEV_DEVNODE_0: /dev/bus/usb/001/020
    UDEV_DEVPATH: /devices/pci0000:00/0000:00:14.0/usb1/1-9
  capacity: 1
  cdiName: akri.sh/device-pod-ios-agent=1afb6f
  configurationName: device-pod-ios-agent
  deviceUsage:
    device-pod-ios-agent-1afb6f-0: talos-aqs-r1t
  nodes:
  - talos-aqs-r1t
  shared: false
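Between the two manifests, UDEV_DEVPATH stays the same while UDEV_DEVNODE_0 changes. On Linux, the sysfs directory for a USB device exposes `busnum` and `devnum` attributes, and the dev node is /dev/bus/usb/BBB/DDD with both numbers zero-padded to three digits. The helper below is a diagnostic sketch built on that: `resolve_devnode` and its parameters are names I made up, and it assumes the sysfs tree is readable from wherever it runs.

```python
from pathlib import Path


def resolve_devnode(
    udev_devpath: str,
    sysfs_root: str = "/sys",
    dev_root: str = "/dev/bus/usb",
) -> str:
    """Map a stable udev DEVPATH to the device's current dev node path.

    udev_devpath is the value of UDEV_DEVPATH, for example
    /devices/pci0000:00/0000:00:14.0/usb1/1-9.
    """
    device_dir = Path(sysfs_root) / udev_devpath.lstrip("/")
    # busnum and devnum are plain decimal numbers followed by a newline.
    busnum = int((device_dir / "busnum").read_text().strip())
    devnum = int((device_dir / "devnum").read_text().strip())
    return f"{dev_root}/{busnum:03d}/{devnum:03d}"
```

Running this on the node before and after the config change should show the dev node moving while the DEVPATH stays put. It is only a way to confirm the current path; a container that was started with the old device spec would likely still be unable to open the new node.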
My controller adds the attributes.platform.qawolf.com/usb-network-enabled: "true" label only after it confirms that the device is in the correct config, i.e. after it changes to the new dev path. A Pod is then created only if that label is present.
Here is what the Pod events look like:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 4m default-scheduler Successfully assigned mobile-device-system/device-pod-ios-agent-1afb6f-1dsbg to talos-aqs-r1t
Warning FailedScheduling 4m10s default-scheduler 0/1 nodes are available: 1 Insufficient akri.sh/device-pod-ios-agent-1afb6f. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.
Warning Failed 4m kubelet Error: failed to generate container "660f4629cd79181d29c9e1c0e94fa9c8d2de6b859393827021dec086ceca1f65" spec: failed to apply OCI options: lstat /dev/bus/usb/001/019: no such file or directory
Warning Failed 3m59s kubelet Error: failed to generate container "61bdc29ac4676c714c866e6a8b4c866e27ddc03876b0c0f232d9d8cf4e28be29" spec: failed to apply OCI options: lstat /dev/bus/usb/001/019: no such file or directory
Warning Failed 3m45s kubelet Error: failed to generate container "ffdc0446e6bb6b701315896f085d369a435cce13c878e78f24d9a77106ff0b60" spec: failed to apply OCI options: lstat /dev/bus/usb/001/019: no such file or directory
Here is an excerpt of the agent logs:
[2025-01-08T10:43:34Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] reclaiming unused slots - start
[2025-01-08T10:43:34Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] register - before call to register with the kubelet at socket /var/lib/kubelet/pod-resources/kubelet.sock
[2025-01-08T10:43:37Z TRACE agent::discovery_handler_manager::registration_socket] Received new message from discovery handler: DiscoverResponse { devices: [Device { id: "/devices/pci0000:00/0000:00:14.0/usb1/1-9", properties: {"UDEV_DEVPATH": "/devices/pci0000:00/0000:00:14.0/usb1/1-9", "UDEV_DEVNODE_0": "/dev/bus/usb/001/019"}, mounts: [], device_specs: [DeviceSpec { container_path: "/dev/bus/usb/001/019", host_path: "/dev/bus/usb/001/019", permissions: "rwm" }] }] }
[2025-01-08T10:43:37Z TRACE agent::discovery_handler_manager::discovery_handler_registry] Ask for reconciliation of mobile-device-system::device-pod-ios-agent
[2025-01-08T10:43:37Z TRACE agent::util::discovery_configuration_controller] Reconciling Some("mobile-device-system")::device-pod-ios-agent
[2025-01-08T10:43:37Z TRACE agent::plugin_manager::device_plugin_instance_controller] Plugin Manager: Reconciling device-pod-ios-agent-1afb6f
[2025-01-08T10:43:37Z INFO agent::plugin_manager::device_plugin_runner] serve - creating a device plugin server that will listen at: /var/lib/kubelet/device-plugins/device-pod-ios-agent-1afb6f-1736333017.sock
[2025-01-08T10:43:38Z INFO agent::plugin_manager::device_plugin_runner] register - entered for Instance akri.sh/device-pod-ios-agent-1afb6f and socket_name: device-pod-ios-agent-1afb6f-1736333017.sock
[2025-01-08T10:43:38Z TRACE agent::plugin_manager::device_plugin_runner] register - before call to register with the kubelet at socket /var/lib/kubelet/device-plugins/kubelet.sock
[2025-01-08T10:43:38Z INFO agent::plugin_manager::device_plugin_runner] serve - creating a device plugin server that will listen at: /var/lib/kubelet/device-plugins/device-pod-ios-agent-1736333018.sock
[2025-01-08T10:43:38Z INFO agent::plugin_manager::device_plugin_instance_controller] list_and_watch - kubelet called list_and_watch for instance device-pod-ios-agent-1afb6f
[2025-01-08T10:43:38Z TRACE agent::plugin_manager::device_plugin_instance_controller] Sending devices to kubelet: [Device { id: "device-pod-ios-agent-1afb6f-0", health: "Healthy", topology: None }]
[2025-01-08T10:43:39Z INFO agent::plugin_manager::device_plugin_runner] register - entered for Instance akri.sh/device-pod-ios-agent and socket_name: device-pod-ios-agent-1736333018.sock
[2025-01-08T10:43:39Z TRACE agent::plugin_manager::device_plugin_runner] register - before call to register with the kubelet at socket /var/lib/kubelet/device-plugins/kubelet.sock
[2025-01-08T10:43:39Z INFO agent::plugin_manager::device_plugin_instance_controller] list_and_watch - kubelet called list_and_watch for Configuration device-pod-ios-agent
[2025-01-08T10:43:39Z TRACE agent::plugin_manager::device_plugin_instance_controller] Plugin Manager: Reconciling device-pod-ios-agent-1afb6f
[2025-01-08T10:43:44Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] reclaiming unused slots - start
[2025-01-08T10:43:44Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] register - before call to register with the kubelet at socket /var/lib/kubelet/pod-resources/kubelet.sock
[2025-01-08T10:43:46Z TRACE agent::plugin_manager::device_plugin_runner] kubelet called allocate Request { metadata: MetadataMap { headers: {"content-type": "application/grpc", "user-agent": "grpc-go/1.65.0", "te": "trailers", "grpc-accept-encoding": "gzip"} }, message: AllocateRequest { container_requests: [ContainerAllocateRequest { devices_i_ds: ["device-pod-ios-agent-1afb6f-0"] }] }, extensions: Extensions }
[2025-01-08T10:43:46Z INFO agent::plugin_manager::device_plugin_instance_controller] allocate - kubelet called allocate for Instance device-pod-ios-agent-1afb6f
[2025-01-08T10:43:46Z TRACE agent::plugin_manager::device_plugin_instance_controller] Sending devices to kubelet: [Device { id: "device-pod-ios-agent-1afb6f-0", health: "Healthy", topology: None }]
[2025-01-08T10:43:46Z TRACE agent::plugin_manager::device_plugin_instance_controller] Plugin Manager: Reconciling device-pod-ios-agent-1afb6f
[2025-01-08T10:43:46Z TRACE agent::plugin_manager::device_plugin_instance_controller] Plugin Manager: Reconciling device-pod-ios-agent-1afb6f
[2025-01-08T10:43:54Z TRACE agent::discovery_handler_manager::registration_socket] Received new message from discovery handler: DiscoverResponse { devices: [Device { id: "/devices/pci0000:00/0000:00:14.0/usb1/1-9", properties: {"UDEV_DEVNODE_0": "/dev/bus/usb/001/020", "UDEV_DEVPATH": "/devices/pci0000:00/0000:00:14.0/usb1/1-9"}, mounts: [], device_specs: [DeviceSpec { container_path: "/dev/bus/usb/001/020", host_path: "/dev/bus/usb/001/020", permissions: "rwm" }] }] }
[2025-01-08T10:43:54Z TRACE agent::discovery_handler_manager::discovery_handler_registry] Ask for reconciliation of mobile-device-system::device-pod-ios-agent
[2025-01-08T10:43:54Z TRACE agent::util::discovery_configuration_controller] Reconciling Some("mobile-device-system")::device-pod-ios-agent
[2025-01-08T10:43:54Z TRACE agent::plugin_manager::device_plugin_instance_controller] Plugin Manager: Reconciling device-pod-ios-agent-1afb6f
[2025-01-08T10:43:54Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] reclaiming unused slots - start
[2025-01-08T10:43:54Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] register - before call to register with the kubelet at socket /var/lib/kubelet/pod-resources/kubelet.sock
[2025-01-08T10:44:04Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] reclaiming unused slots - start
[2025-01-08T10:44:04Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] register - before call to register with the kubelet at socket /var/lib/kubelet/pod-resources/kubelet.sock
[2025-01-08T10:44:14Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] reclaiming unused slots - start
[2025-01-08T10:44:14Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] register - before call to register with the kubelet at socket /var/lib/kubelet/pod-resources/kubelet.sock
[2025-01-08T10:44:14Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] freeing slot: akri.sh/device-pod-ios-agent-1afb6f-0
[2025-01-08T10:44:14Z WARN agent::plugin_manager::device_plugin_slot_reclaimer] Failed to free slot akri.sh/device-pod-ios-agent-1afb6f-0, will try again in 10s
[2025-01-08T10:44:24Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] reclaiming unused slots - start
[2025-01-08T10:44:24Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] register - before call to register with the kubelet at socket /var/lib/kubelet/pod-resources/kubelet.sock
[2025-01-08T10:44:24Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] freeing slot: akri.sh/device-pod-ios-agent-1afb6f-0
[2025-01-08T10:44:24Z WARN agent::plugin_manager::device_plugin_slot_reclaimer] Failed to free slot akri.sh/device-pod-ios-agent-1afb6f-0, will try again in 10s
[2025-01-08T10:44:34Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] reclaiming unused slots - start
[2025-01-08T10:44:34Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] register - before call to register with the kubelet at socket /var/lib/kubelet/pod-resources/kubelet.sock
[2025-01-08T10:44:34Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] freeing slot: akri.sh/device-pod-ios-agent-1afb6f-0
[2025-01-08T10:44:34Z WARN agent::plugin_manager::device_plugin_slot_reclaimer] Failed to free slot akri.sh/device-pod-ios-agent-1afb6f-0, will try again in 10s
[2025-01-08T10:44:44Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] reclaiming unused slots - start
[2025-01-08T10:44:44Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] register - before call to register with the kubelet at socket /var/lib/kubelet/pod-resources/kubelet.sock
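The excerpt above is easier to read as a timeline: the discovery handler reports /dev/bus/usb/001/019 at 10:43:37, the device plugin servers are created at 10:43:37 and 10:43:38, kubelet calls allocate at 10:43:46, and discovery reports /dev/bus/usb/001/020 at 10:43:54, after which no new "serve - creating a device plugin server" line appears. The sketch below pulls those three kinds of events out of agent log lines. The regexes are written against the line format shown in this issue only, and the function name `timeline` is my own.

```python
import re

# [timestamp LEVEL module] message
LINE_RE = re.compile(r"^\[(?P<ts>\S+)\s+\w+\s+[^\]\s]+\]\s+(?P<msg>.*)$")
HOST_PATH_RE = re.compile(r'host_path: "([^"]+)"')


def timeline(lines):
    """Return (timestamp, event, detail) tuples for the interesting lines."""
    events = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue
        ts, msg = match["ts"], match["msg"]
        if "DiscoverResponse" in msg:
            paths = HOST_PATH_RE.findall(msg)
            if paths:
                events.append((ts, "discovered", paths[0]))
        elif msg.startswith("serve - creating a device plugin server"):
            # Keep only the socket file name.
            events.append((ts, "plugin_server_created", msg.rsplit("/", 1)[-1]))
        elif msg.startswith("allocate - kubelet called allocate"):
            # The last word is the Instance name.
            events.append((ts, "allocate", msg.rsplit(" ", 1)[-1]))
    return events
```

Feeding it the saved agent log, for example `timeline(open("agent.log"))` with a file name of your choosing, gives a short list where a "discovered" event with a new path that is not followed by "plugin_server_created" stands out.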
Note that deleting the pod and re-creating it does not make a difference. However, if I delete the agent pod and let a new one come up, it all starts to work. Here are the agent logs after the new one comes up:
akri.sh Agent start
akri.sh KUBERNETES_PORT found ... env_logger::init
[2025-01-08T10:50:00Z TRACE agent] akri.sh KUBERNETES_PORT found ... env_logger::init finished
[2025-01-08T10:50:00Z INFO akri_shared::akri::metrics] starting metrics server on port 8080 at /metrics
[2025-01-08T10:50:00Z INFO agent::discovery_handler_manager::registration_socket] internal_run_registration_server - entered
[2025-01-08T10:50:00Z TRACE agent::discovery_handler_manager::registration_socket] internal_run_registration_server - registration server listening on socket /var/lib/akri/agent-registration.sock
[2025-01-08T10:50:00Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] reclaiming unused slots - start
[2025-01-08T10:50:00Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] register - before call to register with the kubelet at socket /var/lib/kubelet/pod-resources/kubelet.sock
[2025-01-08T10:50:00Z TRACE agent::plugin_manager::device_plugin_instance_controller] Plugin Manager: Reconciling device-pod-ios-agent-1afb6f
[2025-01-08T10:50:00Z TRACE agent::util::discovery_configuration_controller] Reconciling Some("mobile-device-system")::device-pod-ios-agent
[2025-01-08T10:50:00Z WARN agent::plugin_manager::device_plugin_instance_controller] Error during reconciliation of Instance Some("mobile-device-system")::device-pod-ios-agent-1afb6f, retrying in 1s: UnknownDevice("akri.sh/device-pod-ios-agent=1afb6f")
[2025-01-08T10:50:00Z WARN agent::util::discovery_configuration_controller] Error during reconciliation for Some("mobile-device-system")::device-pod-ios-agent, retrying in 1s: DiscoveryError(NoHandler("udev"))
[2025-01-08T10:50:01Z TRACE agent::util::discovery_configuration_controller] Reconciling Some("mobile-device-system")::device-pod-ios-agent
[2025-01-08T10:50:01Z TRACE agent::plugin_manager::device_plugin_instance_controller] Plugin Manager: Reconciling device-pod-ios-agent-1afb6f
[2025-01-08T10:50:01Z WARN agent::util::discovery_configuration_controller] Error during reconciliation for Some("mobile-device-system")::device-pod-ios-agent, retrying in 2s: DiscoveryError(NoHandler("udev"))
[2025-01-08T10:50:01Z WARN agent::plugin_manager::device_plugin_instance_controller] Error during reconciliation of Instance Some("mobile-device-system")::device-pod-ios-agent-1afb6f, retrying in 2s: UnknownDevice("akri.sh/device-pod-ios-agent=1afb6f")
[2025-01-08T10:50:03Z TRACE agent::plugin_manager::device_plugin_instance_controller] Plugin Manager: Reconciling device-pod-ios-agent-1afb6f
[2025-01-08T10:50:03Z TRACE agent::util::discovery_configuration_controller] Reconciling Some("mobile-device-system")::device-pod-ios-agent
[2025-01-08T10:50:03Z WARN agent::plugin_manager::device_plugin_instance_controller] Error during reconciliation of Instance Some("mobile-device-system")::device-pod-ios-agent-1afb6f, retrying in 4s: UnknownDevice("akri.sh/device-pod-ios-agent=1afb6f")
[2025-01-08T10:50:03Z WARN agent::util::discovery_configuration_controller] Error during reconciliation for Some("mobile-device-system")::device-pod-ios-agent, retrying in 4s: DiscoveryError(NoHandler("udev"))
[2025-01-08T10:50:07Z TRACE agent::util::discovery_configuration_controller] Reconciling Some("mobile-device-system")::device-pod-ios-agent
[2025-01-08T10:50:07Z TRACE agent::plugin_manager::device_plugin_instance_controller] Plugin Manager: Reconciling device-pod-ios-agent-1afb6f
[2025-01-08T10:50:07Z WARN agent::plugin_manager::device_plugin_instance_controller] Error during reconciliation of Instance Some("mobile-device-system")::device-pod-ios-agent-1afb6f, retrying in 8s: UnknownDevice("akri.sh/device-pod-ios-agent=1afb6f")
[2025-01-08T10:50:07Z TRACE agent::discovery_handler_manager::registration_socket] NetworkEndpoint::query - connecting to external udev discovery handler over network
[2025-01-08T10:50:07Z TRACE agent::plugin_manager::device_plugin_instance_controller] Plugin Manager: Reconciling device-pod-ios-agent-1afb6f
[2025-01-08T10:50:10Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] reclaiming unused slots - start
[2025-01-08T10:50:10Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] register - before call to register with the kubelet at socket /var/lib/kubelet/pod-resources/kubelet.sock
[2025-01-08T10:50:15Z TRACE agent::discovery_handler_manager::registration_socket] Received new message from discovery handler: DiscoverResponse { devices: [Device { id: "/devices/pci0000:00/0000:00:14.0/usb1/1-9", properties: {"UDEV_DEVNODE_0": "/dev/bus/usb/001/020", "UDEV_DEVPATH": "/devices/pci0000:00/0000:00:14.0/usb1/1-9"}, mounts: [], device_specs: [DeviceSpec { container_path: "/dev/bus/usb/001/020", host_path: "/dev/bus/usb/001/020", permissions: "rwm" }] }] }
[2025-01-08T10:50:15Z TRACE agent::discovery_handler_manager::discovery_handler_registry] Ask for reconciliation of mobile-device-system::device-pod-ios-agent
[2025-01-08T10:50:15Z TRACE agent::util::discovery_configuration_controller] Reconciling Some("mobile-device-system")::device-pod-ios-agent
[2025-01-08T10:50:15Z TRACE agent::plugin_manager::device_plugin_instance_controller] Plugin Manager: Reconciling device-pod-ios-agent-1afb6f
[2025-01-08T10:50:15Z INFO agent::plugin_manager::device_plugin_runner] serve - creating a device plugin server that will listen at: /var/lib/kubelet/device-plugins/device-pod-ios-agent-1afb6f-1736333415.sock
[2025-01-08T10:50:16Z INFO agent::plugin_manager::device_plugin_runner] register - entered for Instance akri.sh/device-pod-ios-agent-1afb6f and socket_name: device-pod-ios-agent-1afb6f-1736333415.sock
[2025-01-08T10:50:16Z TRACE agent::plugin_manager::device_plugin_runner] register - before call to register with the kubelet at socket /var/lib/kubelet/device-plugins/kubelet.sock
[2025-01-08T10:50:16Z INFO agent::plugin_manager::device_plugin_runner] serve - creating a device plugin server that will listen at: /var/lib/kubelet/device-plugins/device-pod-ios-agent-1736333416.sock
[2025-01-08T10:50:16Z INFO agent::plugin_manager::device_plugin_instance_controller] list_and_watch - kubelet called list_and_watch for instance device-pod-ios-agent-1afb6f
[2025-01-08T10:50:16Z TRACE agent::plugin_manager::device_plugin_instance_controller] Sending devices to kubelet: [Device { id: "device-pod-ios-agent-1afb6f-0", health: "Healthy", topology: None }]
[2025-01-08T10:50:17Z INFO agent::plugin_manager::device_plugin_runner] register - entered for Instance akri.sh/device-pod-ios-agent and socket_name: device-pod-ios-agent-1736333416.sock
[2025-01-08T10:50:17Z TRACE agent::plugin_manager::device_plugin_runner] register - before call to register with the kubelet at socket /var/lib/kubelet/device-plugins/kubelet.sock
[2025-01-08T10:50:17Z TRACE agent::plugin_manager::device_plugin_instance_controller] Plugin Manager: Reconciling device-pod-ios-agent-1afb6f
[2025-01-08T10:50:17Z INFO agent::plugin_manager::device_plugin_instance_controller] list_and_watch - kubelet called list_and_watch for Configuration device-pod-ios-agent
[2025-01-08T10:50:20Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] reclaiming unused slots - start
[2025-01-08T10:50:20Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] register - before call to register with the kubelet at socket /var/lib/kubelet/pod-resources/kubelet.sock
[2025-01-08T10:50:23Z TRACE agent::plugin_manager::device_plugin_runner] kubelet called allocate Request { metadata: MetadataMap { headers: {"content-type": "application/grpc", "user-agent": "grpc-go/1.65.0", "te": "trailers", "grpc-accept-encoding": "gzip"} }, message: AllocateRequest { container_requests: [ContainerAllocateRequest { devices_i_ds: ["device-pod-ios-agent-1afb6f-0"] }] }, extensions: Extensions }
[2025-01-08T10:50:23Z INFO agent::plugin_manager::device_plugin_instance_controller] allocate - kubelet called allocate for Instance device-pod-ios-agent-1afb6f
[2025-01-08T10:50:23Z TRACE agent::plugin_manager::device_plugin_instance_controller] Sending devices to kubelet: [Device { id: "device-pod-ios-agent-1afb6f-0", health: "Healthy", topology: None }]
[2025-01-08T10:50:23Z TRACE agent::plugin_manager::device_plugin_instance_controller] Plugin Manager: Reconciling device-pod-ios-agent-1afb6f
[2025-01-08T10:50:30Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] reclaiming unused slots - start
[2025-01-08T10:50:30Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] register - before call to register with the kubelet at socket /var/lib/kubelet/pod-resources/kubelet.sock
[2025-01-08T10:50:40Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] reclaiming unused slots - start
[2025-01-08T10:50:40Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] register - before call to register with the kubelet at socket /var/lib/kubelet/pod-resources/kubelet.sock
[2025-01-08T10:50:50Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] reclaiming unused slots - start
[2025-01-08T10:50:50Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] register - before call to register with the kubelet at socket /var/lib/kubelet/pod-resources/kubelet.sock
[2025-01-08T10:50:50Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] freeing slot: akri.sh/device-pod-ios-agent-1afb6f-0
[2025-01-08T10:50:50Z WARN agent::plugin_manager::device_plugin_slot_reclaimer] Failed to free slot akri.sh/device-pod-ios-agent-1afb6f-0, will try again in 10s
[2025-01-08T10:51:00Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] reclaiming unused slots - start
[2025-01-08T10:51:00Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] register - before call to register with the kubelet at socket /var/lib/kubelet/pod-resources/kubelet.sock
[2025-01-08T10:51:00Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] freeing slot: akri.sh/device-pod-ios-agent-1afb6f-0
[2025-01-08T10:51:00Z WARN agent::plugin_manager::device_plugin_slot_reclaimer] Failed to free slot akri.sh/device-pod-ios-agent-1afb6f-0, will try again in 10s
[2025-01-08T10:51:10Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] reclaiming unused slots - start
[2025-01-08T10:51:10Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] register - before call to register with the kubelet at socket /var/lib/kubelet/pod-resources/kubelet.sock
[2025-01-08T10:51:10Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] freeing slot: akri.sh/device-pod-ios-agent-1afb6f-0
[2025-01-08T10:51:10Z WARN agent::plugin_manager::device_plugin_slot_reclaimer] Failed to free slot akri.sh/device-pod-ios-agent-1afb6f-0, will try again in 10s
[2025-01-08T10:51:20Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] reclaiming unused slots - start
[2025-01-08T10:51:20Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] register - before call to register with the kubelet at socket /var/lib/kubelet/pod-resources/kubelet.sock
[2025-01-08T10:51:20Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] freeing slot: akri.sh/device-pod-ios-agent-1afb6f-0
[2025-01-08T10:51:20Z WARN agent::plugin_manager::device_plugin_slot_reclaimer] Failed to free slot akri.sh/device-pod-ios-agent-1afb6f-0, will try again in 10s
Here are the events of the new pod that's created for the Instance:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 2m20s default-scheduler 0/1 nodes are available: 1 Insufficient akri.sh/device-pod-ios-agent-1afb6f. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.
Normal Scheduled 2m12s default-scheduler Successfully assigned mobile-device-system/device-pod-ios-agent-1afb6f-u7uze to talos-aqs-r1t
Additional context
I'm experienced in Go but have practically zero experience in Rust. Here is a draft change that I was able to write with help from the Cursor editor running Claude 3.5 Sonnet. It's hacky, but that was the only simple solution it was able to come up with. I'll update here and in the PR once my testing is complete.