AutoscalingListener not recreated after node failure - CR never gets deletionTimestamp

### Checks

- [x] I've already read https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners-with-actions-runner-controller/troubleshooting-actions-runner-controller-errors and I'm sure my issue is not covered in the troubleshooting guide.
- [x] I am using charts that are officially provided

### Controller Version

0.14.1 (gha-runner-scale-set-controller)

### Deployment Method

Helm

### Checks

- [x] This isn't a question or user support case (For Q&A and community support, go to [Discussions](https://github.com/actions/actions-runner-controller/discussions)).
- [x] I've read the [Changelog](https://github.com/actions/actions-runner-controller/blob/master/docs/gha-runner-scale-set-controller/README.md#changelog) before submitting this issue and I'm sure it's not due to any recently-introduced backward-incompatible changes

### To Reproduce

```markdown
## Steps to Reproduce
  1. Run ARC with replicaCount=2 on a multi-node cluster
  2. Power off the node running the listener pod
  3. Observe listener pod stuck Terminating forever
  4. Check AutoscalingListener CR - no deletionTimestamp is ever set
  5. Check controller logs - zero activity on AutoscalingListener after startup
```

### Describe the bug

  ## ARC Version
  0.14.1 (gha-runner-scale-set-controller)

  ## Kubernetes Version
  v1.28.5+k3s1

  ## Description
  When a worker node running a listener pod is powered off or crashes, the
  AutoscalingListener CR never gets a deletionTimestamp set. The ARC controller
  does not initiate cleanup or recreation of the listener, resulting in permanent
  loss of job pickup for that scale set until manual intervention.


Even after 5 minutes as well, listener pod is not scheduled on another worker node. It was in still terminating mode. 

### Describe the expected behavior

## Expected Behavior
  ARC controller detects listener pod is gone due to node failure and either:
  - Sets deletionTimestamp on the AutoscalingListener CR and recreates it, OR
  - Has a health check timeout that triggers listener recreation

### Additional Context

```yaml
## Evidence
  kubectl get autoscalinglistener <name> -n arc-systems \
    -o jsonpath='{.metadata.deletionTimestamp}'
  # Returns empty — no deletion ever requested

  Controller logs show only EphemeralRunnerSet activity, zero AutoscalingListener
  reconciliation after the node failure.
```

### Controller Logs

```shell
https://gist.github.com/vikasraut499/83b0d07133cf4ddd7d612206b08312fc
```

### Runner Pod Logs

```shell
root@control-plane-ha-2:~/tetrate/cloud/baremetal/terraform# kubectl get pods -n arc-systems -o wide -w
NAME                                             READY   STATUS        RESTARTS   AGE   IP            NODE                NOMINATED NODE   READINESS GATES
arc-gha-rs-controller-69697bf97-nnl8m            1/1     Running       0          69m   10.42.5.81    worker-2-ha-setup   <none>           <none>
arc-gha-rs-controller-69697bf97-r2h5m            1/1     Terminating   0          62m   10.42.3.222   worker-1-ha-setup   <none>           <none>
arc-gha-rs-controller-69697bf97-tqvcm            1/1     Running       0          57m   10.42.5.82    worker-2-ha-setup   <none>           <none>
monorepo-baremetal-runner-ha-764b4bd4-listener   1/1     Running       0          70m   10.42.5.80    worker-2-ha-setup   <none>           <none>
xcp-baremetal-runner-ha-764b4bd4-listener        1/1     Terminating   0          70m   10.42.3.217   worker-1-ha-setup   <none>           <none>
xcp-baremetal-runner-ha-rlr6j-runner-47vps       2/2     Terminating   0          70m   10.42.3.218   worker-1-ha-setup   <none>           <none>
xcp-baremetal-runner-ha-rlr6j-runner-fxqz2       2/2     Terminating   0          70m   10.42.3.220   worker-1-ha-setup   <none>           <none>
xcp-baremetal-runner-ha-rlr6j-runner-j4rvd       2/2     Terminating   0          70m   10.42.3.219   worker-1-ha-setup   <none>           <none>
xcp-baremetal-runner-ha-rlr6j-runner-vzn95       2/2     Running       0          38m   10.42.5.83    worker-2-ha-setup   <none>           <none>
^Croot@control-plane-ha-2:~/tetrate/cloud/baremetal/terraform#

^Croot@control-plane-ha-2:~/tetrate/cloud/baremetal/terraform# kubectl logs xcp-baremetal-runner-ha-rlr6j-runner-j4rvd -n arc-systems
Defaulted container "runner" out of: runner, dind, init-dind-externals (init), init-harbor-creds (init)
Error from server: Get "https://172.16.0.89:10250/containerLogs/arc-systems/xcp-baremetal-runner-ha-rlr6j-runner-j4rvd/runner": proxy error from 127.0.0.1:6443 while dialing 172.16.0.89:10250, code 502: 502 Bad Gateway
root@control-plane-ha-2:~/tetrate/cloud/baremetal/terraform# kubectl logs xcp-baremetal-runner-ha-rlr6j-runner-j4rvd -n arc-systems
Defaulted container "runner" out of: runner, dind, init-dind-externals (init), init-harbor-creds (init)
Error from server: Get "https://172.16.0.89:10250/containerLogs/arc-systems/xcp-baremetal-runner-ha-rlr6j-runner-j4rvd/runner": proxy error from 127.0.0.1:6443 while dialing 172.16.0.89:10250, code 502: 502 Bad Gateway
r
```

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

AutoscalingListener not recreated after node failure - CR never gets deletionTimestamp #4469

Checks

Controller Version

Deployment Method

Checks

To Reproduce

Describe the bug

ARC Version

Kubernetes Version

Description

Describe the expected behavior

Expected Behavior

Additional Context

Controller Logs

Runner Pod Logs

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

AutoscalingListener not recreated after node failure - CR never gets deletionTimestamp #4469

Description

Checks

Controller Version

Deployment Method

Checks

To Reproduce

Describe the bug

ARC Version

Kubernetes Version

Description

Describe the expected behavior

Expected Behavior

Additional Context

Controller Logs

Runner Pod Logs

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions