
Leader election lost #2112

@rkotech

Description

Please confirm the following

  • I agree to follow this project's code of conduct.
  • I have checked the current issues for duplicates.
  • I understand that the AWX Operator is open source software provided for free and that I might not receive a timely response.

Bug Summary

Basically the same as #808, but that issue never got an answer, so I would like to bring it up again.

With a normal awx-operator deployment, after a random amount of time (a few hours), the operator seems to lose its grip on its own monitoring: it keeps restarting itself, then gives up and stays at 1/2 in CrashLoopBackOff.

```
E0409 16:16:22.208428 7 leaderelection.go:332] error retrieving resource lock awx/awx-operator: Get "https://10.96.0.1:443/apis/coordination.k8s.io/v1/namespaces/awx/leases/awx-operator": context deadline exceeded
I0409 16:16:22.217943 7 leaderelection.go:285] failed to renew lease awx/awx-operator: timed out waiting for the condition
{"level":"info","ts":"2026-04-09T16:16:22Z","msg":"Stopping and waiting for non leader election runnables"}
{"level":"info","ts":"2026-04-09T16:16:22Z","msg":"Stopping and waiting for leader election runnables"}
{"level":"info","ts":"2026-04-09T16:16:22Z","msg":"Stopping and waiting for caches"}
{"level":"info","ts":"2026-04-09T16:16:22Z","msg":"Stopping and waiting for webhooks"}
{"level":"info","ts":"2026-04-09T16:16:22Z","msg":"Stopping and waiting for HTTP servers"}
{"level":"info","ts":"2026-04-09T16:16:22Z","msg":"Wait completed, proceeding to shutdown the manager"}
{"level":"info","ts":"2026-04-09T16:16:22Z","msg":"Shutdown signal received, waiting for all workers to finish","controller":"awx-controller"}
{"level":"info","ts":"2026-04-09T16:16:22Z","msg":"Shutdown signal received, waiting for all workers to finish","controller":"awxrestore-controller"}
{"level":"info","ts":"2026-04-09T16:16:22Z","msg":"Shutdown signal received, waiting for all workers to finish","controller":"awxmeshingress-controller"}
{"level":"info","ts":"2026-04-09T16:16:22Z","msg":"Shutdown signal received, waiting for all workers to finish","controller":"awxbackup-controller"}
{"level":"error","ts":"2026-04-09T16:16:22Z","logger":"cmd","msg":"Proxy or operator exited with error.","error":"leader election lost","stacktrace":"github.com/operator-framework/ansible-operator-plugins/internal/cmd/ansible-operator/run.run\n\tansible-operator-plugins/internal/cmd/ansible-operator/run/cmd.go:261\ngithub.com/operator-framework/ansible-operator-plugins/internal/cmd/ansible-operator/run.NewCmd.func1\n\tansible-operator-plugins/internal/cmd/ansible-operator/run/cmd.go:81\ngithub.com/spf13/cobra.(*Command).execute\n\t/home/runner/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:987\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/home/runner/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:1115\ngithub.com/spf13/cobra.(*Command).Execute\n\t/home/runner/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:1039\nmain.main\n\tansible-operator-plugins/cmd/ansible-operator/main.go:40\nruntime.main\n\t/opt/hostedtoolcache/go/1.20.12/x64/src/runtime/proc.go:250"}
```
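
For what it's worth, the Lease object the operator fails to renew here can be inspected directly. A minimal diagnostic sketch, assuming the namespace and lease name from the log above, using the same `kubernetes.core` collection as the reproduce steps below:

```yaml
- name: Inspect the operator's leader-election Lease
  kubernetes.core.k8s_info:
    api_version: coordination.k8s.io/v1
    kind: Lease
    name: awx-operator
    namespace: awx
  register: lease_info

- name: Show holder identity and last renew time
  ansible.builtin.debug:
    var: lease_info.resources
```

If `spec.renewTime` stops advancing right before the restarts, that confirms the operator really is losing the lease rather than being killed for some other reason.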

AWX Operator version

2.19.1

AWX version

24.6.1

Kubernetes platform

minikube

Kubernetes/Platform version

1.38.1

Modifications

no

Steps to reproduce

Simple deployment with the Ansible Helm chart:

```yaml
- name: Deploy AWX Operator, patience pls.
  kubernetes.core.helm:
    name: awx-operator
    chart_ref: awx-operator
    chart_repo_url: https://ansible-community.github.io/awx-operator-helm/
    release_namespace: "{{ awx_namespace }}"
    create_namespace: true
    wait: true

...

- name: deploy AWX pods
  kubernetes.core.k8s:
    state: present
    wait: true
    definition: "{{ lookup('template', 'awx-manifest.yml.j2') | from_yaml }}"
```

Expected results

I would expect the operator to keep running and to be able to serve its monitoring (health) endpoints continuously.

Actual results

Instead, after some time it fails.

Pod describe:

```
Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Scheduled  20m                  default-scheduler  Successfully assigned awx/awx-operator-controller-manager-5f468697-r5trg to awx-mini
  Normal   Pulled     20m                  kubelet            Container image "quay.io/brancz/kube-rbac-proxy:v0.15.0" already present on machine and can be accessed by the pod
  Normal   Created    20m                  kubelet            Container created
  Normal   Started    20m                  kubelet            Container started
  Warning  Unhealthy  14m                  kubelet            Liveness probe failed: Get "http://10.244.0.22:6789/healthz": EOF
  Warning  Unhealthy  12m (x3 over 19m)    kubelet            Readiness probe failed: Get "http://10.244.0.22:6789/readyz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
  Warning  Unhealthy  12m (x3 over 19m)    kubelet            Liveness probe failed: Get "http://10.244.0.22:6789/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
  Warning  Unhealthy  3m25s (x2 over 12m)  kubelet            Readiness probe failed: Get "http://10.244.0.22:6789/readyz": EOF
  Warning  BackOff    111s (x16 over 18m)  kubelet            Back-off restarting failed container awx-manager in pod awx-operator-controller-manager-5f468697-r5trg_awx(b3c159ee-057a-49c5-8128-776564327dcc)
  Normal   Pulled     39s (x7 over 20m)    kubelet            Container image "quay.io/ansible/awx-operator:2.19.1" already present on machine and can be accessed by the pod
  Normal   Created    38s (x7 over 20m)    kubelet            Container created
  Normal   Started    38s (x7 over 20m)    kubelet            Container started
```

get pods:

```
awx   awx-migration-24.6.1-r95kc                       0/1   Completed          0             16h
awx   awx-operator-controller-manager-5f468697-r5trg   1/2   CrashLoopBackOff   3 (35s ago)   9m5s
awx   awx-task-7ccf57f75c-gfw4z                        4/4   Running            0             16h
awx   awx-web-5d78fb7c79-2wg57                         3/3   Running            0             16h
```

Additional information

Could it be that the liveness probe and readiness probe timeouts are too low (1 second)?
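
Kubernetes does default a probe's `timeoutSeconds` to 1 when it is unset, so as a workaround the timeouts could be raised without forking the chart by patching the deployment. A hedged sketch, assuming the deployment and container names from the events above (`awx-operator-controller-manager` / `awx-manager`); note that a Helm upgrade may revert such a patch:

```yaml
- name: Raise manager probe timeouts (workaround sketch)
  kubernetes.core.k8s:
    state: patched            # strategic merge patch on the existing object
    api_version: apps/v1
    kind: Deployment
    name: awx-operator-controller-manager
    namespace: "{{ awx_namespace }}"
    definition:
      spec:
        template:
          spec:
            containers:
              - name: awx-manager  # containers are merged by name
                livenessProbe:
                  timeoutSeconds: 5
                readinessProbe:
                  timeoutSeconds: 5
```

If raising the timeouts only delays the crash, the underlying API latency (which also breaks the lease renewal) is probably the real culprit.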

Operator Logs

```
...

{"level":"info","ts":"2026-04-10T08:37:56Z","logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/apis/apps/v1/namespaces/awx/deployments/awx-task","Verb":"get","APIPrefix":"apis","APIGroup":"apps","APIVersion":"v1","Namespace":"awx","Resource":"deployments","Subresource":"","Name":"awx-task","Parts":["deployments","awx-task"]}}
{"level":"info","ts":"2026-04-10T08:37:58Z","logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/apis/apps/v1/namespaces/awx/deployments/awx-task","Verb":"get","APIPrefix":"apis","APIGroup":"apps","APIVersion":"v1","Namespace":"awx","Resource":"deployments","Subresource":"","Name":"awx-task","Parts":["deployments","awx-task"]}}
E0410 08:38:13.190444 7 leaderelection.go:369] Failed to update lock: client rate limiter Wait returned an error: context deadline exceeded
I0410 08:38:13.306497 7 leaderelection.go:285] failed to renew lease awx/awx-operator: timed out waiting for the condition
{"level":"info","ts":"2026-04-10T08:38:13Z","msg":"Stopping and waiting for non leader election runnables"}
{"level":"error","ts":"2026-04-10T08:38:13Z","logger":"cmd","msg":"Proxy or operator exited with error.","error":"leader election lost","stacktrace":"github.com/operator-framework/ansible-operator-plugins/internal/cmd/ansible-operator/run.run\n\tansible-operator-plugins/internal/cmd/ansible-operator/run/cmd.go:261\ngithub.com/operator-framework/ansible-operator-plugins/internal/cmd/ansible-operator/run.NewCmd.func1\n\tansible-operator-plugins/internal/cmd/ansible-operator/run/cmd.go:81\ngithub.com/spf13/cobra.(*Command).execute\n\t/home/runner/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:987\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/home/runner/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:1115\ngithub.com/spf13/cobra.(*Command).Execute\n\t/home/runner/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:1039\nmain.main\n\tansible-operator-plugins/cmd/ansible-operator/main.go:40\nruntime.main\n\t/opt/hostedtoolcache/go/1.20.12/x64/src/runtime/proc.go:250"}
```
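
Since the renewal failure is the client timing out against the API server ("client rate limiter Wait returned an error: context deadline exceeded"), it may be worth ruling out resource pressure on the single minikube node before blaming the operator itself. A small sketch along the same lines as the tasks above:

```yaml
- name: Fetch node status
  kubernetes.core.k8s_info:
    kind: Node
  register: node_info

- name: Show pressure-related node conditions
  ansible.builtin.debug:
    msg: >-
      {{ node_info.resources
         | map(attribute='status.conditions')
         | flatten
         | selectattr('type', 'in', ['MemoryPressure', 'DiskPressure', 'PIDPressure'])
         | list }}
```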
