[operator] healthchecks stuck in RUNNING state #3197

Open
@fruch

Description

During the k8s functional tests, we sometimes get into a situation where the health checks stop progressing:

< t:2022-07-12 08:33:01,259 f:cli.py          l:1061 c:sdcm.mgmt.cli        p:DEBUG > Issuing: 'sctool tasks -c db77d799-dfc0-4ff4-91f5-81d792ccfa49'
< t:2022-07-12 08:33:01,259 f:remote_base.py  l:520  c:KubernetesCmdRunner  p:DEBUG > Running command "sctool tasks -c db77d799-dfc0-4ff4-91f5-81d792ccfa49"...
< t:2022-07-12 08:33:01,372 f:base.py         l:222  c:KubernetesCmdRunner  p:DEBUG > +------------------------+-------------+--------+----------+---------+-------+--------------+------------+---------+------+
< t:2022-07-12 08:33:01,372 f:base.py         l:222  c:KubernetesCmdRunner  p:DEBUG > | Task                   | Schedule    | Window | Timezone | Success | Error | Last Success | Last Error | Status  | Next |
< t:2022-07-12 08:33:01,372 f:base.py         l:222  c:KubernetesCmdRunner  p:DEBUG > +------------------------+-------------+--------+----------+---------+-------+--------------+------------+---------+------+
< t:2022-07-12 08:33:01,372 f:base.py         l:222  c:KubernetesCmdRunner  p:DEBUG > | healthcheck/rest       | @every 1m0s |        | UTC      | 12      | 2     | 9m50s ago    | 8m50s ago  | RUNNING |      |
< t:2022-07-12 08:33:01,372 f:base.py         l:222  c:KubernetesCmdRunner  p:DEBUG > | healthcheck/alternator | @every 15s  |        | UTC      | 53      | 6     | 9m5s ago     | 7m26s ago  | RUNNING |      |
< t:2022-07-12 08:33:01,372 f:base.py         l:222  c:KubernetesCmdRunner  p:DEBUG > | healthcheck/cql        | @every 15s  |        | UTC      | 53      | 6     | 9m5s ago     | 3m5s ago   | RUNNING |      |
< t:2022-07-12 08:33:01,372 f:base.py         l:222  c:KubernetesCmdRunner  p:DEBUG > +------------------------+-------------+--------+----------+---------+-------+--------------+------------+---------+------+
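A snapshot like the one above can be checked automatically. The following is a minimal sketch, not part of the test framework: it assumes the table layout shown in this log, that each logged line may carry a logger prefix before the first `|`, and that durations use the `1m0s` style seen here. The names `parse_sctool_tasks`, `parse_duration`, and `stale_tasks` are hypothetical helpers.

```python
import re

def parse_sctool_tasks(output: str) -> list[dict[str, str]]:
    """Turn `sctool tasks` table text (optionally with log prefixes) into row dicts."""
    header = None
    rows = []
    for line in output.splitlines():
        start = line.find("|")
        if start == -1:
            continue  # border lines (+---+) and unrelated log lines have no '|'
        # Drop the empty pieces before the first and after the last pipe.
        cells = [c.strip() for c in line[start:].rstrip().split("|")[1:-1]]
        if header is None:
            header = cells  # first table row is the column header
        else:
            rows.append(dict(zip(header, cells)))
    return rows

def parse_duration(text: str) -> int:
    """Convert text such as '9m50s ago' or '@every 1m0s' to seconds (0 if none found)."""
    units = {"h": 3600, "m": 60, "s": 1}
    return sum(int(n) * units[u] for n, u in re.findall(r"(\d+)([hms])", text))

def stale_tasks(rows: list[dict[str, str]], max_missed: int = 3) -> list[str]:
    """Names of RUNNING tasks whose last success is older than max_missed intervals."""
    stale = []
    for row in rows:
        interval = parse_duration(row.get("Schedule", ""))
        age = parse_duration(row.get("Last Success", ""))
        # A task with no schedule or no recorded success parses to 0 and is skipped.
        if row.get("Status") == "RUNNING" and interval and age > max_missed * interval:
            stale.append(row["Task"])
    return stale
```

Applied to the snapshot above with `max_missed=3`, this reports all three healthcheck tasks, since their last success is roughly 9 minutes old against 15s and 1m schedules.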

After 10 minutes, all health checks are still in the RUNNING state instead of completing as we expect:

< t:2022-07-12 08:43:14,095 f:base.py         l:142  c:KubernetesCmdRunner  p:DEBUG > Command "sctool tasks -c db77d799-dfc0-4ff4-91f5-81d792ccfa49" finished with status 0
< t:2022-07-12 08:43:14,095 f:cli.py          l:1064 c:sdcm.mgmt.cli        p:DEBUG > sctool output: +------------------------+-------------+--------+----------+---------+-------+--------------+------------+---------+------+
< t:2022-07-12 08:43:14,095 f:cli.py          l:1064 c:sdcm.mgmt.cli        p:DEBUG > | Task                   | Schedule    | Window | Timezone | Success | Error | Last Success | Last Error | Status  | Next |
< t:2022-07-12 08:43:14,095 f:cli.py          l:1064 c:sdcm.mgmt.cli        p:DEBUG > +------------------------+-------------+--------+----------+---------+-------+--------------+------------+---------+------+
< t:2022-07-12 08:43:14,095 f:cli.py          l:1064 c:sdcm.mgmt.cli        p:DEBUG > | healthcheck/rest       | @every 1m0s |        | UTC      | 12      | 2     | 20m3s ago    | 19m3s ago  | RUNNING |      |
< t:2022-07-12 08:43:14,095 f:cli.py          l:1064 c:sdcm.mgmt.cli        p:DEBUG > | healthcheck/alternator | @every 15s  |        | UTC      | 53      | 6     | 19m18s ago   | 17m39s ago | RUNNING |      |
< t:2022-07-12 08:43:14,095 f:cli.py          l:1064 c:sdcm.mgmt.cli        p:DEBUG > | healthcheck/cql        | @every 15s  |        | UTC      | 53      | 6     | 19m18s ago   | 13m18s ago | RUNNING |      |
< t:2022-07-12 08:43:14,095 f:cli.py          l:1064 c:sdcm.mgmt.cli        p:DEBUG > +------------------------+-------------+--------+----------+---------+-------+--------------+------------+---------+------+
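The two snapshots make the symptom easy to state: the `Success` and `Error` counters do not move while `Status` stays `RUNNING`. Below is a rough sketch of that comparison, assuming each snapshot has already been parsed into dictionaries keyed by the column names shown in the tables; `stuck_tasks` is a hypothetical helper, not an existing function in the test code.

```python
def stuck_tasks(before: list[dict[str, str]], after: list[dict[str, str]]) -> list[str]:
    """Names of tasks that are RUNNING in both snapshots with unchanged counters."""
    previous = {row["Task"]: row for row in before}
    stuck = []
    for row in after:
        old = previous.get(row["Task"])
        if old is None:
            continue  # task did not exist in the earlier snapshot
        still_running = old["Status"] == "RUNNING" and row["Status"] == "RUNNING"
        no_progress = (old["Success"], old["Error"]) == (row["Success"], row["Error"])
        if still_running and no_progress:
            stuck.append(row["Task"])
    return stuck

# Values copied from the 08:33 and 08:43 snapshots in this report.
snapshot_0833 = [
    {"Task": "healthcheck/rest", "Success": "12", "Error": "2", "Status": "RUNNING"},
    {"Task": "healthcheck/alternator", "Success": "53", "Error": "6", "Status": "RUNNING"},
    {"Task": "healthcheck/cql", "Success": "53", "Error": "6", "Status": "RUNNING"},
]
snapshot_0843 = [dict(row) for row in snapshot_0833]  # counters identical 10 minutes later

print(stuck_tasks(snapshot_0833, snapshot_0843))
```

Comparing counters rather than the "Last Success" age avoids having to pick a staleness threshold, at the cost of needing two samples spaced further apart than the longest schedule interval.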

Logs

Labels

bug (Something isn't working)