
🐛 BUG: nebula certificate_ttl_seconds emits 0 until initialized #907

Open
@cp-samantha

Description

What version of nebula are you using?

1.7.2

What operating system are you using?

Linux

Describe the Bug

We now monitor our certificate expiries using the new certificate TTL metric built into the Prometheus metrics. We have also set up monitoring rules that notify us when ttl_seconds falls below a certain threshold.

With this monitoring in place, we noticed that directly after a nebula client restarts, the nebula_certificate_ttl_seconds metric reports 0 until the certificate is fully initialized within Nebula. The unfortunate side effect is that our monitors fire every time a nebula client is restarted. The two scrapes below were taken shortly after a restart: the first before the certificate was initialized, the second after.

~# curl -s 127.0.0.1:8090/metrics | grep cert
# HELP nebula_certificate_ttl_seconds certificate.ttl_seconds
# TYPE nebula_certificate_ttl_seconds gauge
nebula_certificate_ttl_seconds 0
~# curl -s 127.0.0.1:8090/metrics | grep cert
# HELP nebula_certificate_ttl_seconds certificate.ttl_seconds
# TYPE nebula_certificate_ttl_seconds gauge
nebula_certificate_ttl_seconds 60803
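
As a monitoring-side stopgap, a check could treat an exact 0 as "not yet initialized" rather than "about to expire". The Go sketch below is only an illustration of that idea: `parseGauge` and `shouldAlert` are hypothetical helpers, not part of Nebula or Prometheus, and the rule assumes that a value of exactly 0 never occurs for a genuinely expiring certificate. If that assumption does not hold, this filter would hide a real expiry.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseGauge returns the value of an unlabelled gauge from Prometheus text
// exposition output. The second result is false if the metric is absent or
// its value cannot be parsed. (Hypothetical helper for illustration.)
func parseGauge(body, name string) (float64, bool) {
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		// Skips "# HELP" / "# TYPE" comments and any other metric.
		if len(fields) < 2 || fields[0] != name {
			continue
		}
		v, err := strconv.ParseFloat(fields[1], 64)
		if err != nil {
			return 0, false
		}
		return v, true
	}
	return 0, false
}

// shouldAlert is a hypothetical rule: ignore a missing metric and an exact 0
// (assumed here to mean "certificate not initialized yet"), and otherwise
// alert when the TTL is below the threshold.
func shouldAlert(ttl float64, present bool, threshold float64) bool {
	if !present || ttl == 0 {
		return false
	}
	return ttl < threshold
}

func main() {
	const name = "nebula_certificate_ttl_seconds"
	const threshold = 86400 // assumed threshold: one day, in seconds

	scrapes := []string{
		"# TYPE nebula_certificate_ttl_seconds gauge\nnebula_certificate_ttl_seconds 0\n",
		"# TYPE nebula_certificate_ttl_seconds gauge\nnebula_certificate_ttl_seconds 60803\n",
	}
	for _, body := range scrapes {
		ttl, ok := parseGauge(body, name)
		fmt.Printf("ttl=%v present=%v alert=%v\n", ttl, ok, shouldAlert(ttl, ok, threshold))
	}
}
```

With the assumed one-day threshold, the first scrape (0) is ignored, while the second (60803) still alerts because it is a real value below the threshold.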

It seems to me it would be better not to emit this metric at all while the value it provides is inaccurate; prometheus will use the last value scraped until a new value is provided to update it.
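
One way to picture the suggested behaviour is a gauge that records whether it has ever been set and is left out of the scrape output until then. This is a minimal sketch under that assumption, not Nebula's actual metrics code: `ttlGauge` and `writeMetrics` are hypothetical names, and only the metric name and HELP/TYPE lines are taken from the output above.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// ttlGauge is a hypothetical gauge that tracks whether a value has ever
// been recorded, so an unset gauge can be told apart from a real 0.
type ttlGauge struct {
	mu    sync.Mutex
	set   bool
	value int64
}

// Update records a new TTL and marks the gauge as initialized.
func (g *ttlGauge) Update(v int64) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.value, g.set = v, true
}

// writeMetrics renders the gauge in Prometheus text format. Until the first
// Update call it returns an empty string, so the metric is not emitted.
func writeMetrics(g *ttlGauge) string {
	g.mu.Lock()
	defer g.mu.Unlock()
	if !g.set {
		return ""
	}
	var b strings.Builder
	b.WriteString("# HELP nebula_certificate_ttl_seconds certificate.ttl_seconds\n")
	b.WriteString("# TYPE nebula_certificate_ttl_seconds gauge\n")
	fmt.Fprintf(&b, "nebula_certificate_ttl_seconds %d\n", g.value)
	return b.String()
}

func main() {
	g := &ttlGauge{}
	fmt.Printf("before init: %q\n", writeMetrics(g)) // nothing emitted yet

	g.Update(60803) // certificate loaded; TTL is now known
	fmt.Print(writeMetrics(g))
}
```

The design choice here is to keep "unknown" distinct from 0 at the source, so monitors do not need a special case for the startup window.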

Logs from affected hosts

No response

Config files from affected hosts

No response

Labels: NeedsDecision (Feedback is required from experts, contributors, and/or the community before a change can be made.)
