@DougReeder DougReeder commented Jan 9, 2026

What?

Catches exceptions raised in 'HealthController.index/2' and logs the error, along with the location in our code where it occurred, at level :error.

Why?

So that the logs explain why reticulum is unhealthy and point toward what needs to be fixed.

Examples

old log message

request_id=GIjrZYolRUXFTPEAAOXF [info] GET /health
...
request_id=GIjrZYolRUXFTPEAAOXF [warning] Failed to send Sentry event.Cannot send Sentry event because of invalid DSN

Sentry is an error-reporting tool that is not configured here, so this message is unrelated to the actual problem.

new log message

request_id=GIkhm_C-aE93flYAADlB [info] GET /health
request_id=GIkhm_C-aE93flYAADlB [error] Health check failed at health_controller.ex:14: %Protocol.UndefinedError{protocol: Enumerable, value: nil, description: ""}

In this case, we know that some variable is nil at health_controller.ex line 14, and that something is trying to enumerate it.
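To pick these lines out of a longer log, a filter along the following lines may help. It is only a sketch: it assumes the "[error] Health check failed at FILE:LINE: ..." format shown in the new log message above, and the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers. They assume the message format
# "[error] Health check failed at FILE:LINE: ..." shown above.

# Print only the health-check failure lines read from stdin.
health_failures() {
  grep -F '[error] Health check failed at ' || true
}

# Print the FILE:LINE location of each failure line read from stdin.
failure_locations() {
  health_failures |
    sed -n 's/.*Health check failed at \([^:]*:[0-9][0-9]*\):.*/\1/p'
}
```

For example, kubectl logs -l app=reticulum -n hcce | failure_locations would list the reported locations, one per line.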

How to test

  1. run mix test (or look at CI) and observe that the new automated test for reticulum being unhealthy passes
  2. deploy to an instance
  3. from a pod, run curl -i http://ret:4001/health and observe that "ok" is still returned
  4. run kubectl scale deploy spoke --replicas=0 -n hcce to kill Spoke
  5. run kubectl scale deploy reticulum --replicas=0 -n hcce, then run kubectl scale deploy reticulum --replicas=1 -n hcce, to restart reticulum with empty caches
  6. run kubectl logs -l app=reticulum -f -n hcce and observe a log message including "[error] Health check failed at health_controller.ex:16: %Protocol.UndefinedError{protocol: Enumerable, value: nil, description: ""}", showing that the Hubs tests pass but the Spoke test fails.
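Steps 4 to 6 can be wrapped in a small script for repeated runs. This is a sketch, not part of the PR: the kubectl commands are the ones listed above, while the function names, the NS variable, and the DRY_RUN switch are hypothetical conveniences.

```shell
#!/bin/sh
# Sketch of steps 4-6 above. NS, DRY_RUN and the function names are
# hypothetical; the kubectl commands are those listed in the steps.
NS="${NS:-hcce}"

# With DRY_RUN=1, print each command instead of running it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Step 4: scale Spoke down to zero replicas.
kill_spoke() {
  run kubectl scale deploy spoke --replicas=0 -n "$NS"
}

# Step 5: restart reticulum so it comes up with empty caches.
restart_reticulum() {
  run kubectl scale deploy reticulum --replicas=0 -n "$NS" &&
    run kubectl scale deploy reticulum --replicas=1 -n "$NS"
}

# Step 6: follow the reticulum logs and watch for the [error] line.
follow_reticulum_logs() {
  run kubectl logs -l app=reticulum -f -n "$NS"
}
```

Setting DRY_RUN=1 before calling kill_spoke, restart_reticulum, and follow_reticulum_logs in that order prints the commands without touching the cluster, which makes it easy to check the namespace first.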

Documentation of functionality

No documentation change; this just produces better logs when reticulum is not healthy

Open questions

Could Cachex and RoomAssigner be mocked, so an automated test for reticulum being healthy could be written?

Additional details or related context

Written with the help of JetBrains' Junie LLM, but reworked by me.
The end result is similar to page_controller.ex lines 844–847.

Why: :warn is obsolete and the test framework complains
Why: So it can be understood *why* reticulum is unhealthy

Written with the help of JetBrains' Junie LLM, but reworked by me.
The end result is essentially a copy of page_controller.ex lines 844–847.
@DougReeder DougReeder marked this pull request as draft January 9, 2026 08:28
Why: to make log messages more useful for debugging

The code to extract the stack frame entry was written with the help of JetBrains' Junie LLM.
@DougReeder DougReeder marked this pull request as ready for review January 9, 2026 17:41
@DougReeder DougReeder requested a review from Exairnous January 9, 2026 17:41
@DougReeder DougReeder changed the title Catches exceptions in 'HealthController.index/2' & logs message Health probe failures log useful message Jan 15, 2026