### Is your feature request related to a problem? Please describe.
If the Ra data directory is on a disk that is completely full, multiple processes in the Ra supervision tree can hit max restart intensity and terminate permanently, leaving the Raft subsystem unavailable. This has a few consequences for RabbitMQ:
- Since `ra_log_ets` is terminated, the `ra_directory` ETS and DETS tables are closed, so `ra:force_delete_server/2` always fails and QQs cannot be deleted to reclaim space. (`ra_directory:where_is_parent/2` crashes in `ra_server_sup_sup:prepare_server_stop/2`.)
- When using Khepri, the metadata store server process stops, so not even stale data can be served by local queries or projections.
- Since user metadata is stored in the metadata store, you can't log into the server.
- This also causes deletion of all vhosts, since `rabbit_vhost_process`'s regular check notices that the vhost is not returned by the metadata store.
### Describe the solution you'd like
Ideally, we would stop accepting writes once we see `enospc` but keep existing servers running and serving reads. One option is to shut down the WAL process, or to enter a 'read-only' mode that rejects writes rather than crashing repeatedly.
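As a rough illustration of the 'read-only' mode idea — a sketch in Python rather than Ra's actual Erlang code, with `WalWriter` and all of its methods invented for this example — the writer could trap `ENOSPC` on append, flip a flag, and keep serving reads from existing state:

```python
import errno


class WalWriter:
    """Hypothetical sketch of a WAL that degrades to read-only on a
    full disk instead of crashing and being restarted repeatedly."""

    def __init__(self, path):
        self.path = path
        self.read_only = False
        self.entries = []  # in-memory copy used to serve reads

    def append(self, entry: bytes):
        if self.read_only:
            raise RuntimeError("WAL is read-only: disk full")
        try:
            with open(self.path, "ab") as f:
                f.write(entry)
        except OSError as e:
            if e.errno == errno.ENOSPC:
                # Stop accepting writes, but do not crash: existing
                # state stays available for reads.
                self.read_only = True
                raise RuntimeError("WAL is read-only: disk full") from e
            raise
        self.entries.append(entry)

    def read_all(self):
        # Reads keep working whether or not we are read-only.
        return list(self.entries)
```

The key property is that a full disk changes the answer to "can I write?" without taking down the processes that answer "what is the current state?".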
### Describe alternatives you've considered
Alternatively, for RabbitMQ, we could crash the whole node when `ra_sup` exits. Today RabbitMQ continues to run despite Ra being offline, which leads to the consequences listed above; having `enospc` crash the server entirely would avoid them.
### Additional context
Reproduction steps for local `enospc` on a single node...
This only works on Linux, as macOS doesn't have tmpfs. For this test I'm running off of the tip of `v4.2.x` (`05eee5deb9ecd40beed549c31e4349781fd004ff` at time of writing).
```shell
mkdir /tmp/raft-data
sudo mount -t tmpfs -o size=500M data /tmp/raft-data
```
```ini
# raft.conf
raft.data_dir = /tmp/raft-data
# Tuning down the WAL size makes ra_log_wal start and crash faster
# during recovery, increasing the chances of hitting max restart intensity.
raft.wal_max_size_bytes = 67108864
```
```shell
make run-broker RABBITMQ_CONFIG_FILE=raft.conf
perf-test -qq -u qq -qpf 1 -qpt 10 -qp qq-%d -x 5 -y 0 -c 100 --rate 1000
```
Eventually perf-test will start throwing `com.rabbitmq.client.AuthenticationFailureException: ACCESS_REFUSED`, as the metadata store is returning no users.
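As a side note, `enospc` can be reached sooner than waiting for perf-test to fill the 500M tmpfs with WAL data by pre-filling most of the mount first. This is a hypothetical helper, not part of the repro above; `fill_disk`, the filler file name, and the free-space margin are all made up for illustration:

```python
import os


def fill_disk(path, leave_mb=5, chunk_mb=1):
    """Write zeros into a filler file under `path` until roughly
    `leave_mb` MB of free space remain, so the next WAL write
    hits ENOSPC almost immediately."""
    chunk = b"\0" * (chunk_mb * 1024 * 1024)
    filler = os.path.join(path, "filler.bin")
    with open(filler, "wb") as f:
        while True:
            st = os.statvfs(path)
            free = st.f_bavail * st.f_frsize
            if free <= leave_mb * 1024 * 1024:
                break
            f.write(chunk)
            f.flush()
    return filler


# e.g. fill_disk("/tmp/raft-data")  # leave ~5 MB free on the tmpfs
```

Deleting the filler file afterwards is also a convenient way to confirm whether the broker recovers once space is available again.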