What is missing?
Rolling restarts of large clusters are very slow. Rolling restart should be allowed to take down an entire rack at once to speed up the process.
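To make the request concrete, here is a minimal sketch of the batching idea, not an implementation of any existing operator code. The function name `plan_restart_batches`, the `rack_at_once` flag, and the node and rack names are all hypothetical. It assumes the baseline restarts one node per step, and that replicas are spread across racks so a whole rack can be offline without losing availability.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def plan_restart_batches(
    nodes: List[Tuple[str, str]], rack_at_once: bool
) -> List[List[str]]:
    """Return the restart plan as a list of batches.

    nodes: (node_name, rack_name) pairs (hypothetical naming).
    Batches run one after another; nodes inside a batch restart together.
    """
    if not rack_at_once:
        # Assumed baseline: every node is its own batch.
        return [[name] for name, _ in nodes]

    # Proposed behaviour: group nodes by rack, one batch per rack.
    by_rack: Dict[str, List[str]] = defaultdict(list)
    for name, rack in nodes:
        by_rack[rack].append(name)
    return [by_rack[rack] for rack in sorted(by_rack)]


# Hypothetical cluster: 3 racks with 100 nodes each.
nodes = [(f"node-{r}-{i}", f"rack{r}") for r in range(1, 4) for i in range(100)]
one_by_one = plan_restart_batches(nodes, rack_at_once=False)
per_rack = plan_restart_batches(nodes, rack_at_once=True)
print(len(one_by_one), len(per_rack))
```

For this made-up 300-node cluster the plan shrinks from 300 sequential steps to 3, which is the kind of speed-up being asked for.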
Why is this needed?
It is unacceptable for a large cluster to take days to restart.