Allow rolling restart to restart an entire rack in parallel

### What is missing?

Rolling restarts of large clusters are very slow. Rolling restart should be able to take down an entire rack at once, restarting all nodes in that rack in parallel, to speed up the process.
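
To make the request concrete, here is a minimal sketch of the batching I have in mind. It is only an illustration: `plan_rack_restart`, the node names, and the rack names are all hypothetical and not taken from any existing API. It also assumes that having one whole rack down at a time is safe for the cluster.

```python
from collections import defaultdict


def plan_rack_restart(nodes):
    """Group (node_name, rack_name) pairs into one restart batch per rack.

    Hypothetical helper: batches run one after another, while all nodes
    inside a single batch would be restarted in parallel.
    """
    by_rack = defaultdict(list)
    for name, rack in nodes:
        by_rack[rack].append(name)
    # Sort for a deterministic order; a real implementation may order differently.
    return [sorted(by_rack[rack]) for rack in sorted(by_rack)]


# Made-up example cluster: 6 nodes spread over 3 racks.
nodes = [
    ("n1", "rack1"), ("n2", "rack2"), ("n3", "rack1"),
    ("n4", "rack3"), ("n5", "rack2"), ("n6", "rack3"),
]
batches = plan_rack_restart(nodes)
for step, batch in enumerate(batches, start=1):
    print(f"step {step}: restart {batch} in parallel")
```

With this grouping the example cluster needs 3 restart steps (one per rack) instead of 6 (one per node), so the number of steps scales with the rack count rather than the node count.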

### Why is this needed?

A rolling restart of a large cluster might take days, which is unacceptable.