A long time ago we limited parallelism in the CI, both in .cirrus.yml and ci/run-ci.
I believe one reason we needed to do that was to deal with the very expensive compilation of code for generated AST nodes, which could put a lot of pressure on memory. That particular code went away with #1462, so we might be able to either fully use all available CPUs again (increasing CI throughput), or reduce the number of CPUs and/or the amount of memory we request (reducing cost).
My hunch is that we should first try to make more use of the machines we currently request, e.g., by testing the behavior of a build without ccache.
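
To make the CPU/memory trade-off concrete, here is a rough sketch of how a job count could be derived from both limits. The function name `pick_jobs` and all numbers are hypothetical and only for illustration; they are not taken from .cirrus.yml or ci/run-ci, and the per-job memory figure would have to be measured on a real build.

```python
import os


def pick_jobs(cpus: int, mem_gib: float, mem_per_job_gib: float) -> int:
    """Return the largest job count that fits both the CPU and memory budget.

    All inputs are assumptions supplied by the caller; nothing here is
    measured from the actual CI.
    """
    if cpus < 1 or mem_per_job_gib <= 0:
        raise ValueError("cpus must be >= 1 and mem_per_job_gib must be > 0")
    # How many jobs fit into memory if each needs mem_per_job_gib at peak.
    by_memory = int(mem_gib // mem_per_job_gib)
    # Never exceed the CPU count, and always run at least one job.
    return max(1, min(cpus, by_memory))


if __name__ == "__main__":
    # Hypothetical machine: local CPU count, 8 GiB RAM, 2 GiB peak per job.
    cpus = os.cpu_count() or 1
    print(pick_jobs(cpus, mem_gib=8, mem_per_job_gib=2))
```

If the peak memory per compile job dropped after #1462, the memory bound would stop being the limiting factor and the job count could go back up to the CPU count.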