Description
While reducing the alignment sizes of my current dataset in order to be able to compute couplings on the GPU, I noticed a large discrepancy between results from the formula in the README and the actual RAM needed when running CCMpred.
I know that CCMpred is no longer actively maintained, but to help fellow researchers who run into the same issue, here is the corrected formula, based on the calculation in the source code (ccmpred.c, lines 437-441):
Padded: 4 * (4 * (L * L * 32 * 21 + L * 20) + N * L * 2 + N * L * 32 + N) + 2 * N * L
Unpadded: 4 * (4 * (L * L * 21 * 21 + L * 20) + N * L * 2 + N * L * 21 + N) + 2 * N * L
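For anyone who wants to plug in their own numbers, here is a minimal Python sketch of the two formulas above. The function name `ccmpred_mem_bytes` and its parameters are my own and not part of CCMpred. I am assuming that L is the number of alignment columns, N is the number of sequences, and the result is in bytes.

```python
def ccmpred_mem_bytes(n_seqs: int, n_cols: int, padded: bool = True) -> int:
    """Estimate memory from the corrected formulas (hypothetical helper).

    n_seqs is N and n_cols is L in the formulas; treating them as sequence
    count and alignment length, and the result as bytes, is an assumption.
    """
    # The padded variant uses 32 where the unpadded variant uses 21.
    width = 32 if padded else 21

    # L * L * width * 21 + L * 20
    model_terms = n_cols * n_cols * width * 21 + n_cols * 20

    # N * L * 2 + N * L * width + N
    alignment_terms = n_seqs * n_cols * 2 + n_seqs * n_cols * width + n_seqs

    return 4 * (4 * model_terms + alignment_terms) + 2 * n_seqs * n_cols


if __name__ == "__main__":
    # Example values chosen only for illustration.
    n_seqs, n_cols = 5000, 300
    for padded in (True, False):
        estimate = ccmpred_mem_bytes(n_seqs, n_cols, padded)
        label = "padded" if padded else "unpadded"
        print(f"{label}: {estimate / 1024**2:.1f} MiB")
```

Both formulas share the same shape and differ only in the 32 versus 21 factor, so a single function with a `padded` switch covers both cases.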
However, the internal size_t mem_needed is only used for the output part; the actual allocation happens separately, across a variety of different memory blocks. I'll do some further testing with samples calculated to barely fit into GPU memory to see whether the CUDA allocations match this estimate.