How to run test in parallel and GPU acceleration? #1277
Hi all, I am new to the QUOKKA code, and I apologize if this is a naive question. I followed the steps in the "installation" part of the documentation, but running "ninja test" seems to use only one CPU core for all the tests, because in the output file the run time is almost equal to the CPU time. I compiled the code in 1D and CPU-only; my script is as follows:

Now I have compiled the code in 3D with GPU enabled. I want to use one node on our supercomputer (24 CPU cores and 4 V100 cards) to run the tests. Could you please tell me how to do this? Yours sincerely,
Replies: 3 comments 12 replies
Hi @tianninglyu! I'm Dosu and I'm helping the quokka team. To run Quokka tests in parallel, use `ctest -j<N>`, where `<N>` is the number of tests to run at once. If you want to specify which GPUs to use, you can set the `CUDA_VISIBLE_DEVICES` environment variable:

    export CUDA_VISIBLE_DEVICES=0,1,2,3
    ctest -j24

This will use all 24 CPU cores and all 4 V100 GPUs on your node.
Hi @tianninglyu, just to add a bit to what the bot said: instructions for how to run on GPUs on a supercomputer tend to be a bit machine-specific, but you can see example scripts to run on GPU for a large number of systems in the
Sorry, one more question: is it normal to run out of memory when running ctest with a 3D GPU build?
OK, in that case you don't want to use `ctest`. Instead, the best way to accomplish this is just to run a single problem on different numbers of GPUs. You can find examples of scripts that do exactly this sort of testing in the `scripts` directory -- for example, the series of scripts `scripts/slurm/frontier-1node.submit`, `frontier-8node.submit`, `frontier-64node.submit` are intended to do a weak scaling test where we run the exact same problem (in this case the `test_hydro3d_blast` problem) using a 512^3 grid on one node, a 1024^3 grid on 8 nodes (so work per node is fixed), a 2048^3 grid on 64 nodes, etc. If you want to do speed testing, you should follow this pattern.
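
To make the "work per node is fixed" point concrete, here is a small sketch of the arithmetic behind those three grid sizes. The helper name `cells_per_node` is hypothetical and is not part of Quokka or its scripts; it only checks that the cell count per node stays constant as the node count grows.

```shell
#!/usr/bin/env bash
# Weak-scaling sanity check: cells per node should be the same for every run.
# cells_per_node is a hypothetical helper for illustration, not a Quokka tool.

# cells_per_node N NODES  ->  prints (N^3) / NODES using integer arithmetic
cells_per_node() {
    echo $(( $1 * $1 * $1 / $2 ))
}

# (cells per dimension, node count) pairs quoted in the reply above
for pair in "512 1" "1024 8" "2048 64"; do
    set -- $pair    # split the pair into $1 (grid size) and $2 (nodes)
    echo "grid=$1^3 nodes=$2 cells_per_node=$(cells_per_node "$1" "$2")"
done
```

Each line reports 134217728 cells per node (512^3). Doubling the grid in every dimension multiplies the total work by 8, so the node count has to grow by 8 as well to keep the load per node unchanged.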