Bugs and issues running oq on a cluster #10377

Dear OQ team, I am running tests with OpenQuake on a cluster (currently CINECA). I have some feedback and have run into a few issues. First, for reproducibility, my setup:

The cluster staff preferred not to install OQ as a module. I am using a "spack" environment, in which I added and installed python and pip. Using the pip from this environment, I then installed OpenQuake by hand (first the correct requirements file, then the engine as an editable install). Installing OQ this way works very well for me and gives me the most flexibility.

Bug No 1:
Possibly not even cluster related. I had an `openquake.cfg` in my user's home directory, but I expected `oq-engine/openquake/engine/openquake.cfg` to be used. In this case most settings (such as `slurm_time`) were indeed taken from the `oq-engine/...` configuration file, but `submit_cmd` was taken from the `openquake.cfg` in my home directory!
Running the openquake command with `-c CONFIG_FILE_PATH` had no effect.
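
For illustration, a mix of values like this is what layered config loading produces when several files are read in order and later files override earlier ones key by key. The sketch below uses Python's standard `configparser`. Whether the engine loads its configuration this way is an assumption, and the section name `distribution`, the file names and all values are made up for the example.

```python
import configparser
import tempfile
from pathlib import Path


def load_layered(paths):
    """Read config files in order; later files override earlier ones per key."""
    parser = configparser.ConfigParser()
    parser.read([str(p) for p in paths])
    return parser


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the file shipped with the engine (hypothetical values).
    packaged = Path(tmp) / "packaged_openquake.cfg"
    packaged.write_text(
        "[distribution]\n"
        "slurm_time = 12:00:00\n"
        "submit_cmd = sbatch --account=project_a oq run\n"
    )
    # Stand-in for the file in the user's home directory (hypothetical values).
    home = Path(tmp) / "home_openquake.cfg"
    home.write_text("[distribution]\nsubmit_cmd = sbatch oq run\n")

    cfg = load_layered([packaged, home])
    merged = dict(cfg["distribution"])

# slurm_time comes from the packaged file, submit_cmd from the home file.
print(merged)
```

With this kind of loading, a stale file in the home directory silently overrides only the keys it defines, which matches the symptom described above.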

Bug No 2:
`openquake/baselib/slurm.py`, line 7, splits the `submit_cmd` on spaces, which causes flags such as `--account=my_account` to be ignored.
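
As a possible direction for a fix, the standard library's `shlex.split` tokenizes a command line the way a POSIX shell would, so every flag survives as its own token and quoted values stay intact. This is only a sketch and not the code in `slurm.py`. The example `submit_cmd` value and the `--job-name` flag are assumptions added for illustration.

```python
import shlex

# Hypothetical submit_cmd value, including a quoted argument with a space.
submit_cmd = 'sbatch --account=my_account --job-name="oq test" oq run'

# Splitting on single spaces breaks the quoted value into two tokens.
naive = submit_cmd.split(" ")

# shlex.split follows POSIX shell quoting rules and keeps it together.
robust = shlex.split(submit_cmd)

# All flags are still present and can be forwarded to the scheduler.
flags = [tok for tok in robust if tok.startswith("-")]

print(naive)
print(robust)
print(flags)
```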


Issue:
Passing Slurm options to OpenQuake calculations is currently very cumbersome. A few have their own keys in the `cfg` file, like time and cpus-per-task; the number of nodes is passed directly to `oq engine --run`; and all other flags must be added to the `submit_cmd` in the `cfg` file. This is very inflexible, especially because some flags need to change from run to run, and each time I have to edit the `openquake.cfg` file.
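
A minimal sketch of the kind of per-run override that would help: keep the defaults in `submit_cmd` and merge in options given for a single run. The helper `build_submit_cmd` is hypothetical and does not exist in the engine. It only handles flags written as `--name=value`, and the account and partition names are made up.

```python
import shlex


def build_submit_cmd(base_cmd, overrides):
    """Merge per-run options into a base submit command.

    `overrides` maps long option names (without the leading dashes) to values.
    A flag in `base_cmd` with the same name is replaced. Only the
    `--name=value` form is recognized.
    """
    argv = shlex.split(base_cmd)
    names = {f"--{name}" for name in overrides}
    # Drop base flags that the caller overrides for this run.
    kept = [tok for tok in argv if tok.split("=", 1)[0] not in names]
    extra = [f"--{name}={value}" for name, value in overrides.items()]
    # Put the per-run flags right after the program name.
    return kept[:1] + extra + kept[1:]


base = "sbatch --account=default_acct --partition=normal"
argv = build_submit_cmd(base, {"account": "my_account", "mem": "64G"})
print(argv)
```

Something along these lines would let the defaults stay in `openquake.cfg` while run-specific flags come from the command line or the job file.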

Is there a way to simply run OpenQuake using `srun`, or at least to write my own sbatch file and run OpenQuake that way? If not, I would very much like such an option!


