Bugs and issues running oq on a cluster #10377

Open
@schmidni

Dear OQ team, I am running tests with OpenQuake on a cluster (currently CINECA). I have some feedback and also ran into some issues. First, for reproducibility, here is my setup:

The cluster administrators preferred not to install OQ as a module, so I am using a "spack" environment in which I added and installed python and pip. Using the pip from this environment, I then installed OpenQuake by hand (first the correct requirements file, then the engine as an editable install). Installing OQ this way works very well for me and gives me the most flexibility.

Bug No 1:
Possibly not even cluster related. I had an openquake.cfg in my user's home directory, but I expected oq-engine/openquake/engine/openquake.cfg to be used. In this specific case, most settings (such as slurm_time) were in fact taken from the oq-engine/... configuration file, but submit_cmd was taken from the openquake.cfg in my home directory!
Running the openquake command with -c CONFIG_FILE_PATH had no effect.
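
The mixed result described above is what a layered configuration produces when several files are read in sequence and later files override earlier ones key by key. The following is only a minimal sketch of that mechanism using Python's standard `configparser`; it is not taken from the engine's code. The section name `distribution`, the file names, and the option values are assumptions chosen for illustration.

```python
import configparser
import tempfile
from pathlib import Path


def load_layered_config(paths):
    """Read config files in order; later files override earlier ones per key."""
    parser = configparser.ConfigParser()
    parser.read([str(p) for p in paths])
    return parser


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-ins for the packaged file and the one in the home directory.
    packaged = Path(tmp) / "packaged_openquake.cfg"
    home = Path(tmp) / "home_openquake.cfg"
    packaged.write_text(
        "[distribution]\n"
        "slurm_time = 12:00:00\n"
        "submit_cmd = sbatch --account=my_account oq run\n"
    )
    # The home file defines only one key, so only that key is overridden.
    home.write_text("[distribution]\nsubmit_cmd = sbatch oq run\n")

    cfg = load_layered_config([packaged, home])
    merged = dict(cfg["distribution"])

print(merged)
```

With this kind of layering, `slurm_time` comes from the first file and `submit_cmd` from the second, which matches the symptom reported here. Whether the engine actually merges its files this way is not confirmed by this sketch.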

Bug No 2:
openquake/baselib/slurm.py, line 7, splits the submit_cmd on spaces, which causes flags such as --account=my_account to be ignored.
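
One possible way to make the splitting more robust is shell-style tokenization with `shlex.split` from the standard library. This is a sketch under the assumption that submit_cmd is a single string; it does not reproduce what slurm.py currently does, and `split_submit_cmd` and the example command are hypothetical.

```python
import shlex


def split_submit_cmd(submit_cmd):
    """Tokenize a command string the way a POSIX shell would (hypothetical helper)."""
    return shlex.split(submit_cmd)


# Example value only; the job name is quoted because it contains a space.
cmd = 'sbatch --account=my_account --job-name="oq test" oq run'

naive = cmd.split(" ")          # breaks the quoted job name into two tokens
robust = split_submit_cmd(cmd)  # keeps each flag and its value together

print(naive)
print(robust)
```

Splitting on plain spaces breaks any quoted value that contains a space, while `shlex.split` keeps `--job-name=oq test` as one argument and leaves `--account=my_account` intact.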

Issue:
Passing Slurm options to OpenQuake calculations is currently very cumbersome. A few have their own keys in the cfg file, like time and cpus-per-task; the number of nodes is passed directly to oq engine --run; and all other flags must be added to the submit_cmd in the cfg file. This is very inflexible! Some flags need to be adapted from run to run, and every time I have to edit the openquake.cfg file to do that.
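
To illustrate the kind of flexibility meant here, the sketch below merges per-run sbatch options into a base submit_cmd without touching the cfg file. `build_submit_cmd` is a hypothetical helper, not an existing engine function, and the base command and the partition name are assumed example values.

```python
import shlex


def build_submit_cmd(base_cmd, overrides):
    """Hypothetical helper: apply per-run sbatch options on top of a base command.

    Options given in `overrides` replace any `--key=value` flag with the same
    key in `base_cmd`; everything else in the base command is kept in order.
    """
    argv = shlex.split(base_cmd)
    extra = [f"--{key}={value}" for key, value in overrides.items()]
    overridden = tuple(f"--{key}=" for key in overrides)
    kept = [arg for arg in argv[1:] if not arg.startswith(overridden)]
    return [argv[0], *extra, *kept]


# Assumed base value as it might appear in the cfg file, plus per-run changes.
base = "sbatch --account=my_account oq run"
argv = build_submit_cmd(base, {"partition": "debug", "account": "other"})

print(argv)
```

Only flags written in the `--key=value` form are recognized by this sketch; short options such as `-A` would need extra handling.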

Is there a way I could simply run OpenQuake using srun, or at least write my own sbatch file and run OpenQuake that way? If not, I would very much like such an option!
