Conversation

@tdrwenski
Collaborator

An improved solution to the issue addressed by #66.

Currently, if a performance job is canceled in GitLab CI or hits the GitLab job-runner time limit, the underlying Flux or Slurm job does not get canceled.

This PR addresses that by using a Jacamar batch executor instead of the shell executor. This way Jacamar itself handles submitting the batch job and cleaning it up if the CI job is canceled or hits the timeout.
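As a rough sketch of what this change looks like on the runner side (the runner name and scheduler value below are illustrative assumptions, not this repository's actual configuration), Jacamar selects its executor in its TOML configuration, switching from `shell` to a batch scheduler:

```toml
# Hypothetical Jacamar configuration sketch -- values assumed for
# illustration, not taken from this repository.
[general]
name = "perf-runner"
# With a batch executor, Jacamar submits the job to the scheduler
# itself and cancels it when the GitLab CI job is canceled or
# hits the runner time limit.
executor = "slurm"  # or "flux"; previously "shell"
```

Because Jacamar owns the scheduler submission, it can issue the corresponding cancellation when GitLab tears down the CI job, which is what the shell-executor setup could not do.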

I have tested this for both the component and non-component cases, covering canceling a queued job, canceling a running job, and letting the job hit the GitLab timeout. In all cases the Flux/Slurm job was cleaned up automatically.
