Ignore 'time_minutes' when caching jobs

I understand the rationale for making all of the inputs (inputs and runtime configuration) part of the cache key. It makes sense: a lot of the runtime parameters directly affect computation results (cpu, mem, but mostly `docker`). But it would be nice if `time_minutes` were not included, because it doesn't affect the computation in the same way as the others. It only sets a limit: the computation proceeds identically for different values, as long as the limit is larger than the required time.

## Background

On our SLURM cluster, jobs require a timeout which we set as `time_minutes` in https://github.com/miniwdl-ext/miniwdl-slurm/

This has been working perfectly, but now I'm running someone else's workflow, which doesn't annotate `time_minutes` (nor use any formulae to calculate it per job). I've been guessing at `time_minutes` values, but sometimes I get it wrong, or I encounter new datasets that take longer to run with specific tools.

I've now increased the value in my workflow of 20 datasets, but that has busted the job cache, so I'm going to burn some weeks of CPU time redoing all of the previously completed datasets because one tool in the middle needs longer to finish.

If `time_minutes` could be excluded from the cache key, that would fix my issue. I'm sure I'd need to re-run the workflow end to end at least once to take advantage of that, but that's fine.
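
To illustrate the idea, here is a rough sketch of a cache key that skips limit-only runtime settings. This is not miniwdl's actual implementation: `cache_key` and `IGNORED_RUNTIME_KEYS` are hypothetical names, and I'm assuming the key is a digest over the JSON-serialized inputs and runtime configuration.

```python
import hashlib
import json

# Hypothetical: runtime keys that only limit a job and do not change its output.
IGNORED_RUNTIME_KEYS = frozenset({"time_minutes"})


def cache_key(inputs: dict, runtime: dict) -> str:
    """Digest inputs and runtime settings, skipping the ignored runtime keys."""
    filtered = {k: v for k, v in runtime.items() if k not in IGNORED_RUNTIME_KEYS}
    # sort_keys makes the serialization independent of dict ordering.
    payload = json.dumps({"inputs": inputs, "runtime": filtered}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


inputs = {"sample": "dataset_01"}
short = cache_key(inputs, {"docker": "tool:1.0", "cpu": 4, "time_minutes": 60})
longer = cache_key(inputs, {"docker": "tool:1.0", "cpu": 4, "time_minutes": 600})
other = cache_key(inputs, {"docker": "tool:2.0", "cpu": 4, "time_minutes": 60})

print(short == longer)  # same key: only the time limit differs
print(short == other)   # different key: the docker image differs
```

With something like this, raising `time_minutes` for one slow tool would leave the cache entries of already completed jobs valid, while a change to `docker`, `cpu`, or the inputs would still invalidate them.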

cc @rhpvorderman 

Ignore 'time_minutes' when caching jobs #823
