Fix data store missing info for submit-failed jobs #6926
cylc/flow/task_job_mgr.py
Outdated
| self.bad_hosts -= exc.hosts_consumed | ||
| self._set_retry_timers(itask, rtconfig) | ||
| # Provide dummy platform otherwise it will show as localhost | ||
| # in the data store: |
Context easily lost, worth linking this in:
| # in the data store: | |
| # in the data store: | |
| # see https://github.com/cylc/cylc-flow/pull/6926 |
Remove spurious submit-failed dummy job in task proxy job list
991d7ee to d1a56d4
| # ... but either way update the job ID in the job proxy (it only | ||
| # comes in via the submission message). |
Now it is done by data_store_mgr.insert_job()
| "time_submit_exit": event_time, | ||
| "submit_status": 1, | ||
| }) | ||
| itask.summary['submit_method_id'] = None |
Don't know why the job ID was being wiped on submit-failure
I don't know either, but there may be a reason. If it's not part of the bugfix, please bump.
Actually, is this the job ID or the job submit method?
It's the job ID in the job runner; this is part 3 of the bugfix.
| self, | ||
| name: str, | ||
| cycle_point: Union['PointBase', str], | ||
| itask: 'TaskProxy', |
Please try to avoid passing itask objects to the data store where possible.
We have had to do this in a couple of places, but we don't need to update the remaining interfaces to match.
In theory, we are supposed to be able to populate the data store out of the database (without the Scheduler or its runtime objects, e.g. TaskProxy) so we can provide offline data.
In truth that isn't possible right now, but we should try to reduce the pain of refactor when the time comes.
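A minimal sketch of the idea, for illustration only. Every name here (`JobRecord`, `insert_job_record`, the row layout) is hypothetical and not the actual cylc-flow interface: the caller unpacks what it needs from the runtime object, and the data-store-side function accepts only plain values, so the same call could later be fed from a database row.

```python
from typing import Optional, TypedDict


class JobRecord(TypedDict):
    """Hypothetical plain-data job entry (not a real cylc-flow type)."""
    name: str
    cycle_point: str
    submit_num: int
    platform: str
    job_id: Optional[str]


def insert_job_record(
    name: str,
    cycle_point: str,
    submit_num: int,
    platform: str,
    job_id: Optional[str] = None,
) -> JobRecord:
    """Hypothetical data-store entry point that takes plain values only.

    Nothing here depends on a scheduler runtime object, so the caller
    can be live scheduler code or an offline database reader.
    """
    return {
        'name': name,
        'cycle_point': cycle_point,
        'submit_num': submit_num,
        'platform': platform,
        'job_id': job_id,
    }


# Live case: the caller unpacks values from its runtime state
# (a plain dict stands in for a task summary here).
summary = {'submit_method_id': '12345'}
live = insert_job_record(
    'foo', '1', 1, 'localhost', summary.get('submit_method_id')
)

# Offline case: the same values read from an assumed database row.
row = ('foo', '1', 1, 'localhost', '12345')
offline = insert_job_record(*row)
```

Both calls produce the same record, which is the property that would make a later refactor towards offline data less painful.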
| execution_time_limit=job_conf.get('execution_time_limit'), | ||
| platform=job_conf['platform']['name'], | ||
| job_runner_name=job_conf.get('job_runner_name'), | ||
| job_id=itask.summary.get('submit_method_id'), |
Note, there is no Job ID for a submission failure caused by a platform lookup error because no job submission was made.
.get() will return None in this case, which is fine
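For reference, a small standalone sketch of that behaviour. Plain dicts stand in for the task summary and the helper `job_id_from` is hypothetical; only the `submit_method_id` key is taken from the diff above.

```python
from typing import Optional


def job_id_from(summary: dict) -> Optional[str]:
    """Return the job ID if one was recorded, else None (no KeyError)."""
    return summary.get('submit_method_id')


# A job that was submitted has an ID recorded by the job runner.
submitted = {'submit_method_id': '12345'}

# Platform lookup failed: no submission was made, so the key is absent.
lookup_failed: dict = {}

# The key may also be present but explicitly set to None.
wiped = {'submit_method_id': None}

results = [job_id_from(s) for s in (submitted, lookup_failed, wiped)]
```

The missing-key and explicit-`None` cases are indistinguishable to the caller, so the data store receives `None` either way.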
Check List
CONTRIBUTING.md and added my name as a Code Contributor.
?.?.x branch.
Footnotes
Submitted time in this case is really the time of failed submission, but useful to know. It is recorded in the DB but wasn't showing in the UI.