Developer Recommended Best Practices #4303
I think the -1 is OK here, as the issue is at the end of the workflow and is thus unlikely to cause worker disconnections.
Yes, I have a few different sets of logs! Since they're on the larger side, I hope it's okay if I point you to their location on the CRC.

This first set of logs is from a run that had 12000 cores and was able to complete with only a few recovery tasks (I think fewer than 10 total):

The second set is from a job that had a similarly large input but had all of its workers evicted at some point and had to recover entirely:

The third is from a job that went relatively normally, barring the slowdown at the end:

Hopefully one or all of these are helpful in some way! In the meantime, I'll give the suggested tune changes a shot to see if they help the behavior.

As for the cause being long-running tasks, I believe we've seen tasks that run long and stall when we've looked into this previously. The problem is that I haven't been able to associate a long-running task with a particular chunk in order to figure out what goes wrong. When I've tried running one chunk at a time in the past to see whether any single chunk causes problems, I've found that individually they all run in similar timeframes. It's only in aggregate that this hang at the end occurs, so I'm not sure what I'm doing that causes these very long tasks.
Hello!
I was hoping to open a post asking about what the best practices are for setting up a successful (modern) TaskVine workflow.
It's been a while since I brought this up in a meeting, but I do still occasionally run into an issue where TaskVine jobs slow down quite a bit towards the end of the workflow (i.e. the final tasks on the progress bar take much longer than any previous ones, despite the output files of these last tasks not being substantially different from those of earlier tasks). I could easily believe that this is due to user error, so I wanted to post my current TaskVine "settings" and hopefully get some feedback on them!
I currently open each of my TaskVine applications with the following:
And call compute with the following:
And for reference, I am currently on ndcctools version 7.15.8. Thank you for the thoughts, and let me know what you think!