Check pointing / in progress results #734
Unanswered
parcbioinfo asked this question in Q&A
Replies: 1 comment
I guess this was a long way of asking whether there are any non-blocking map-reduce functions, which has been discussed before over at futureverse/future.apply#44, so that answers that question. I'll leave this up for the time being in case anyone has general advice on best practice for checkpointing or saving in-progress results.
Hello all - first off, thank you so much for the future framework. It has made sharing code and collaborating within my lab much easier.
Is there an easy, general way to save checkpoints or in-progress results when using any of the map-reduce APIs?
My current use case is an operation which takes ~15 minutes per gene, across ~20,000 genes. This task is not quite important enough to throw onto the formal cluster, so I am using shared resources, and the ability to restart or grab partial results would be highly useful.
I do understand I could do this manually with many smaller calls to the map-reduce APIs, but that would leave a lot of workers sitting idle at the end of each call, which makes a significant difference over the multiple days this is expected to take. (Something like 10 additional hours, if I assume I lose on average 7 minutes of runtime per worker at the end of each chunk and save progress 100 times, i.e. every 1%.)
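Taking the figures in that parenthetical at face value (about 7 minutes of idle time per worker at the end of each chunk, and 100 chunks), the extra wall-clock time works out roughly as follows. This is only a back-of-the-envelope estimate and ignores any other per-call overhead:

$$
100 \times 7\ \text{min} = 700\ \text{min} \approx 11.7\ \text{h}
$$

That is in line with the "something like 10 additional hours" quoted above.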
Right now I'm doing this:
This is working fine, but I can't help feeling that I am reinventing the wheel. No doubt this approach also maximizes the overhead by starting the largest possible number of futures, which feels bad. In some simulations, the above code runs faster than 100 manual map-reduce calls, but slower than a single map-reduce call.
Any advice would be welcome. Thank you!
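For readers looking for a starting point, one pattern that may fit this use case is to let each worker write its own result to disk inside a single `future_lapply()` call, so checkpoints do not depend on when the call returns. The following is a minimal sketch, not the code referred to above. `analyze_gene()`, `run_with_checkpoint()`, the checkpoint directory, and the gene names are all hypothetical stand-ins, and it assumes every worker can see the same filesystem.

```r
library(future)
library(future.apply)

plan(multisession, workers = 2)

# Hypothetical stand-in for the ~15-minute per-gene computation.
analyze_gene <- function(gene) {
  Sys.sleep(0.1)
  nchar(gene)
}

# Compute one gene unless a checkpoint file already exists for it.
# The result is written to a temporary file and then renamed, so an
# interrupted run is unlikely to leave a half-written checkpoint behind.
run_with_checkpoint <- function(gene, dir, fun) {
  path <- file.path(dir, paste0(gene, ".rds"))
  if (file.exists(path)) {
    return(readRDS(path))
  }
  result <- fun(gene)
  tmp <- paste0(path, ".tmp")
  saveRDS(result, tmp)
  file.rename(tmp, path)
  result
}

# Assumption: a directory visible to all workers.
checkpoint_dir <- file.path(tempdir(), "gene_checkpoints")
dir.create(checkpoint_dir, showWarnings = FALSE, recursive = TRUE)

genes <- c("BRCA1", "TP53", "EGFR", "MYC")  # illustrative names only

# A single map-reduce call; re-running it after an interruption skips
# every gene whose checkpoint file is already on disk.
results <- future_lapply(
  genes, run_with_checkpoint,
  dir = checkpoint_dir, fun = analyze_gene,
  future.chunk.size = 1
)
names(results) <- genes

plan(sequential)  # shut down the background workers
```

Because the checkpoint is written inside the worker, the granularity of saved progress stays at one gene regardless of how elements are grouped into futures. `future.chunk.size = 1` gives one future per gene; a larger value should reduce the number of futures at the cost of coarser load balancing. Partial results can be collected at any time by reading the `.rds` files in the checkpoint directory.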