RFC fair-share prioritisation of tape-to-disk user data staging requests #237
tiborsimko started this conversation in Ideas
Replies: 0 comments
When much of the CERN Open Data portal's content lives on tape as the primary storage, users will be able to consult a few sample files kept on disk and then ask for the remaining majority of files to be staged from tape. It may then happen that many people send numerous data staging requests.
To cope with user request storms of this kind, it would be useful to foresee several fair-share staging strategies that govern how requests are taken from the incoming queue.
For inspiration, in REANA, a user sometimes submits several thousand workflows, which may block the queue for days. We have introduced different workflow scheduling strategies that the REANA administrator can set in order to alleviate these concerns. We basically have a simple `fifo` strategy (first in, first out) and a fair-share-oriented `balanced` strategy. The latter is a combination of (i) paying attention to the number of requests submitted by each user (so that if Alice submits 20 and Bob 1, the latter will be taken first when the queue frees up), but also (ii) paying attention to the capabilities of the cluster at the given time (so that if Alice asks for only 1 GB of RAM and Bob for 120 GB of RAM, and the cluster currently has only 24 GB free, then Alice's request will be prioritised).

In the context of the CERN Open Data portal and the tape-to-disk user staging requests, one could think of something similar to achieve fair sharing, prioritising over users, over content, and over the technical capabilities of the system. For example:
Just some examples to stimulate discussion.
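To make the `balanced` idea a bit more concrete for staging, here is a minimal Python sketch. It is not REANA's actual implementation; the names `StagingRequest`, `size_gb`, `pick_next` and `free_disk_gb` are all hypothetical, and free disk space stands in for the "free RAM" of the REANA example. It simply picks, among the requests that currently fit, the one whose user has the fewest requests waiting, falling back to arrival order on ties.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StagingRequest:
    """Hypothetical tape-to-disk staging request."""
    user: str
    size_gb: float  # assumed: disk space needed to stage the requested files


def pick_next(queue: List[StagingRequest], free_disk_gb: float) -> Optional[StagingRequest]:
    """Pick the next request under a simple 'balanced' policy.

    Returns None when no queued request fits into the free disk space.
    """
    # (i) Fairness over users: count how many requests each user has queued.
    pending = Counter(request.user for request in queue)

    # (ii) System capabilities: keep only requests that fit right now.
    candidates = [
        (position, request)
        for position, request in enumerate(queue)
        if request.size_gb <= free_disk_gb
    ]
    if not candidates:
        return None

    # Prefer users with fewer queued requests; break ties by arrival order.
    _, best = min(candidates, key=lambda item: (pending[item[1].user], item[0]))
    return best
```

With 20 small requests from Alice and one 120 GB request from Bob, and only 24 GB free, `pick_next` returns one of Alice's requests because Bob's does not fit. If Bob's request were 10 GB instead, it would be taken first. A `fifo` strategy would correspond to simply returning `queue[0]`.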