Replies: 4 comments 4 replies
-
Hi James! Great question! The immutability thing was something we discussed a lot. "since it makes for reliable predictability and reproducibility (for scheduling decisions, debug & diagnosis etc.)" -- very much this, but combined with the observation that immutable jobs seem to be pretty common in existing (public) render management systems, it's what tipped the scales in favour of our proposal. I'll admit to not being up to date on modern DCC tools increasingly adding job mutability. Would you mind pointing me at some examples so that I can do some learning? Thanks!
-
This is a fantastic topic to dig deeper on. One lens we looked at it through: assuming we want to support both approaches, which one should we start with? We concluded that expressing the structure of the job ahead of time is the place to start, because it's the more constrained approach. By trying to express a variety of jobs that don't quite fit the schema, we can learn which specific dynamic job structures we might add, and use the specification's RFC process to add incremental features that satisfy the need. Since submitting extensions to a job is not part of the spec, very dynamic self-expanding jobs aren't something it supports in a portable way that's independent of a specific render farm. You can still do this by including the render farm's own job submission command inside the job. When people do that, the resulting jobs will be useful for building similar abstractions into the spec itself. I hope this approach leads us to a more broadly useful spec than if we had started by trying to include it.
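To make the "farm's own submission command inside the job" escape hatch concrete, here is a minimal, hedged sketch. `farm_submit` is a placeholder for whatever non-portable CLI a specific scheduler provides, and `discover_frames` is a stub for the run-time scope discovery a DCC would do; neither name comes from the spec or this thread.

```python
def discover_frames(scene_file: str):
    """Stub: in reality the DCC inspects the scene at run time,
    so the task count is unknowable at submission time."""
    return range(1, 4)

def expansion_step_commands(scene_file: str):
    """Commands a self-expanding job step would run: one invocation of the
    farm's own submitter ('farm_submit' is a placeholder) per unit of work
    discovered at run time. A portable spec can't see this scope up front,
    which is exactly the trade-off described above. A real step would
    execute each command, e.g. via subprocess.run(cmd, check=True)."""
    return [
        ["farm_submit", "--scene", scene_file, "--frame", str(frame)]
        for frame in discover_frames(scene_file)
    ]
```

The point of the sketch is only that the expansion logic lives inside the job's own instructions, tied to one farm's CLI, rather than in the portable job description.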
-
My experience has mostly been around optimizing farm operations and managing teams to scale farm output without scaling costs or manual labor. From that perspective, deferred evaluation of the full scope of the tasks in a job is risky; +1 to @jvanns's notes.

A job that keeps unpacking scope after submission looks, at a high enough level, like a long-running job. As a pattern it feels more like machine reservation (if the process unpacks locally) than batch processing. It is harder to scope (I feel) than an immutable-scope job, because with the latter at least the task count is finite and the scheduler can evaluate the maximum number of machines entangled. Depending on how mutable task/parameter scopes unpack, and when in the evaluation (say, in a previous step) they unpack, the scheduler can get into states that are extremely hard to make any predictions about. (I'm basically saying what @jvanns is saying, but worse 😄 and from an operations lens.) It becomes very difficult for infra admins to answer "when will it be done?", "is it supposed to take this long or is it sick?", and "can this be optimized?" -- key questions for production management.

Let's dive a little deeper: @jvanns, do you have a list of a few particular DCCs you had in mind when you mentioned that this seems to be where things are trending? I'd love to dive into their docs about this. Maybe there are suggested job management strategies that help us think about it. At the end of the day, there's nothing stopping a job's instructions from calling the render farm to submit more jobs. But why, I wonder?

+1 to @mwiebe and @ddneilson's replies too. I love this thread. Great topic, @jvanns!
-
Hi All! Just chiming in about dynamic jobs. We had a great discussion with the OpenJD folks about a month ago on this topic, and they pointed out that task specifications/templates can be standalone; they do not need to be embedded in a job template. I believe this is all the OpenJD spec needs in order to support dynamic jobs. Standalone task templates mean that you can submit your job template first and then, at a later time, submit a task template to be attached to the already-running job. For example, in pseudo-code:

So the onus would be on the job scheduling system to support adding OpenJD tasks to previously submitted OpenJD jobs (i.e. jobSchedAPI.submitTaskToJob()) in order to support dynamic jobs. That would be enough to handle jobs submitted by Houdini's PDG framework, which was brought up earlier in this discussion.

Cheers,
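The pseudo-code referred to in the comment above is missing from this capture. A minimal Python sketch of the flow it describes, with a stubbed scheduler client -- only `submitTaskToJob` is named in the comment; `JobSchedAPI`, `submitJob`, and all template fields are illustrative stand-ins, not part of the OpenJD spec:

```python
class JobSchedAPI:
    """Hypothetical scheduler client, standing in for a real farm's API."""

    def __init__(self):
        self.jobs = {}

    def submitJob(self, job_template):
        """Submit an OpenJD job template up front; returns a job id."""
        job_id = f"job-{len(self.jobs)}"
        self.jobs[job_id] = {"template": job_template, "tasks": []}
        return job_id

    def submitTaskToJob(self, job_id, task_template):
        """Attach a standalone OpenJD task template to an already-running
        job -- the hook a scheduler would need to support dynamic jobs."""
        self.jobs[job_id]["tasks"].append(task_template)

jobSchedAPI = JobSchedAPI()

# 1. Submit the job template first.
job_id = jobSchedAPI.submitJob({"name": "pdg_render", "steps": []})

# 2. Later, as work is discovered (e.g. by Houdini's PDG), attach tasks.
for frame in (1, 2, 3):
    jobSchedAPI.submitTaskToJob(job_id, {"name": f"render_frame_{frame}"})
```

The design point is that the job's identity and lifecycle are fixed at submission, while the scheduler-specific `submitTaskToJob` call is what grows its task list afterwards.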
-
I think one of the problems we face with a JDL is the notion of deferred or lazy evaluation of parts of the job not yet realised, i.e. a job that adjusts itself, or expands at dispatch time into more jobs or tasks -- something we see modern DCC tools and libraries doing. This makes it hard to describe the job fully upfront (especially its resource requests, dependencies, etc.). Personally I'm a fan of immutable jobs post-submission, since that makes for reliable predictability and reproducibility (for scheduling decisions, debug & diagnosis, etc.). But that's not the way (some) tools are evolving! Have you had any discussions prior to this public RFC around that?