-
Notifications
You must be signed in to change notification settings - Fork 17
Description
Copied over from aws-deadline/deadline-cloud#447
Original submitter: https://github.com/Mathieson
I need my task space to be a subset of strings in a larger string array for my project. This is not currently supported, but I've found a solution that I am pretty happy with, and more official support to streamline it would be fantastic. Here is what I've done.
I've created a folder dedicated to hosting many text files. Each text file contains a slice of the list of strings I want processed. For example...
- my_job_bundle
- my_task_values
- 0.txt
- 1.txt
- 2.txt
- my_task_values
In my template.yaml, I've defined a top-level param to specify the task_count, which will determine how many of these files I will process.
- name: MyTaskCount
type: INTMy parameter space defines a simple range...
parameterSpace:
taskParameterDefinitions:
- name: TaskFileIndex
type: INT
range: "1-{{Param.MyTaskCount}}"Then, in my script, I use Task.Param.TaskFileIndex to pull the appropriate file from that directory for the task, read its contents, and continue processing as normal.
I currently use Deadline Cloud with the Python submitter, so this workflow allows me to generate those task files dynamically in my submission script. This is extra awesome because it means I can enlist the help of more-itertools to create my ranges incredibly easily. As for local OpenJD workflow, it is also easy to generate and keep some of these files in my working area.
My hope is that this approach will become streamlined through official support, and we might be able to specify something such as...
parameterSpace:
taskParameterDefinitions:
- name: TaskFile
type: DirectoryContents
range: "{{Param.MyTaskDirectory}}"It would give you a path for each file found within the specified directory, or even...
parameterSpace:
taskParameterDefinitions:
- name: Whatever
type: TaskFileContents
- name: Another
type: TaskFileContents
combination: "({Task.Param.Whatever},{Task.Param.Another})"It could go so far as to give you the values found within said file and implicitly determine the file via a predictable folder structure matching up with the step name and taskParameter name. For example...
- my_job_bundle
- task_files
- step_name
- Whatever
- 0.txt
- 1.txt
- 2.txt
- Another
- 0.txt
- 1.txt
- 2.txt
- Whatever
- step_name
- task_files
Interested in hearing others' thoughts. Thanks!