WES: Lack of standard for handling nested files and paths in workflow_attachment
Currently, the GA4GH WES specification allows users to attach additional files (e.g., scripts, config files) via the `workflow_attachment` field in a run request. However, this mechanism lacks any formal structure for referencing nested files, file paths, or remote locations (e.g., cloud buckets or TRS URIs). This results in inconsistencies and implementation-specific behavior.
💡 Example
Let's say I want to run a workflow referenced by a TRS URI like:
trs://trs.example.org/my-wf/1.0.0/my-entry.nf
This workflow includes multiple files organized like this:
main.nf
modules/
└── helper.nf
configs/
└── params.config
To run this using WES, I’d have to:
- Resolve the TRS URI,
- Download all files manually,
- Attach them via `workflow_attachment`.
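The attach step above can be sketched as follows. This is a minimal illustration, assuming the workflow files have already been downloaded into a local directory; `collect_attachments` is a hypothetical helper, not part of any WES client. It builds a list of multipart parts in the `(field, (filename, bytes))` shape that the `requests` library accepts for its `files=` argument, using the path relative to the workflow root as the filename. Whether a given WES server honors that relative path is exactly the implementation-specific behavior this issue is about.

```python
from pathlib import Path


def collect_attachments(root: Path) -> list[tuple[str, tuple[str, bytes]]]:
    """Build multipart parts whose filenames keep each file's path relative to root."""
    parts = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Use forward slashes so "modules/helper.nf" looks the same on every OS.
            rel = path.relative_to(root).as_posix()
            parts.append(("workflow_attachment", (rel, path.read_bytes())))
    return parts
```

With the example layout, the resulting filenames would be `main.nf`, `modules/helper.nf`, and `configs/params.config`, so the server at least receives enough information to rebuild the directory structure.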
The problem: there’s no way for WES to understand or preserve the relative structure of these files. Unlike TES, which defines structured input/output fields (e.g., `path`, `type`), WES provides no such mechanism. This makes it difficult to:
- Maintain correct relative paths in the execution environment
- Attach complex workflows that depend on nested files
- Use cloud-native references (e.g., S3/GS/HTTPS URLs) instead of raw file uploads
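For comparison, here is roughly how the same three files could be described as TES-style inputs, where each entry carries a source `url`, a target `path`, and a `type`. This is only a sketch: the bucket name and the `/work` directory are made-up values, and `tes_input` is a hypothetical helper.

```python
import json

# Hypothetical bucket and container locations, for illustration only.
BUCKET = "s3://example-bucket/my-wf/1.0.0"
WORKDIR = "/work"


def tes_input(rel_path: str) -> dict:
    """Describe one file as a TES-style input with an explicit target path."""
    return {
        "url": f"{BUCKET}/{rel_path}",    # where the file lives remotely
        "path": f"{WORKDIR}/{rel_path}",  # where it should appear at execution time
        "type": "FILE",
    }


inputs = [
    tes_input(p)
    for p in ("main.nf", "modules/helper.nf", "configs/params.config")
]
print(json.dumps(inputs, indent=2))
```

Each entry states both where the file comes from and where it must land, which is the information a `workflow_attachment` upload has no standard way to express.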
✅ Desired Behavior
WES should define a standard mechanism for:
- Preserving file paths in `workflow_attachment` (e.g., `modules/helper.nf`)
- Supporting directory structures
- Accepting remote URIs in addition to raw file uploads
This will make it easier to run workflows from TRS directly and support cloud-native execution patterns more consistently across implementations.
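To make the request more concrete, here is one possible shape for a structured attachment entry, written as a small Python model. Everything in it is an assumption for discussion rather than a proposal for exact field names: `AttachmentRef`, its `path` and `url` fields, and the `ALLOWED_SCHEMES` set are all hypothetical. The validation mirrors what a server would likely need to do anyway: reject absolute paths and `..` segments, and accept only known URI schemes.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

# Assumed set of schemes; a real spec would have to define this list.
ALLOWED_SCHEMES = {"s3", "gs", "https", "trs"}


@dataclass(frozen=True)
class AttachmentRef:
    """Hypothetical attachment entry: a relative target path plus an optional remote source."""

    path: str                  # location relative to the run directory
    url: Optional[str] = None  # remote source; None means the bytes are uploaded directly

    def __post_init__(self) -> None:
        p = PurePosixPath(self.path)
        # Reject empty, absolute, or parent-escaping paths.
        if not p.parts or p.is_absolute() or ".." in p.parts:
            raise ValueError(f"path must be relative and stay inside the run directory: {self.path!r}")
        if self.url is not None and urlparse(self.url).scheme not in ALLOWED_SCHEMES:
            raise ValueError(f"unsupported URL scheme: {self.url!r}")
```

Under this sketch, `AttachmentRef("modules/helper.nf")` would describe an uploaded file, while `AttachmentRef("configs/params.config", url="gs://example-bucket/params.config")` would describe a file the server fetches itself.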