Title: Proposal: Structured Workflow File Categories in /files Endpoint or support .trsignore
Summary
The current /files endpoint in TRS returns all files associated with a workflow version, including non-essential files like .gitignore, .svg, README assets, and more. This leads to unnecessary bloat when trying to retrieve only the core files required to run the workflow.
It would be helpful to have a standard way to deal with this. Off the top of my head:
- A .trsignore mechanism (similar to .gitignore) to exclude irrelevant files.
- Structured categorization of files (e.g., core workflow files, test files, and others).
- A streamlined way to download categorized ZIP archives directly.
Problem
Calling the /tools/{id}/versions/{version_id}/files endpoint often returns all files from the repository or archive, regardless of their relevance to executing the tool/workflow. For example:
[
  {
    "path": "main.nf",
    "file_type": "PRIMARY_DESCRIPTOR"
  },
  {
    "path": ".gitignore",
    "file_type": "OTHER"
  },
  {
    "path": "assets/logo.svg",
    "file_type": "OTHER"
  },
  {
    "path": "test/test_input.csv",
    "file_type": "TEST_FILE"
  }
]
From a consumer’s perspective (e.g., a WES client or a CLI tool trying to fetch a runnable workflow), these unrelated files introduce confusion and unnecessary data transfer.
Proposed Improvements
1. .trsignore File Support
Allow tool authors to include a .trsignore file in the workflow source (like .gitignore) to explicitly list patterns of files to exclude from the /files endpoint.
Example .trsignore:
.gitignore
assets/
*.svg
docs/
This would give authors control over what gets published as part of the TRS /files endpoint.
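To make the idea concrete, here is a minimal Python sketch of how a registry might apply .trsignore patterns to a file listing. The function names (load_patterns, is_ignored, apply_trsignore) are hypothetical, and the matching is a deliberate simplification rather than full .gitignore semantics: no negation, no anchoring, and directory patterns match by directory name only.

```python
from fnmatch import fnmatch


def load_patterns(text):
    """Parse .trsignore text: one pattern per line, skipping blanks and # comments."""
    lines = (ln.strip() for ln in text.splitlines())
    return [ln for ln in lines if ln and not ln.startswith("#")]


def is_ignored(path, patterns):
    """Simplified matching (assumption): a subset of .gitignore behaviour."""
    parts = path.split("/")
    for pat in patterns:
        if pat.endswith("/"):
            # Directory pattern: ignore anything below a directory with that name.
            if pat.rstrip("/") in parts[:-1]:
                return True
        elif fnmatch(path, pat) or fnmatch(parts[-1], pat):
            # File pattern: match against the full path or just the file name.
            return True
    return False


def apply_trsignore(files, trsignore_text):
    """Return the /files entries that are not excluded by the patterns."""
    patterns = load_patterns(trsignore_text)
    return [f for f in files if not is_ignored(f["path"], patterns)]
```

Applied to the listing in the Problem section with the example .trsignore above, this would keep main.nf and test/test_input.csv and drop .gitignore and assets/logo.svg.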
2. Categorize Files in the API
Extend the /files endpoint or introduce a new one (e.g., /structured-files) to return files grouped by their usage:
{
  "core": [
    { "path": "main.nf", "file_type": "PRIMARY_DESCRIPTOR" },
    { "path": "modules/align.nf", "file_type": "SECONDARY_DESCRIPTOR" }
  ],
  "tests": [
    { "path": "tests/input.csv", "file_type": "TEST_FILE" }
  ],
  "other": [
    { "path": ".gitignore", "file_type": "OTHER" },
    { "path": "assets/logo.svg", "file_type": "OTHER" }
  ]
}
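As a rough illustration, the grouping above could be derived from the existing file_type values. The sketch below is Python; the mapping table and the categorize function are assumptions for illustration, and any file_type not listed in the mapping falls into "other".

```python
# Hypothetical mapping from file_type values to the proposed categories.
CATEGORY_BY_TYPE = {
    "PRIMARY_DESCRIPTOR": "core",
    "SECONDARY_DESCRIPTOR": "core",
    "TEST_FILE": "tests",
}


def categorize(files):
    """Group /files entries into core, tests, and other, preserving order."""
    grouped = {"core": [], "tests": [], "other": []}
    for entry in files:
        category = CATEGORY_BY_TYPE.get(entry["file_type"], "other")
        grouped[category].append(entry)
    return grouped
```

Because the grouping depends only on file_type, a client could compute it today from the flat /files response; the proposal would simply move that step into the API so every consumer gets the same answer.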
3. Download Bundles by Category
It would be helpful to support endpoints like:
/tools/.../versions/.../files/core.zip
/tools/.../versions/.../files/tests.zip
/tools/.../versions/.../files/all.zip
This would avoid having to:
- Call /files,
- Filter files manually, and
- Use /tools/.../versions/.../files/{path} N times to fetch them individually.
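The three steps above amount to something like the following Python sketch, which is also roughly what a core.zip bundle would need to contain. The build_bundle function is hypothetical, and fetch is a placeholder callable (an assumption, not part of TRS) that returns the bytes of one file given its path, standing in for the per-file request.

```python
import io
import zipfile


def build_bundle(files, wanted_types, fetch):
    """Zip the files whose file_type is in wanted_types and return the archive bytes.

    fetch(path) is a caller-supplied placeholder for the per-file download.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for entry in files:
            if entry["file_type"] in wanted_types:
                # One fetch per file: this is the N-request cost noted above.
                archive.writestr(entry["path"], fetch(entry["path"]))
    return buffer.getvalue()
```

A category endpoint would let the registry do this once on the server side instead of every client repeating it.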
Benefits
- Reduces bloat when importing workflows into other services (e.g., WES).
- Clarifies workflow structure for both humans and tools.
- Provides a better developer experience for both authors and consumers.
- Sets a standard that aligns with common practices like .gitignore.
Related Use Case
When ingesting workflows via TRS, I want to download only the minimal required files. Currently, I have to filter manually or rely on heuristics, which is error-prone.
┆Issue is synchronized with this Jira Story
┆Project Name: Zzz-ARCHIVE GA4GH tool-registry-service
┆Issue Number: TRS-72