Definition for outputs

### Problem

The current [WES response schema for outputs](https://github.com/ga4gh/workflow-execution-service-schemas/blob/03c795c0cd890b1b8735f7ad27e546008a007cf2/openapi/workflow_execution_service.openapi.yaml#L824-L826) defines outputs simply as an `object`. While this is flexible, it’s also very broad, which risks divergence among client implementations.

I understand the challenge of keeping WES general enough to support all workflow engines, but I believe the specification could benefit from stronger recommendations, or even minimal conventions, for how outputs are represented.

---

### Proposed Output Classification

In practice, I’ve found it useful to categorize outputs into three broad types:

* **`meta`**: Files produced by the engine itself, such as cache files, logs, or system information.
* **`stage`**: Intermediate artifacts needed by the engine or backend to run the workflow (e.g., `.command.run`, `.command.sh`, or cache files that downstream steps depend on).
* **`results`**: The actual outputs of the workflow that are meaningful to the user.

From a user’s perspective, all of these categories matter, but implementations should filter out noise before returning outputs (e.g., cache files in `meta` likely don’t need to be returned to clients).
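
To make the classification concrete, here is a minimal Python sketch of how an implementation might sort output paths into the three categories and drop noise. The matching rules (`STAGE_NAMES`, `META_SUFFIXES`, `NOISE_SUFFIXES`) and the function names are hypothetical and only illustrate the idea; each engine would need its own rules.

```python
from pathlib import PurePosixPath

# Hypothetical rules for illustration only; real engines need their own.
STAGE_NAMES = {".command.run", ".command.sh"}  # engine/backend run artifacts
META_SUFFIXES = {".log"}                       # engine logs, system info
NOISE_SUFFIXES = {".cache"}                    # meta files clients rarely need


def classify(relative_path: str) -> str:
    """Return 'stage', 'meta', or 'results' for a relative output path."""
    path = PurePosixPath(relative_path)
    if path.name in STAGE_NAMES:
        return "stage"
    if path.suffix in META_SUFFIXES | NOISE_SUFFIXES:
        return "meta"
    # Anything not recognized as engine-related is treated as a user result.
    return "results"


def group_outputs(paths, hide_noise=True):
    """Group paths by category, optionally skipping noisy files."""
    grouped = {"meta": [], "stage": [], "results": []}
    for p in paths:
        if hide_noise and PurePosixPath(p).suffix in NOISE_SUFFIXES:
            continue
        grouped[classify(p)].append(p)
    return grouped
```

Defaulting unknown files to `results` is a design choice in this sketch; an implementation could just as well require results to be declared explicitly.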

---

### Example Structure

Here’s a simplified version of how I’ve modeled outputs in my own use case:

```yaml
outputs:
  results:
    relative_path_of_my_results_1:
      url: "presigned S3 URL, file path (e.g. file://results/result_1), or other access method"
      checksum:
        type: sha256
        value: "<checksum_value>"
    relative_path_of_my_results_2:
      url: "presigned S3 URL, file path (e.g. file://results/result_2), or other access method"
      checksum:
        type: sha256
        value: "<checksum_value>"
  meta:
    relative_path_of_my_meta_1:
      url: "presigned S3 URL, file path (e.g. file://meta/meta_1), or other access method"
      checksum:
        type: sha256
        value: "<checksum_value>"
```

*This isn’t the exact implementation, but it conveys the idea.*
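
For illustration, an entry with the `url` and `checksum` fields shown above could be built roughly like this in Python. The helper names `sha256_of` and `output_entry` are hypothetical, and the sketch assumes the file is readable locally when the entry is created.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large outputs are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def output_entry(url, local_path):
    """Build one output entry in the shape used by the example structure."""
    return {
        "url": url,
        "checksum": {"type": "sha256", "value": sha256_of(local_path)},
    }
```

A client can verify a download by hashing it the same way and comparing the result against `checksum.value`.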

---

### Performance Considerations

This approach worked well for my use case, but running larger workflows exposed a new problem: returning a large number of results in a single API call can degrade performance.

Possible solutions:

1. **Separate endpoint** – Keep `outputs` lightweight, but add a new endpoint that returns detailed outputs (potentially with streaming support).
2. **Streaming response** – Make the existing `outputs` field itself streamable for large payloads.
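
As a rough sketch of what a streamed response might look like, the Python generator below emits one JSON document per line (newline-delimited JSON), so a server can send entries as they are produced instead of building one large payload. The wire format, the record shape, and the name `stream_outputs` are assumptions made for illustration; none of them is part of WES.

```python
import json
from typing import Dict, Iterator


def stream_outputs(outputs: Dict[str, Dict[str, dict]]) -> Iterator[str]:
    """Yield one JSON line per output entry, flattened across categories."""
    for category, entries in outputs.items():
        for relative_path, entry in entries.items():
            # Hypothetical record shape: category and path plus entry fields.
            record = {"category": category, "path": relative_path, **entry}
            yield json.dumps(record) + "\n"
```

A client would read the response line by line and parse each line with `json.loads`, which keeps memory use bounded on both sides regardless of how many outputs a run produces.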

---

### Suggestion

I propose that the WES spec:

* Recommends (or defines) a minimal schema for categorizing outputs (`meta`, `stage`, `results`).
* Defines how clients should consume outputs (e.g., URLs, checksums).
* Considers large-output workflows by supporting either a separate or streaming endpoint.

This would reduce ambiguity, improve client interoperability, and help implementers balance usability with performance.

For reference, the linked schema currently defines `outputs` as:

    outputs:
      type: object
      description: The outputs from the workflow run.
