Production-ready Common Workflow Language workflows designed for AI-agent execution via the GA4GH Workflow Execution Service (WES) API.
pa-cwl provides a curated collection of CWL v1.2 workflows for scientific data analysis. Each workflow ships with:
- agent.yaml: machine-readable instructions for AI agents (what the workflow does, what inputs it needs, how to run it)
- CWL v1.2 workflows: portable, standards-compliant workflow definitions
- WES-ready execution: tested with sapporo-wes and validated via yevis-cli
- Workflow Run RO-Crate: provenance records for every validated execution
All 16 pipelines are implemented and tested. Their functional specifications are derived from nf-core pipelines and rewritten as idiomatic CWL v1.2 (not transpiled).
| Workflow | Description | Key Tools |
|---|---|---|
| fetchngs | Fetch FASTQ from public repositories (SRA/ENA/DDBJ) | ENA API, fasterq-dump |
| rnaseq | RNA-seq quantification (4 pathways) | STAR, HISAT2, Salmon, RSEM, kallisto |
| scrnaseq | Single-cell RNA-seq (10x, Drop-seq, Smart-seq2) | STARsolo, Alevin-Fry, Kallisto/BUStools |
| rnafusion | Gene fusion detection | STAR-Fusion, Arriba, FusionCatcher, FusionInspector |
| chipseq | ChIP-seq peak calling | BWA-MEM2, MACS2, deepTools |
| atacseq | ATAC-seq chromatin accessibility | BWA-MEM2, MACS2 (--nomodel) |
| methylseq | Bisulfite-seq methylation (RRBS) | Bismark, bwa-meth |
| cutandrun | CUT&RUN/CUT&TAG peak calling | Bowtie2, MACS2, SEACR, spike-in normalization |
| sarek | Germline + somatic variant calling | BWA-MEM2, GATK4 (HC, Mutect2, BQSR), VEP |
| raredisease | Rare disease variant annotation | sarek + VEP, DeepVariant, Manta, GENMOD |
| viralrecon | Viral variant calling and consensus | BWA-MEM2, iVar, bcftools, Pangolin, Nextclade |
| ampliseq | 16S/ITS amplicon sequencing (PE+SE) | Cutadapt, DADA2, QIIME2 |
| mag | Metagenome-assembled genomes | SPAdes, MetaBAT2, MaxBin2, DAS Tool, BUSCO, GTDB-Tk |
| taxprofiler | Taxonomic profiling | Kraken2, Bracken, MetaPhlAn, Centrifuge, Krona |
| nanoseq | Nanopore long-read sequencing | minimap2, NanoPlot, medaka, Sniffles2, StringTie2 |
| hic | Hi-C chromatin conformation | Bowtie2, pairtools, cooler, HiCExplorer, cooltools |

The tools/ directory contains 124 CWL tool definitions shared across pipelines. See the pipeline roadmap for detailed feature tables and test matrices.
Start with AGENTS.md — the top-level guide for AI agents. It provides the workflow catalog, WES API essentials, and provenance protocol.
Each workflow's agent.yaml contains the detailed execution plan, input schema with resolution strategies, and resource requirements.
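
As a starting point for discovering what is runnable, the sketch below lists workflow directories that contain both a main.cwl and an agent.yaml. It assumes every workflow lives at workflows/<name>/ with its agent.yaml alongside main.cwl, which is inferred from the rnaseq example path and not confirmed by this README; the function name `list_agent_workflows` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: list workflows that ship both main.cwl and agent.yaml.
# Assumption: the layout is workflows/<name>/{main.cwl,agent.yaml};
# adjust the root or file names if the repository differs.

list_agent_workflows() {
  local root="${1:-workflows}"
  local dir
  for dir in "$root"/*/; do
    # Skip directories that lack either file (also covers an unmatched glob).
    if [ -f "${dir}main.cwl" ] && [ -f "${dir}agent.yaml" ]; then
      basename "$dir"
    fi
  done
}
```

Each name it prints can then be checked without running anything, for example with `cwltool --validate "workflows/$name/main.cwl"`.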

    # Run locally with cwltool
    cwltool workflows/rnaseq/main.cwl workflows/rnaseq/examples/star-salmon.yaml

    # Run via sapporo-wes
    # See docs/running-with-wes.md

- Phase 1 (complete): 16 core pipelines, fetchngs through hic
- Phase 2 (complete): v1.1 enhancements, 44 features across 12 pipelines
- Phase 2.5 (complete): v2.0 sarek somatic calling (Mutect2) + VEP annotation
- Phase 3 (complete): agent guide (AGENTS.md)
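
For the sapporo-wes route mentioned in the run instructions, a minimal sketch of a WES submission with curl might look like the following. The form field names and the /runs and /runs/{run_id}/status paths come from the GA4GH WES specification. Everything else is an assumption and not taken from this README: the base URL http://localhost:1122, the absence of a /ga4gh/wes/v1 path prefix, the `v1.2` version string, and the helper names `wes_submit` and `wes_status`. docs/running-with-wes.md remains the authoritative guide.

```shell
#!/usr/bin/env bash
# Sketch: submit a CWL workflow to a WES endpoint and poll its state.
# Assumption: WES_URL points at a running WES server; some deployments
# serve the API under a /ga4gh/wes/v1 prefix instead of the root.

WES_URL="${WES_URL:-http://localhost:1122}"

# wes_submit <workflow_url> <params_json_file>
# Sends the multipart run request; the JSON response carries a run_id.
# workflow_params must be JSON, so a YAML job file needs converting first.
wes_submit() {
  curl -fsS "${WES_URL}/runs" \
    -F "workflow_type=CWL" \
    -F "workflow_type_version=v1.2" \
    -F "workflow_url=$1" \
    -F "workflow_params=<$2"
}

# wes_status <run_id>
# Returns the run state, e.g. QUEUED, RUNNING, COMPLETE, EXECUTOR_ERROR.
wes_status() {
  curl -fsS "${WES_URL}/runs/$1/status"
}
```

A typical session would call `wes_submit` with a reachable URL for main.cwl and a hypothetical params.json, then pass the returned run_id to `wes_status` until the state is terminal.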