21 changes: 21 additions & 0 deletions scientific-skills/pacsomatic/LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 Beifang Niu

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
18 changes: 18 additions & 0 deletions scientific-skills/pacsomatic/README.md
@@ -0,0 +1,18 @@
# pacsomatic skill

This directory packages the nf-core/pacsomatic operator skill.

## Layout

- `SKILL.md`: skill contract and usage guidance
- `config.yaml`: baseline defaults for runtime/execution/validation behavior
- `references/`: operator guidance and troubleshooting notes
- `scripts/run_pacsomatic.py`: execution helper
- `tests/`: unit tests for helper behavior

## Run tests

```bash
cd pacsomatic
python -m unittest discover -s tests -v
```
151 changes: 151 additions & 0 deletions scientific-skills/pacsomatic/SKILL.md
@@ -0,0 +1,151 @@
---
name: pacsomatic
description: Operator toolkit for nf-core/pacsomatic matched tumor-normal workflows from BAM inputs. Use this skill when the user needs to validate run inputs, generate pacsomatic-compliant samplesheets, prepare reproducible Nextflow launch artifacts, run locally or submit to schedulers (LSF/Slurm/PBS/SGE), and triage execution failures. Triggers on requests to run pacsomatic, prepare launch commands/scripts, perform dry-run checks, or troubleshoot pipeline startup and scheduler submission errors.
license: MIT
metadata:
  skill-author: Beifang Niu
  contributors:
    - Haidong
    - Wenchao
  upstream-pipeline: https://github.com/nf-core/pacsomatic
---

# pacsomatic

## Overview

This skill provides a reproducible execution workflow for nf-core/pacsomatic, centered on a single helper entrypoint that handles validation, artifact generation, and optional execution.

Primary entrypoint:
- `scripts/run_pacsomatic.py`

The helper script:
- validates required identifiers, files, reference mode, and runtime prerequisites
- writes a pacsomatic-compatible samplesheet (`patient,sample,status,bam,pbi`)
- generates a params YAML and launch script for reproducible reruns
- supports dry-run validation and run/submit execution paths
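
As a rough illustration of the samplesheet step, the sketch below renders rows with the column order listed above. It is not the helper's actual code: the function name `render_samplesheet` and the `status` labels in the example rows are assumptions made for illustration, and the values the pipeline really expects should be checked against the upstream documentation.

```python
import csv
import io

# Column order taken from the skill description above.
SAMPLESHEET_COLUMNS = ["patient", "sample", "status", "bam", "pbi"]


def render_samplesheet(rows):
    """Return samplesheet CSV text for a list of row dicts.

    Missing optional values (for example `pbi`) are written as empty fields.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=SAMPLESHEET_COLUMNS, lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        writer.writerow({col: row.get(col, "") for col in SAMPLESHEET_COLUMNS})
    return buffer.getvalue()


# Hypothetical example rows; the status labels are placeholders, not
# confirmed pipeline values.
example = render_samplesheet([
    {"patient": "P001", "sample": "P001_T", "status": "tumor",
     "bam": "/path/to/tumor.bam"},
    {"patient": "P001", "sample": "P001_N", "status": "normal",
     "bam": "/path/to/normal.bam"},
])
print(example)
```

Rendering to a string first keeps the sketch easy to inspect during a dry run before anything is written to the output directory.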

Use this skill as the default path for pacsomatic operations. Do not bypass it with manually assembled `nextflow run nf-core/pacsomatic` commands unless the user explicitly asks for manual command construction.

## When to Use This Skill

Invoke this skill when the user asks to:
- run matched tumor-normal analysis from BAM files
- generate or fix pacsomatic samplesheet and launch artifacts
- execute locally or submit to schedulers (LSF/Slurm/PBS/SGE)
- perform dry-run validation before execution
- troubleshoot launch failures or summarize run outputs

Do not use this skill for:
- deep biological interpretation beyond run-level sanity checks
- editing pipeline internals unless explicitly requested

Typical trigger phrases:
- "run nf-core/pacsomatic for this tumor-normal pair"
- "prepare pacsomatic samplesheet and launch script"
- "do a dry run first and tell me what is missing"
- "submit pacsomatic to slurm/lsf and return the job id"
- "why did pacsomatic submission fail"

## Routing and Execution Rules

1. Always collect required run inputs first.
2. Always route through `scripts/run_pacsomatic.py` for validation and artifact generation.
3. Default to `--dry-run` when the user asks for checks/validation only.
4. Use `--run` only when the user asks to execute/submit.
5. For scheduler modes, include executor-specific resource arguments and return detected job ID when available.
6. If execution fails, report first failure point and next triage target (`.nextflow.log`, `pipeline_info`, failing task logs).
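
Rule 5 asks for the detected job ID when one is available. One possible way to detect it, sketched below, is to match the submission message each scheduler usually prints. The patterns and the name `detect_job_id` are assumptions for illustration; site-specific wrappers can change the output, and the helper's own detection logic may differ.

```python
import re

# Patterns for the usual submission messages of each scheduler.
# Site wrappers can change this output, so treat these as assumptions.
JOB_ID_PATTERNS = {
    "slurm": re.compile(r"Submitted batch job (\d+)"),
    "lsf": re.compile(r"Job <(\d+)> is submitted"),
    "sge": re.compile(r"Your job (\d+) "),
    "pbs": re.compile(r"^(\d+(?:\.\S+)?)\s*$", re.MULTILINE),
}


def detect_job_id(executor, submit_output):
    """Return the job ID found in scheduler output, or None if not detected."""
    pattern = JOB_ID_PATTERNS.get(executor)
    if pattern is None:
        # Local runs (or unknown executors) have no scheduler job ID.
        return None
    match = pattern.search(submit_output)
    return match.group(1) if match else None


print(detect_job_id("slurm", "Submitted batch job 123456\n"))
print(detect_job_id("lsf", "Job <98765> is submitted to queue <normal>.\n"))
```

Returning `None` instead of raising keeps the "when available" wording of the rule: a missing ID is reported as absent rather than treated as a failure.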

## Inputs Required

Required:
- tumor BAM path
- normal BAM path
- patient ID
- tumor sample ID
- normal sample ID
- output directory
- exactly one reference mode: `--fasta` or `--genome`

Optional:
- profile, resources, scheduler account/queue
- pipeline version (`-r`)
- params file, resume/report/dag flags
- `--dry-run` and/or `--run`
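
The "exactly one reference mode" requirement maps naturally onto a mutually exclusive argument group. The sketch below uses the flag names from this document, but the parser itself (`build_parser`, the `prog` name) is a hypothetical reconstruction and may not match how `scripts/run_pacsomatic.py` is actually written.

```python
import argparse


def build_parser():
    """Build a minimal parser covering the required inputs listed above."""
    parser = argparse.ArgumentParser(prog="run_pacsomatic_sketch")
    for flag in ("--tumor-bam", "--normal-bam", "--patient-id",
                 "--tumor-sample-id", "--normal-sample-id", "--outdir"):
        parser.add_argument(flag, required=True)

    # Exactly one of --fasta / --genome must be supplied.
    reference = parser.add_mutually_exclusive_group(required=True)
    reference.add_argument("--fasta")
    reference.add_argument("--genome")

    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--run", action="store_true")
    return parser


args = build_parser().parse_args([
    "--tumor-bam", "/path/to/tumor.bam",
    "--normal-bam", "/path/to/normal.bam",
    "--patient-id", "P001",
    "--tumor-sample-id", "P001_T",
    "--normal-sample-id", "P001_N",
    "--outdir", "/path/to/output",
    "--genome", "GRCh38",
    "--dry-run",
])
print(args.genome, args.fasta, args.dry_run)
```

With this arrangement, passing both `--fasta` and `--genome`, or neither, makes `argparse` exit with a usage error before any validation or artifact generation starts.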

## Workflow

1. Validate identity and input constraints.
2. Validate required local paths (BAM, optional PBI, optional FASTA).
3. Resolve runtime and dependency checks.
4. Build samplesheet and generated params YAML.
5. Generate launch script for selected executor.
6. If `--dry-run` and not `--run`, stop after artifact generation.
7. If `--run`, execute locally or submit to scheduler.
8. Return command/script path, validation status, and job ID (if detected).
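
The flag handling in steps 6 and 7 can be sketched as a small planning function. The stage names and `plan_stages` are hypothetical, and the behavior when neither flag is given is an assumption (treated here as a dry run), since the workflow above does not specify it.

```python
# Stages that always run, mirroring workflow steps 1 to 5 (names are illustrative).
ARTIFACT_STAGES = (
    "validate_inputs",
    "check_runtime",
    "write_samplesheet_and_params",
    "write_launch_script",
)


def plan_stages(dry_run=False, run=False):
    """Return (run_type, stages) for a given flag combination.

    `dry_run` is accepted to mirror the CLI; only `run` changes the plan.
    """
    stages = list(ARTIFACT_STAGES)
    if run:
        # --run wins even when --dry-run is also set (steps 6 and 7).
        stages.append("execute_or_submit")
        return "run", stages
    # Assumption: with neither flag, behave like a dry run.
    return "dry-run", stages


print(plan_stages(dry_run=True))
print(plan_stages(dry_run=True, run=True))
```

Keeping artifact generation identical in both modes means a dry run produces the same samplesheet, params file, and launch script that a later `--run` would use.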

## Agent Response Contract

Every response after invocation should include:
- exact command used or generated script path
- confirmation that validation checks ran
- run type (`dry-run` vs `run`)
- scheduler job ID when available
- one concrete next step for validation/triage
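
One way to keep responses consistent with this contract is to collect the five items in a small structure and render them in a fixed order. The class `RunReport` and its field names below are hypothetical and only illustrate the shape of such a response.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunReport:
    """Hypothetical container for the items the response contract lists."""
    command_or_script: str
    validation_ran: bool
    run_type: str  # "dry-run" or "run"
    next_step: str
    job_id: Optional[str] = None  # only set when a scheduler returned one

    def render(self):
        lines = [
            f"Command/script: {self.command_or_script}",
            f"Validation checks ran: {'yes' if self.validation_ran else 'no'}",
            f"Run type: {self.run_type}",
        ]
        if self.job_id:
            lines.append(f"Job ID: {self.job_id}")
        lines.append(f"Next step: {self.next_step}")
        return "\n".join(lines)


report = RunReport(
    command_or_script="/path/to/output/launch_script.sh",  # placeholder path
    validation_ran=True,
    run_type="dry-run",
    next_step="Review the generated samplesheet, then rerun with --run.",
)
print(report.render())
```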

## Quick Start

Dry run:

```bash
python scripts/run_pacsomatic.py \
  --tumor-bam /path/to/tumor.bam \
  --normal-bam /path/to/normal.bam \
  --patient-id P001 \
  --tumor-sample-id P001_T \
  --normal-sample-id P001_N \
  --outdir /path/to/output \
  --genome GRCh38 \
  --profile singularity,sanger \
  --dry-run
```

Scheduler execution example (Slurm):

```bash
python scripts/run_pacsomatic.py \
  --tumor-bam /path/to/tumor.bam \
  --normal-bam /path/to/normal.bam \
  --patient-id P001 \
  --tumor-sample-id P001_T \
  --normal-sample-id P001_N \
  --outdir /path/to/output \
  --genome GRCh38 \
  --profile singularity,sanger \
  --executor slurm \
  --queue compute \
  --project my_account \
  --cpus 16 \
  --memory-gb 64 \
  --walltime 48:00 \
  --run
```

## Configuration

Use `config.yaml` as the baseline for profile/executor/runtime defaults. Override at invocation time when user requirements differ.
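
A minimal sketch of the "baseline plus overrides" idea follows. The default values are copied from the `execution` block of this skill's `config.yaml`; the function `resolve_execution` and the convention that `None` means "not specified" are assumptions for illustration, not the helper's documented behavior.

```python
# Execution defaults copied from the baseline config.yaml in this skill.
BASELINE_EXECUTION = {
    "profile": "singularity,sanger",
    "executor": "local",
    "cpus": 16,
    "memory_gb": 64,
    "walltime": "48:00",
}


def resolve_execution(overrides):
    """Overlay invocation-time values on the baseline.

    Keys whose value is None are treated as "not specified" and keep
    the baseline value.
    """
    resolved = dict(BASELINE_EXECUTION)
    for key, value in overrides.items():
        if value is not None:
            resolved[key] = value
    return resolved


settings = resolve_execution({"executor": "slurm", "cpus": None, "memory_gb": 128})
print(settings)
```

Copying the baseline before applying overrides leaves the defaults untouched, so repeated invocations start from the same values.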

## Testing

Run unit tests from skill root:

```bash
python -m unittest discover -s tests -v
```

## References

- `references/agent-playbook.md`
- `references/config-and-output.md`
- `references/pacsomatic_guide.md`
- `scripts/run_pacsomatic.py`
42 changes: 42 additions & 0 deletions scientific-skills/pacsomatic/config.yaml
@@ -0,0 +1,42 @@
# Baseline configuration for nf-core/pacsomatic skill usage.
# These values are intended as defaults for operators and agents.

pipeline:
  name: nf-core/pacsomatic
  repo_url: https://github.com/nf-core/pacsomatic.git
  default_version: ""

runtime:
  conda_env: pacsomatic-nextflow
  use_current_path: false
  create_conda_env: false
  conda_env_file: ""

execution:
  profile: singularity,sanger
  executor: local # local | none | lsf | slurm | pbs | sge
  cpus: 16
  memory_gb: 64
  walltime: "48:00"
  queue: ""
  project: ""
  job_name: pacsomatic

outputs:
  samplesheet_name: samplesheet.csv
  generated_params_name: pacsomatic.params.generated.yaml
  launch_script_suffix_local: local
  default_workdir_name: work

validation:
  enforce_no_spaces_in_ids: true
  require_exactly_one_reference_mode: true
  warn_if_bam_index_missing: true
  min_java_major: 17

agent_response:
  include_command_or_script_path: true
  include_validation_confirmation: true
  include_run_type: true
  include_job_id_if_available: true
  include_next_triage_step: true
73 changes: 73 additions & 0 deletions scientific-skills/pacsomatic/references/agent-playbook.md
@@ -0,0 +1,73 @@
# Agent Playbook

## What The User Needs To Provide

Users only need to provide run inputs; they do not need to know pipeline internals:

- tumor BAM path
- normal BAM path
- reference input (`--fasta` path or `--genome` key)
- output directory

Optional:

- sample metadata IDs
- executor/resource preferences
- optional `pbi` paths

No repository checkout directory is required for this skill.

Use this sequence when helping a user run nf-core/pacsomatic:

1. Collect required inputs.
   - tumor BAM
   - normal BAM
   - patient ID
   - tumor sample ID
   - normal sample ID
   - output directory
   - one reference mode: `--fasta` or `--genome`
2. Validate naming rules.
   Patient/sample identifiers cannot contain spaces.
3. Validate local input paths.
   Local BAMs must exist. `pbi` is optional, but if provided the file must exist.
4. Start with a dry run when uncertain.
   Use `--dry-run` to validate assumptions and generate artifacts without scheduling.
5. Launch when requested.
   Use `--run` with a selected `--executor` (`local`, `lsf`, `slurm`, `pbs`, or `sge`).
6. After submission.
   Report generated samplesheet path, script path, printed run command, and detected job ID if present.
7. If pipeline fails later.
   Inspect launcher logs first, then Nextflow report and DAG outputs.
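
The naming and path checks in steps 2 and 3 could look roughly like the sketch below. `validate_inputs` and its messages are hypothetical; the helper script performs its own validation and may apply additional rules.

```python
from pathlib import Path


def validate_inputs(ids, bam_paths, pbi_paths=()):
    """Return a list of human-readable problems; an empty list means OK.

    ids: mapping of label -> identifier (patient and sample IDs).
    bam_paths: local BAM paths that must exist.
    pbi_paths: optional PBI paths; each one given must exist.
    """
    problems = []
    for label, value in ids.items():
        if not value:
            problems.append(f"{label} is empty")
        elif any(ch.isspace() for ch in value):
            problems.append(f"{label} contains whitespace: {value!r}")
    for path in bam_paths:
        if not Path(path).is_file():
            problems.append(f"BAM not found: {path}")
    for path in pbi_paths:
        if not Path(path).is_file():
            problems.append(f"PBI provided but not found: {path}")
    return problems


issues = validate_inputs(
    {"patient_id": "P 001", "tumor_sample_id": "P001_T"},
    ["/nonexistent/dir/tumor.bam"],
)
for issue in issues:
    print(issue)
```

Collecting every problem before reporting lets a dry run list everything that is missing in one pass instead of stopping at the first failure.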

Recommended helper command:

```bash
python .github/skills/pacsomatic/scripts/run_pacsomatic.py \
  --tumor-bam /path/to/tumor.bam \
  --normal-bam /path/to/normal.bam \
  --patient-id P001 \
  --tumor-sample-id P001_T \
  --normal-sample-id P001_N \
  --outdir /path/to/output \
  --genome GRCh38 \
  --profile singularity,sanger \
  --executor local \
  --dry-run
```

Launch command variant:

```bash
python .github/skills/pacsomatic/scripts/run_pacsomatic.py \
  --tumor-bam /path/to/tumor.bam \
  --normal-bam /path/to/normal.bam \
  --patient-id P001 \
  --tumor-sample-id P001_T \
  --normal-sample-id P001_N \
  --outdir /path/to/output \
  --fasta /path/to/reference.fa \
  --profile singularity,sanger \
  --executor lsf \
  --run
```