sdrf-skills

Turn Claude Code, Cursor, OpenAI Codex, Gemini CLI, or OpenCode into an expert proteomics SDRF annotator.

Pick a dataset → The agent fetches PRIDE + paper → You review a validated SDRF.

sdrf-skills is a set of structured skills that give AI assistants expert-level capabilities for planning, annotating, validating, and improving proteomics metadata in the SDRF format.

Workflow

     SETUP             PLAN             ANNOTATE          VALIDATE           REFINE             SHARE
 ┌──────────┐     ┌──────────┐     ┌──────────┐     ┌──────────┐     ┌──────────┐     ┌──────────┐
 │  Conda   │     │ Templates│     │   PXD    │     │ Columns  │     │  Score   │     │ Convert  │
 │   Pip    │────▶│ Strategy │────▶│  PRIDE   │────▶│   OLS    │────▶│  AutoFix │────▶│   PR     │
 │  Tools   │     │  Layers  │     │  Paper   │     │  Rules   │     │ Raw scan │     │ Pipeline │
 └──────────┘     └──────────┘     └──────────┘     └──────────┘     └──────────┘     └──────────┘
  /sdrf:setup   /sdrf:brainstorm   /sdrf:annotate   /sdrf:validate   /sdrf:improve   /sdrf:contribute
                /sdrf:templates                                      /sdrf:fix         /sdrf:convert
                                                                     /sdrf:review
                                                                     /sdrf:techrefine

                  ┌──────────┐     ┌──────────┐     ┌──────────┐     ┌──────────┐
                  │  Format  │     │ Ontology │     │  Plain   │     │  Batch   │
                  │   Spec   │     │  Lookup  │     │   Lang   │     │ Confound │
                  │  Rules   │     │  Verify  │     │ Concepts │     │ Replic.  │
                  └──────────┘     └──────────┘     └──────────┘     └──────────┘
                 /sdrf:knowledge   /sdrf:terms     /sdrf:explain     /sdrf:design

What it does

Instead of an AI guessing at ontology terms or SDRF rules, these skills teach it exactly how to annotate proteomics datasets — using real tools (OLS, PRIDE, PubMed) guided by the methodology of experienced annotators.

The SDRF specification data (column definitions, templates) lives in a git submodule and is read at runtime, so the skills pick up spec changes as soon as the submodule is updated.

Available skills

All 16 skills are under the sdrf: namespace. In Claude Code, type /sdrf: and autocomplete will show them all.

Skill	What it does
`/sdrf:setup`	Install dependencies (parse_sdrf, techsdrf) — conda or pip guided setup
`/sdrf:knowledge`	Ask about SDRF format, column rules, ontology mappings, reserved words
`/sdrf:templates`	Ask about templates, select templates, understand layers and selection rules
`/sdrf:annotate`	Full annotation workflow: PXD → PRIDE + paper → draft SDRF → validate
`/sdrf:validate`	Systematic validation against templates + ontology checking via OLS
`/sdrf:improve`	Quality analysis: specificity, completeness, consistency, score
`/sdrf:fix`	Auto-fix common errors (UNIMOD swaps, case, format, artifacts)
`/sdrf:terms`	Find and verify ontology terms for any SDRF column
`/sdrf:brainstorm`	Plan metadata strategy before creating an SDRF
`/sdrf:review`	Comprehensive quality review with cross-reference to paper + PRIDE
`/sdrf:explain`	Explain any column, error, or concept in plain language
`/sdrf:convert`	Choose and configure analysis pipelines from SDRF
`/sdrf:design`	Detect batch effects, confounders, replication issues
`/sdrf:contribute`	Contribute annotated SDRF back to sdrf-annotated-datasets via PR
`/sdrf:techrefine`	Verify/refine technical metadata from raw files via techsdrf
`/sdrf:cellline`	Translate Cellosaurus records into SDRF cell-line columns (organism, disease, sampling site, sex, ancestry)

Installation

1. Clone with submodules (required)

The SDRF specification data is included as a git submodule. You must initialize it:

# Clone with submodules:
git clone --recurse-submodules https://github.com/bigbio/sdrf-skills

# Or if already cloned without submodules:
cd sdrf-skills
git submodule update --init --recursive

To update the spec to the latest version:

git submodule update --remote --recursive
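
To confirm the submodule was actually initialized, you can inspect the output of git submodule status, which prefixes the line of an uninitialized submodule with "-". The following is a minimal sketch, not part of this repository: the function name uninitialized_submodules is hypothetical, and it assumes submodule paths contain no spaces.

```shell
# Print the paths of submodules that are not initialized (hypothetical helper).
# Reads `git submodule status` output on stdin; git prefixes the line of an
# uninitialized submodule with "-".
# Assumption: submodule paths contain no spaces.
uninitialized_submodules() {
  awk '/^-/ { print $2 }'
}

# Usage, from the repository root (prints nothing when every submodule is initialized):
#   git submodule status --recursive | uninitialized_submodules
```

If the function prints a path, run git submodule update --init --recursive and check again.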

2. Install dependencies (recommended)

Install the deterministic helper tools used by the skills. Conda is recommended (includes thermorawfileparser for Thermo .raw files):

# Recommended (conda):
conda env create -f environment.yml
conda activate sdrf-skills

# Or pip:
pip install -r requirements.txt

thermorawfileparser, which is needed for Thermo .raw files, is not on PyPI. If you installed with pip, add it with conda: conda install -c bioconda thermorawfileparser.
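
After installing, a quick check that the helper tools are reachable can save a failed skill run later. The sketch below is an illustration rather than something shipped with this repository: missing_tools is a hypothetical helper, and it assumes the dependencies expose executables named parse_sdrf and techsdrf, as the /sdrf:setup description suggests.

```shell
# Print each named command that is not found on PATH (hypothetical helper).
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Usage, after activating the environment (prints nothing if both are found).
# Assumption: the executables are named parse_sdrf and techsdrf.
#   missing_tools parse_sdrf techsdrf
```

Any name it prints points to a dependency that did not install or an environment that is not activated.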