sanjaysgk/ipg

Introduction

sanjaysgk/ipg is a bioinformatics pipeline for immunopeptidogenomics: it builds a personalised cryptic peptide search database from RNA-seq, then searches it against immunopeptidomics MS/MS data to identify non-canonical (cryptic) peptides. It implements the method of Scull et al. (2021) as a reproducible nf-core-style Nextflow pipeline.

The pipeline runs in independent steps selected with --step:

--step db_construct — RNA-seq → cryptic peptide FASTA

Align reads with two-pass STAR and infer strandedness with RSeQC.
Assemble transcripts with StringTie and reconcile with the reference annotation via gffcompare.
Prepare BAMs following GATK4 RNA-seq best practice (MarkDuplicates → SplitNCigarReads → two-pass BQSR).
Call somatic variants with Mutect2 in tumour-only mode.
Build the cryptic peptide database with the IPG custom C tools (curate_vcf, alt_liftover, triple_translate, squish).
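
To give a feel for the translation stage, here is a minimal Python sketch of three-frame forward translation of a transcript sequence. It is an illustration only: the IPG tools themselves are written in C, and this sketch assumes (from its name) that triple_translate translates each sequence in the three forward frames using the standard genetic code. The function names `translate` and `three_frame_translate` are hypothetical and do not come from the pipeline.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, frame=0):
    """Translate a nucleotide sequence starting at offset `frame` (0, 1 or 2).

    Stop codons are emitted as '*'; codons containing ambiguous bases as 'X'.
    Trailing bases that do not fill a codon are ignored.
    """
    seq = seq.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frame_translate(seq):
    """Return the translations of the three forward reading frames."""
    return [translate(seq, frame) for frame in range(3)]


if __name__ == "__main__":
    for frame, protein in enumerate(three_frame_translate("ATGGCCTAA")):
        print(f"frame {frame}: {protein}")
```

How the real tool splits translations at stop codons, and which minimum length it keeps, is not covered here; see the upstream kescull/immunopeptidogenomics repository for that.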

--step ms_search — MS/MS → identified cryptic peptides

Search each sample's spectra against its cryptic database with MSFragger, Comet and Sage.
Rescore PSMs with MS2Rescore, estimate FDR with mokapot, and integrate the engines' results at a configurable peptide-level FDR (default 1%).
Optionally run a de novo discovery lane (--run_denovo, InstaNovo), which predicts peptides directly from spectra and classifies them as canonical, cryptic or novel.
Optionally run immunoinformatics analyses (HLA binding, motif clustering, quantification) and generate a cryptic-discovery report.
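
As background on what a peptide-level FDR threshold means, the sketch below computes target-decoy q-values for a list of scored peptides and keeps the targets at or below the threshold. It is a generic textbook estimator (decoys / targets), not the pipeline's or mokapot's actual implementation, and it does not show how results from the three engines are combined. The function names and the input tuple layout are assumptions made for this example.

```python
def peptide_qvalues(scored):
    """Compute q-values from (peptide, score, is_decoy) tuples.

    Higher scores are assumed to be better. Returns (peptide, q_value)
    pairs for target peptides only, ordered by descending score.
    """
    ranked = sorted(scored, key=lambda row: row[1], reverse=True)

    # Running FDR estimate at each rank: decoys seen / targets seen.
    fdrs = []
    targets = decoys = 0
    for _, _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the smallest FDR at this rank or any lower-scoring rank.
    qvalues = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvalues.append(min(running_min, 1.0))
    qvalues.reverse()

    return [
        (peptide, q)
        for (peptide, _, is_decoy), q in zip(ranked, qvalues)
        if not is_decoy
    ]


def accept_peptides(scored, fdr=0.01):
    """Return target peptides whose q-value is at or below `fdr`."""
    return [peptide for peptide, q in peptide_qvalues(scored) if q <= fdr]


if __name__ == "__main__":
    example = [
        ("PEPTIDEA", 10.0, False),
        ("PEPTIDEB", 9.0, False),
        ("DECOY1", 8.0, True),
        ("PEPTIDEC", 7.0, False),
    ]
    print(accept_peptides(example, fdr=0.01))
```

With the default 1% threshold only peptides ranked above the first decoy survive in this toy input; loosening the threshold admits lower-scoring targets.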

Usage

Note

New to Nextflow? See the nf-core installation docs. The repository ships a pixi environment that pins every tool; install it with pixi install. If you do not have pixi yet, install it first with curl -fsSL https://pixi.sh/install.sh | bash.

Prepare a samplesheet:

samplesheet.csv

sample,fastq_1,fastq_2,strandedness
SAMPLE,/path/to/R1.fastq.gz,/path/to/R2.fastq.gz,reverse
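
Before launching a run it can help to sanity-check the samplesheet. The following Python sketch checks for the four columns shown above and flags obviously incomplete rows. The pipeline performs its own validation, so this is only a convenience: the helper name `check_samplesheet` is hypothetical, and the set of accepted strandedness values is an assumption borrowed from common nf-core conventions rather than taken from this repository.

```python
import csv
import io

REQUIRED_COLUMNS = ["sample", "fastq_1", "fastq_2", "strandedness"]
# Assumption: accepted values follow common nf-core conventions.
ASSUMED_STRANDEDNESS = {"forward", "reverse", "unstranded", "auto"}


def check_samplesheet(handle):
    """Return a list of human-readable problems found in a samplesheet."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        return [f"missing column(s): {', '.join(missing)}"]

    problems = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        if not (row.get("sample") or "").strip():
            problems.append(f"line {line_no}: empty sample name")
        if not (row.get("fastq_1") or "").strip():
            problems.append(f"line {line_no}: empty fastq_1")
        strand = (row.get("strandedness") or "").strip()
        if strand not in ASSUMED_STRANDEDNESS:
            problems.append(f"line {line_no}: unexpected strandedness {strand!r}")
    return problems


if __name__ == "__main__":
    sheet = io.StringIO(
        "sample,fastq_1,fastq_2,strandedness\n"
        "SAMPLE,/path/to/R1.fastq.gz,/path/to/R2.fastq.gz,reverse\n"
    )
    print(check_samplesheet(sheet) or "samplesheet looks OK")
```

For a file on disk, pass an open handle, for example `check_samplesheet(open("samplesheet.csv", newline=""))`.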

Build the cryptic peptide database:

pixi run nextflow run . \
    -profile singularity \
    --step db_construct \
    --input samplesheet.csv \
    --outdir results \
    -params-file reference.yaml

Warning

Provide parameters via the CLI or a -params-file, not via a custom -c config file.

To try the pipeline on the bundled chr22 test data, run with -profile test,pixi. For the full reference-genome parameters, the MS-search samplesheet, the --step ms_search and --step post_ms workflows, and all options, see docs/usage.md.

Pipeline output

Database construction: results/db_construct/<sample>/<sample>_cryptic.fasta
MS search: the integrated peptide table under results/ms_search/<sample>/
A MultiQC report and Nextflow execution reports under results/pipeline_info/

See docs/output.md for the full output description.
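
A quick way to confirm that database construction produced something usable is to count the entries in the cryptic FASTA. The Python sketch below does this by counting header lines. The function name is hypothetical, and the example path simply substitutes the sample name SAMPLE from the samplesheet above into the output pattern.

```python
def count_fasta_entries(lines):
    """Count FASTA records in an iterable of lines (one '>' header per record)."""
    return sum(1 for line in lines if line.startswith(">"))


if __name__ == "__main__":
    # Example path, following results/db_construct/<sample>/<sample>_cryptic.fasta
    fasta_path = "results/db_construct/SAMPLE/SAMPLE_cryptic.fasta"
    try:
        with open(fasta_path) as handle:
            print(f"{fasta_path}: {count_fasta_entries(handle)} entries")
    except FileNotFoundError:
        print(f"{fasta_path} not found; run --step db_construct first")
```

An empty or missing file at this point usually means the db_construct step did not finish, so check the reports under results/pipeline_info/ before moving on to the MS search.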

Profiles

| Profile | Purpose |
| --- | --- |
| `pixi` | Run every tool from the local pixi env (no containers) |
| `singularity` / `docker` | Pull biocontainers (HPC / cloud) |
| `monash` | SLURM on the Monash M3 `comp` partition (`xy86` account) |
| `test` | Use the bundled chr22 test data |

Credits

sanjaysgk/ipg was written by Sanjay SG Krishna (@sanjaysgk), Li Lab, Monash University, porting the immunopeptidogenomics method and custom C tools developed by Kate Scull (Purcell Lab; kescull/immunopeptidogenomics). The work was supervised by Chen Li (Li Lab) and Anthony W. Purcell (Purcell Lab), Monash University.

Contributions and support

Contributions and bug reports are welcome — please open a GitHub issue or a pull request.

Citations

If you use sanjaysgk/ipg, please cite the method paper:

Scull KE, Pandey K, Ramarathinam SH, Purcell AW. Immunopeptidogenomics: harnessing RNA-seq to illuminate the dark immunopeptidome. Mol Cell Proteomics. 2021;20:100143. doi:10.1016/j.mcpro.2021.100143

A reference list for every tool in the pipeline is in CITATIONS.md. This pipeline is built with Nextflow and the nf-core framework (Ewels et al., Nat Biotechnol. 2020, doi:10.1038/s41587-020-0439-x).

License

MIT — see LICENSE.

