|
2 | 2 |
|
3 | 3 | Pangenome-based sequence placement, alignment, and genotyping. |
4 | 4 |
|
5 | | -Given a pangenome (in [PanMAN](https://github.com/TurakhiaLab/panman) format) and sequencing reads, panmap places reads onto the pangenome tree, aligns them to the closest reference, and calls variants. |
| 5 | +[Documentation](https://amkram.github.io/panmap/) | [Preprint](https://www.biorxiv.org/content/10.64898/2026.03.29.711974v1) |
6 | 6 |
|
7 | | -### Modes |
8 | | - |
9 | | -- **Single-sample** (default): Places reads from a single sample, aligns to the best-matching reference, and genotypes variants (BAM + VCF output). |
10 | | -- **Metagenomic** (`--meta`): Scores reads from a mixture sample against every node in the PanMAN, and uses the scoring information to estimate haplotype abundance or directly assign reads to nodes. |
11 | | - |
12 | | - |
13 | | -### Run with Docker |
| 7 | +## Install |
14 | 8 |
|
15 | 9 | ```bash |
16 | | -docker pull alanalohaucsc/panmap:latest |
17 | | -docker run --rm alanalohaucsc/panmap:latest |
18 | | -``` |
19 | | - |
20 | | -See the [documentation](https://amkram.github.io/panmap/) for building from source. |
21 | | - |
22 | | -## Usage |
23 | | - |
24 | | -``` |
25 | | -panmap <panman> [reads1.fq] [reads2.fq] [options] |
| 10 | +conda install -c bioconda panmap |
26 | 11 | ``` |
27 | 12 |
|
28 | | -### Pipeline stages |
29 | | - |
30 | | -By default, panmap runs through genotyping. Use `--stop` to control how far the pipeline runs: |
31 | | - |
32 | | -| Stage | Output | |
33 | | -|------------|-------------------------| |
34 | | -| `index` | `.idx` (seed index) | |
35 | | -| `place` | `.placement.tsv` | |
36 | | -| `align` | `.bam` | |
37 | | -| `genotype` | `.vcf` | |
38 | | - |
39 | | -### Key options |
40 | | - |
41 | | -``` |
42 | | --o, --output <prefix> Output file prefix |
43 | | --t, --threads <N> Number of threads (default: 1) |
44 | | --a, --aligner <str> minimap2 (default) or bwa |
45 | | ---stop <stage> Stop after: index, place, align, genotype |
46 | | ---meta Metagenomic mode |
47 | | --k, --kmer <19> Syncmer k |
48 | | --s, --syncmer <8> Syncmer s |
49 | | ---refine Alignment-based refinement of top candidates |
50 | | ---force-leaf Restrict placement to leaf nodes |
51 | | --v, --verbose Verbose output |
52 | | --q, --quiet Errors only |
53 | | -``` |
54 | | - |
55 | | -Run `panmap --help` for the full option list. |
56 | | - |
57 | | -### Example usage |
| 13 | +Or with Docker: |
58 | 14 |
|
59 | | -**Place and genotype paired-end reads:** |
60 | 15 | ```bash |
61 | | -panmap ref.panman reads_R1.fq reads_R2.fq --stop genotype -t 8 -o sample |
| 16 | +docker pull alanalohaucsc/panmap:latest |
62 | 17 | ``` |
63 | 18 |
|
64 | | -**Metagenomic mode, estimating SARS-CoV-2 lineage abundances:** |
65 | | - |
66 | | -First step is to build an index for metagenomics mode: |
| 19 | +## Quick start |
67 | 20 |
|
68 | 21 | ```bash |
69 | | -mkdir example_run && cd example_run |
70 | | - |
71 | | -panmap ../examples/data/sars_20000_twilight_dipper.panman \ |
72 | | - --index-mgsr sars_20000_twilight_dipper.idx |
73 | | -``` |
74 | | - |
75 | | -Then run panmap with the `--meta` option: |
| 22 | +# Place and genotype paired-end reads |
| 23 | +panmap ref.panman reads_R1.fq reads_R2.fq --stop genotype -t 8 -o sample |
76 | 24 |
|
77 | | -```bash |
78 | | -panmap ../examples/data/sars_20000_twilight_dipper.panman \ |
79 | | - ../examples/data/sars20000_5hap_0snp-a_200000_rep0_R1.fastq.gz \ |
80 | | - ../examples/data/sars20000_5hap_0snp-a_200000_rep0_R2.fastq.gz \ |
81 | | - --meta --index sars_20000_twilight_dipper.idx \ |
82 | | - --threads 8 --em-delta-threshold 0.00001 |
| 25 | +# Metagenomic abundance estimation |
| 26 | +panmap ref.panman reads.fq --meta --index ref.idx -t 8 -o sample |
83 | 27 | ``` |
84 | 28 |
|
85 | | -Reads used above were simulated shotgun-sequencing reads of SARS-CoV-2 mixtures. For wastewater samples, refer to README |
86 | | -in [examples/wastewater](examples/wastewater) for more details. |
| 29 | +## Pipeline |
87 | 30 |
|
88 | | -**Metagenomic mode, filter and assign reads:** |
89 | | - |
90 | | -We first build an index for the vertebrate mitochondrial PanMAN. We recommend using the `-k 15 -s 8 -l 1` seed parameters for aeDNA reads. |
91 | | - |
92 | | -```bash |
93 | | -mkdir example_run && cd example_run |
94 | | - |
95 | | -panmap ../examples/data/v_mtdna.panman \ |
96 | | - --index-mgsr v_mtdna.idx -k 15 -s 8 -l 1 |
97 | 31 | ``` |
98 | | - |
99 | | -Then run panmap with the `--filter-and-assign` option: |
100 | | - |
101 | | -```bash |
102 | | -panmap ../examples/data/v_mtdna.panman \ |
103 | | - ../examples/data/subsampled.fastq.gz \ |
104 | | - --meta -i v_mtdna.idx \ |
105 | | - --filter-and-assign --discard 0.6 --dust 5 \ |
106 | | - --taxonomic-metadata ../examples/data/v_mtdna.meta.tsv \ |
107 | | - -t 4 --breadth-ratio --output subsampled |
| 32 | +index --> place --> align --> genotype |
| 33 | + .idx .placement.tsv .bam .vcf |
108 | 34 | ``` |
109 | 35 |
|
110 | | -This outputs 3 files: |
111 | | - |
112 | | -`.mgsr.assignedReads.fastq` file containing the reads that were assigned |
| 36 | +By default, panmap stops after placement. Use `--stop` to run further stages. |
113 | 37 |
|
114 | | -`.mgsr.assignedReads.out` file containing the number of reads assigned to each node and the indices of the reads assigned, with respect to the the `.mgsr.assignedReads.fastq` file |
| 38 | +## Modes |
115 | 39 |
|
116 | | -`.mgsr.assignedReadsLCANode.out` file containing the number of reads assigned to the LCA node and the indices of the reads assigned. *As reads may be assigned to multiple nodes, the LCA node of a read is the LCA of all the nodes it was assigned to.* |
| 40 | +- **Single-sample** (default): Place reads, align to closest reference, call variants (BAM + VCF) |
| 41 | +- **Metagenomic** (`--meta`): Estimate haplotype abundance or assign reads to pangenome nodes |
117 | 42 |
|
118 | | -### Building from source |
| 43 | +## Links |
119 | 44 |
|
120 | | -See the [installation docs](https://amkram.github.io/panmap/installation/) for dependencies and build instructions. |
| 45 | +- [Full documentation](https://amkram.github.io/panmap/) |
| 46 | +- [Installation options](https://amkram.github.io/panmap/installation/) |
| 47 | +- [CLI reference](https://amkram.github.io/panmap/cli-reference/) |
| 48 | +- [PanMAN format](https://github.com/TurakhiaLab/panman) |
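
The stage-to-output pairing shown in the Pipeline section (`index` → `.idx`, `place` → `.placement.tsv`, `align` → `.bam`, `genotype` → `.vcf`) can be sketched as a small shell helper. This is only an illustration: `expected_outputs` is a hypothetical function, not part of panmap, and it assumes that stopping at a stage also leaves the outputs of every earlier stage in place, which the README does not state.

```bash
# Hypothetical helper (not part of panmap): print the output suffixes expected
# when the pipeline stops after the given stage. The stage/suffix pairs come
# from the README; cumulative outputs are an assumption.
expected_outputs() {
  stop="$1"
  out=""
  for pair in index:.idx place:.placement.tsv align:.bam genotype:.vcf; do
    stage="${pair%%:*}"    # text before the first colon
    suffix="${pair#*:}"    # text after the first colon
    out="${out:+$out }$suffix"
    if [ "$stage" = "$stop" ]; then
      printf '%s\n' "$out"
      return 0
    fi
  done
  printf 'unknown stage: %s\n' "$stop" >&2
  return 1
}
```

Calling `expected_outputs align` lists the suffixes for `index`, `place`, and `align` in order; an unrecognized stage name prints an error to stderr and returns a non-zero status.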