Commit 7d7c981

Update readme
1 parent d5b4fb7 commit 7d7c981

1 file changed

+180
-0
lines changed

README.md

Lines changed: 180 additions & 0 deletions
@@ -0,0 +1,180 @@
# PredIG Docker and Singularity Documentation

## Overview

PredIG is a protein immunogenicity prediction tool. It accepts several input formats, including UniProt IDs and FASTA sequences.

## Prerequisites

1. Docker or Singularity installed on your system
2. The UniProt database file (`uniprot_sprot.fasta`)
3. The PredIG Docker image:

```bash
docker pull bsceapm/predig:latest
```

## UniProt Database Setup

1. Download the UniProt database file: [uniprot_sprot.fasta.gz](https://ftp.ebi.ac.uk/pub/databases/uniprot/knowledgebase/uniprot_sprot.fasta.gz). The download is gzip-compressed, so decompress it to obtain `uniprot_sprot.fasta`.
2. Place `uniprot_sprot.fasta` in a directory that will be mounted into the container.
3. Bind this directory to `/uniprot` when running the container.

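The setup steps above could be scripted roughly as follows. This is a minimal sketch, not part of PredIG: the function name `fetch_uniprot` and the default directory `./uniprot` are assumptions chosen to match the examples in this README, and the URL is the one given in step 1.

```bash
# Sketch: download and decompress the Swiss-Prot FASTA into a directory
# that can later be bound to /uniprot. fetch_uniprot is a hypothetical helper.
UNIPROT_URL="https://ftp.ebi.ac.uk/pub/databases/uniprot/knowledgebase/uniprot_sprot.fasta.gz"

fetch_uniprot() {
  local dest_dir="${1:-./uniprot}"   # assumed default, as in the README examples
  mkdir -p "$dest_dir" || return 1
  # -f: fail on HTTP errors, -L: follow redirects, -o: write to this file
  curl -fL -o "$dest_dir/uniprot_sprot.fasta.gz" "$UNIPROT_URL" || return 1
  # Decompress to the expected file name while keeping the downloaded archive
  gunzip -c "$dest_dir/uniprot_sprot.fasta.gz" > "$dest_dir/uniprot_sprot.fasta"
}

# Usage (needs network access):
#   fetch_uniprot ./uniprot
```

After it runs, the chosen directory should contain `uniprot_sprot.fasta` and can be bound to `/uniprot`.
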
## Basic Usage with Docker

The container requires two volume bindings:

- Your working directory, bound to `/predig` (for input and output files)
- The UniProt database directory, bound to `/uniprot`

Basic command structure:

```bash
docker run -v /path/to/work/dir:/predig -v /path/to/uniprot/dir:/uniprot bsceapm/predig <input_file> --output <output_file> [options]
```

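Some Docker versions reject relative host paths in `-v`, so it can help to resolve both directories to absolute paths first. The sketch below only prints the resulting command for inspection. The helper names `abs_dir` and `predig_docker_cmd` are hypothetical, and `--rm` (remove the container on exit) is an addition not used elsewhere in this README.

```bash
# Sketch: build the docker command with absolute host paths.
abs_dir() {
  # Print the absolute path of an existing directory; fail if it does not exist
  (cd -- "$1" 2>/dev/null && pwd)
}

predig_docker_cmd() {
  # $1: host working directory, $2: host UniProt directory, rest: PredIG arguments
  local work_dir uniprot_dir
  [ $# -ge 2 ] || { echo "usage: predig_docker_cmd WORK_DIR UNIPROT_DIR [args]" >&2; return 1; }
  work_dir=$(abs_dir "$1") || { echo "missing work dir: $1" >&2; return 1; }
  uniprot_dir=$(abs_dir "$2") || { echo "missing UniProt dir: $2" >&2; return 1; }
  shift 2
  # Printed rather than executed; paths containing spaces would need quoting
  echo docker run --rm -v "$work_dir:/predig" -v "$uniprot_dir:/uniprot" bsceapm/predig "$@"
}

# Usage:
#   predig_docker_cmd ./my_data ./uniprot input_proteins.csv --output results.csv
```

Printing the command first makes it easy to confirm the two bindings before starting a run.
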
## Input Modes

### 1. UniProt Mode (Default)

Predict from a CSV file of UniProt IDs. This is the default mode, so `--type` can be omitted.

Example:

```bash
docker run -v ./my_data:/predig -v ./uniprot:/uniprot \
  bsceapm/predig input_proteins.csv --output results.csv
```

### 2. Recombinant Mode

Predict from a CSV file of recombinant protein sequences, selected with `--type recombinant`.

Example:

```bash
docker run -v ./my_data:/predig -v ./uniprot:/uniprot \
  bsceapm/predig input_sequences.csv --output results.csv --type recombinant
```

### 3. FASTA Mode

Predict from a FASTA input file. This mode requires an additional HLA alleles file, passed with `--alleles`.

Example:

```bash
docker run -v ./my_data:/predig -v ./uniprot:/uniprot \
  bsceapm/predig sequences.fasta --output results.csv --type fasta --alleles alleles.csv
```

## Command Arguments

Required:

- Input file: path to the input file, relative to the directory mounted at `/predig`
- `--output`: name of the output file

Optional:

- `--type`: input file type (`uniprot`, `fasta`, or `recombinant`); defaults to `uniprot`
- `--model`: prediction model (`noncan`, `neoant`, or `path`)
- `--alleles`: path to the HLA alleles file (required for FASTA mode)
- `--alpha`: alpha parameter value
- `--precursor-length`: length of the precursor sequence

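The constraints in the lists above can be checked before a container is started. The following is a sketch only: `check_predig_args` is a hypothetical helper, it covers just the values documented here, and it does not reproduce PredIG's own argument parsing. Since the default model is not stated in this README, an omitted `--model` is simply accepted.

```bash
# Sketch: validate documented option values before launching PredIG.
check_predig_args() {
  local input_type="uniprot" model="" alleles="" opt   # uniprot is the documented default
  while [ $# -gt 0 ]; do
    opt="$1"; shift
    case "$opt" in
      --type)    input_type="${1-}" ;;
      --model)   model="${1-}" ;;
      --alleles) alleles="${1-}" ;;
    esac
  done
  case "$input_type" in
    uniprot|fasta|recombinant) ;;
    *) echo "unknown --type: $input_type" >&2; return 1 ;;
  esac
  case "$model" in
    ""|noncan|neoant|path) ;;
    *) echo "unknown --model: $model" >&2; return 1 ;;
  esac
  if [ "$input_type" = "fasta" ] && [ -z "$alleles" ]; then
    echo "--alleles is required for FASTA mode" >&2
    return 1
  fi
}

# Usage:
#   check_predig_args sequences.fasta --output results.csv --type fasta --alleles alleles.csv
```

Calling it with the same arguments intended for the container catches a missing `--alleles` or a mistyped mode name early.
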
## Running with Singularity

Singularity can run Docker images directly, which makes PredIG easy to use in HPC environments where Docker might not be available.

### Converting Docker Image to Singularity

Pull the Docker image and convert it to Singularity format:

```bash
singularity pull predig.sif docker://bsceapm/predig:latest
```

### Basic Usage with Singularity

The command structure is similar to Docker's, but uses Singularity's `--bind` syntax:

```bash
singularity run --bind /path/to/work/dir:/predig,/path/to/uniprot/dir:/uniprot \
  predig.sif <input_file> --output <output_file> [options]
```

### Singularity Examples

1. UniProt Mode:

```bash
singularity run --bind ./my_data:/predig,./uniprot:/uniprot \
  predig.sif input_proteins.csv --output results.csv
```

2. Recombinant Mode:

```bash
singularity run --bind ./my_data:/predig,./uniprot:/uniprot \
  predig.sif input_sequences.csv --output results.csv --type recombinant
```

3. FASTA Mode:

```bash
singularity run --bind ./my_data:/predig,./uniprot:/uniprot \
  predig.sif sequences.fasta --output results.csv --type fasta --alleles alleles.csv
```

### Singularity Notes

- Multiple bind paths are separated by commas in Singularity
- The `.sif` file can be stored anywhere and referenced by its path from any directory
- All other functionality is identical to the Docker version
- Unlike Docker, Singularity runs the container as your own user, so files are created with your user account's permissions

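The difference in bind syntax between the two runtimes can be summarized in a small helper. This is an illustrative sketch: `predig_bind_args` is a hypothetical name, and it only prints the mount options for the two directories this README requires.

```bash
# Sketch: print the mount options for the chosen runtime.
predig_bind_args() {
  # $1: runtime (docker or singularity), $2: host work dir, $3: host UniProt dir
  case "$1" in
    docker)      echo "-v $2:/predig -v $3:/uniprot" ;;       # one -v per binding
    singularity) echo "--bind $2:/predig,$3:/uniprot" ;;      # comma-separated bindings
    *) echo "unsupported runtime: $1" >&2; return 1 ;;
  esac
}

# Usage:
#   predig_bind_args singularity ./my_data ./uniprot
```
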
## Example File Formats

### UniProt Mode Input Example

CSV file with UniProt IDs:

```
UniProtID
P01889
P61769
```

### Recombinant Mode Input Example

CSV file with protein sequences:

```
Sequence
MALTLSFFVVLLLVG
MLPGLALLLLAAWTARA
```

### FASTA Mode Input Example

FASTA file (sequences.fasta):

```
>Protein1
MALTLSFFVVLLLVG
>Protein2
MLPGLALLLLAAWTARA
```

Alleles file (alleles.csv):

```
Allele
HLA-A*02:01
HLA-B*07:02
```

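To try the three modes quickly, the example files shown in this section can be written into a working directory. A minimal sketch: `make_predig_examples` is a hypothetical helper, the default directory `./my_data` is assumed from the command examples, and the file names are the ones those examples use.

```bash
# Sketch: create the example input files from this section.
make_predig_examples() {
  local dir="${1:-./my_data}"   # assumed default working directory
  mkdir -p "$dir" || return 1
  # printf '%s\n' writes each argument on its own line
  printf '%s\n' 'UniProtID' 'P01889' 'P61769' > "$dir/input_proteins.csv"
  printf '%s\n' 'Sequence' 'MALTLSFFVVLLLVG' 'MLPGLALLLLAAWTARA' > "$dir/input_sequences.csv"
  printf '%s\n' '>Protein1' 'MALTLSFFVVLLLVG' '>Protein2' 'MLPGLALLLLAAWTARA' > "$dir/sequences.fasta"
  printf '%s\n' 'Allele' 'HLA-A*02:01' 'HLA-B*07:02' > "$dir/alleles.csv"
}

# Usage:
#   make_predig_examples ./my_data
```
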
## Notes

- All input and output files must be in the directory mounted to `/predig`
- The UniProt database file must be in the directory mounted to `/uniprot`
- File paths in commands should be relative to the mounted directories
- Output files are created in your mounted working directory

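
The first two notes can be verified on the host before a run. This is a sketch under the assumption that the database file keeps the name `uniprot_sprot.fasta`, as in the prerequisites; `predig_preflight` is a hypothetical helper and does not inspect file contents.

```bash
# Sketch: check that the files PredIG needs are inside the directories to be mounted.
predig_preflight() {
  # $1: host work dir, $2: host UniProt dir, $3: input file path relative to the work dir
  [ -f "$1/$3" ] || { echo "input file not found in work dir: $3" >&2; return 1; }
  [ -f "$2/uniprot_sprot.fasta" ] || { echo "uniprot_sprot.fasta not found in: $2" >&2; return 1; }
}

# Usage:
#   predig_preflight ./my_data ./uniprot input_proteins.csv
```
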
