Commit 01ebc9c

add markov chain based illumina error simulation
1 parent 949de8e commit 01ebc9c

12 files changed: +9717 -21 lines changed

.gitignore

Lines changed: 4 additions & 1 deletion
@@ -1,3 +1,6 @@
 *.pyc
-*.ipynb
 make_plot/*
+.DS_Store
+r1.fq
+r2.fq
+metagenome.*

README.md

Lines changed: 5 additions & 0 deletions
@@ -116,6 +116,11 @@ If `Pg >= Pu`, the base is substituted (fig 3).
 
 ADRSM can simulate Illumina sequencing error with a uniform based model.
 
+## Note on Illumina base quality score
+
+The base quality score is generated using a Markov chain from fastq template files.
+
+
 ## Note on mutation
 
 ADRSM offers you to add [mutation](https://en.wikipedia.org/wiki/Mutation_rate) to your sequences. This allows to account for the evolutionary differences between ancient organisms and their reference genome counterparts present in today's databases.
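
Note on the README addition above: the diff for `lib/adrsmlib` is not shown here, so the following is only a minimal sketch of how a quality string could be drawn from a first-order Markov chain trained on FASTQ template records. It is not ADRSM's actual implementation. The function names (`read_quality_strings`, `train_chain`, `simulate_quality`) and the position-independent chain are assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def read_quality_strings(fastq_lines):
    """Return the quality line (4th line) of each FASTQ record."""
    lines = [line.rstrip("\n") for line in fastq_lines]
    return [lines[i] for i in range(3, len(lines), 4)]


def train_chain(quality_strings):
    """Count first-symbol frequencies and symbol-to-symbol transitions."""
    start = Counter()
    transitions = defaultdict(Counter)
    for qual in quality_strings:
        if not qual:
            continue
        start[qual[0]] += 1
        for prev, nxt in zip(qual, qual[1:]):
            transitions[prev][nxt] += 1
    return start, transitions


def _draw(counter, rng):
    """Draw one symbol from a Counter, weighted by its counts."""
    symbols = list(counter)
    weights = [counter[s] for s in symbols]
    return rng.choices(symbols, weights=weights, k=1)[0]


def simulate_quality(start, transitions, length, rng):
    """Generate one quality string of the requested length."""
    if length <= 0:
        return ""
    qual = [_draw(start, rng)]
    while len(qual) < length:
        next_counts = transitions.get(qual[-1])
        # A symbol seen only at the end of template reads has no outgoing
        # transitions; fall back to the start distribution in that case.
        qual.append(_draw(next_counts if next_counts else start, rng))
    return "".join(qual)


# Hypothetical two-record template (an assumption, not a file from the repo).
template = ["@r1", "ACGTAC", "+", "IIIHHG",
            "@r2", "ACGTAC", "+", "IIHHGG"]
start, transitions = train_chain(read_quality_strings(template))
rng = random.Random(42)  # seeded so runs are reproducible
simulated = simulate_quality(start, transitions, 10, rng)
print(simulated)
```

Because the chain is learned from the template's own quality characters, the simulated string stays in whatever encoding the template uses. A position-dependent chain (one transition table per read cycle) would capture the usual quality drop toward the 3' end more faithfully, at the cost of needing more template reads.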

adrsm

Lines changed: 2 additions & 10 deletions
@@ -2,7 +2,6 @@
 
 from numpy import random as npr
 import lib.adrsmlib as ad
-
 import argparse
 
 
@@ -65,11 +64,6 @@ Homepage & Documentation: github.com/maxibor/adrsm
         dest="output",
         default="metagenome",
         help="Output file basename. Default = ./metagenome.*")
-    parser.add_argument(
-        '-q',
-        dest="quality",
-        default="d",
-        help="Base quality encoding. Default = d (PHRED+64)")
     parser.add_argument(
         '-s',
         dest="stats",
@@ -98,12 +92,11 @@ Homepage & Documentation: github.com/maxibor/adrsm
     themin = args.min
     themax = args.max
     outfile = args.output
-    quality = args.quality
     stats = args.stats
     seed = int(args.seed)
     threads = int(args.threads)
 
-    return(infile, readlen, nbinom, a1, a2, err, geom_p, themin, themax, outfile, quality, stats, seed, threads)
+    return(infile, readlen, nbinom, a1, a2, err, geom_p, themin, themax, outfile, stats, seed, threads)
 
 
 def read_config(infile):
@@ -137,7 +130,7 @@ def read_config(infile):
 
 if __name__ == "__main__":
     version = 0.8
-    INFILE, READLEN, NBINOM, A1, A2, ERR, GEOM_P, THEMIN, THEMAX, OUTFILE, QUALITY, STATS, SEED, PROCESS = _get_args()
+    INFILE, READLEN, NBINOM, A1, A2, ERR, GEOM_P, THEMIN, THEMAX, OUTFILE, STATS, SEED, PROCESS = _get_args()
 
     MINLENGTH = 20
     npr.seed(SEED)
@@ -162,7 +155,6 @@ if __name__ == "__main__":
             THEMIN=THEMIN,
             THEMAX=THEMAX,
             fastq_dict=genome_dict,
-            QUALITY=QUALITY,
             PROCESS=PROCESS)
         stat_dict[ad.get_basename(agenome)] = stat_and_run
 