Commit b3000fc

Small bug fixes and text corrections
2 parents 6ef0228 + b6c6404 commit b3000fc

10 files changed, +483 -293 lines changed

README.md

Lines changed: 1 addition & 1 deletion
@@ -106,7 +106,7 @@ If you use anaconda you can install Samtools with

 conda install -c bioconda samtools openssl=1.0

-The ST Pipeline recommends a computer with at least 32GB of RAM (depending on the size of the genome) and 8 cpu cores.
+The ST Pipeline needs a computer with at least 32GB of RAM (depending on the size of the genome) and 8 cpu cores.

 **Dependencies**

README_SHORT

Lines changed: 1 addition & 3 deletions
@@ -53,6 +53,4 @@ Basically what the ST pipeline does is:
 You can see a graphical more detailed description of the workflow in the documents workflow.pdf and workflow_extended.pdf

 The output will be a matrix of counts (genes as columns, spots as rows),
-a BED file containing the transcripts (Read name, coordinate, gene, etc..), and a JSON
-file with useful stats.
-The ST pipeline will also output a log file with useful information.
+The ST pipeline will also output a log file with useful information and stats.

docsrc/changes.rst

Lines changed: 5 additions & 0 deletions
@@ -1,6 +1,11 @@
 Changes
 -------

+**Version 1.8.1**
+
+* Fixed a bug when having barcodes after the UMI
+* Improved descriptions for parameters
+
 **Version 1.8.0**

 * Improved the unit-tests

docsrc/example.rst

Lines changed: 1 addition & 1 deletion
@@ -43,7 +43,7 @@ If you want to process Visium datasets it is recommended to use these settings
 .. code-block:: bash

 --allowed-missed 1 \
---allowed-kmer 4 \
+--allowed-kmer 4 \
 --umi-allowed-mismatches 2 \
 --umi-start-position 16 \
 --umi-end-position 28 \

docsrc/intro.rst

Lines changed: 8 additions & 10 deletions
@@ -39,12 +39,12 @@ The input FASTQ files can be given in gzip/bzip format as well.
 Basically what the ST pipeline does is:

 - Quality trimming (read 1 and read 2):
-- Remove low quality bases
-- Sanity check (reads same length, reads order, etc..)
-- Check quality UMI (if provided)
-- Remove artifacts (PolyT, PolyA, PolyG, PolyN and PolyC) of user defined length
-- Check for AT and GC content
-- Discard reads with a minimum number of bases of that failed any of the checks above
+- Remove low quality bases
+- Sanity check (reads same length, reads order, etc..)
+- Check quality UMI (if provided)
+- Remove artifacts (PolyT, PolyA, PolyG, PolyN and PolyC) of user defined length
+- Check for AT and GC content
+- Discard reads with a minimum number of bases of that failed any of the checks above
 - Contamimant filter e.x. rRNA genome (Optional)
 - Mapping with STAR (only read 2)
 - Demultiplexing with [Taggd](https://github.com/SpatialTranscriptomicsResearch/taggd) (only read 1)
@@ -55,7 +55,5 @@ Basically what the ST pipeline does is:

 You can see a graphical more detailed description of the workflow in the documents workflow.pdf and workflow_extended.pdf

-The output will be a matrix of counts (genes as columns, spots as rows),
-a BED file containing the transcripts (Read name, coordinate, gene, etc..), and a JSON
-file with useful stats.
-The ST pipeline will also output a log file with useful information.
+The output will be a matrix of counts (genes as columns, spots as rows)
+and a log file with useful information and stats.
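
Reviewer note: the quality-trimming bullets in the docsrc/intro.rst hunk (remove homopolymer artifacts of a user-defined length, check AT/GC content, discard reads left too short) can be illustrated with a minimal sketch. This is not the pipeline's own code. The function names, the default thresholds, and the choice to delete homopolymer runs outright are all assumptions made for illustration.

```python
import re


def gc_content(seq):
    """Percentage of G/C bases in seq; 0.0 for an empty sequence."""
    if not seq:
        return 0.0
    upper = seq.upper()
    return 100.0 * (upper.count("G") + upper.count("C")) / len(upper)


def strip_homopolymers(seq, min_run):
    """Delete every run of a single base (A, C, G, T or N) that is at
    least min_run bases long. Deleting the run is an assumption; the
    real pipeline may handle these artifacts differently."""
    pattern = re.compile(r"([ACGTN])\1{%d,}" % (min_run - 1))
    return pattern.sub("", seq.upper())


def keep_read(seq, min_run=10, min_length=20, max_gc=90.0):
    """Return the cleaned read, or None if it fails a check.
    All three thresholds are hypothetical defaults."""
    cleaned = strip_homopolymers(seq, min_run)
    if len(cleaned) < min_length or gc_content(cleaned) > max_gc:
        return None
    return cleaned
```

For example, `keep_read("A" * 30)` returns `None` because the whole read is a PolyA run, while a mixed read such as `"ACGT" * 6` is returned unchanged.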

docsrc/manual.rst

Lines changed: 129 additions & 117 deletions
Large diffs are not rendered by default.

scripts/st_qa.py

Lines changed: 4 additions & 4 deletions
@@ -120,16 +120,16 @@ def main(input_data):
 plt.clf()

 # Generate density plots
-sns.displot(aggregated_gene_counts, hist=False, label="Counts > 0")
-sns.displot(aggregated_gene_counts_1, hist=False, label="Counts > 1")
+sns.distplot(aggregated_gene_counts, hist=False, label="Counts > 0")
+sns.distplot(aggregated_gene_counts_1, hist=False, label="Counts > 1")
 sns_plot = sns.distplot(aggregated_gene_counts_2,
 axlabel="#Genes", hist=False, label="Counts > 2")
 fig = sns_plot.get_figure()
 fig.savefig(input_name + "_density_genes_by_spot.pdf")
 plt.clf()

-sns.displot(aggregated_gene_gene_counts, hist=False, label="Counts > 0")
-sns.displot(aggregated_gene_gene_counts_1, hist=False, label="Counts > 1")
+sns.distplot(aggregated_gene_gene_counts, hist=False, label="Counts > 0")
+sns.distplot(aggregated_gene_gene_counts_1, hist=False, label="Counts > 1")
 sns_plot = sns.distplot(aggregated_gene_gene_counts_2,
 axlabel="#Spots", hist=False, label="Counts > 2")
 fig = sns_plot.get_figure()

stpipeline/core/mapping.py

Lines changed: 3 additions & 7 deletions
@@ -205,7 +205,6 @@ def barcodeDemultiplexing(reads,
 idFile,
 mismatches,
 kmer,
-start_positon,
 over_hang,
 taggd_metric,
 taggd_multiple_hits_keep_one,
@@ -223,7 +222,6 @@ def barcodeDemultiplexing(reads,
 :param idFile: a tab delimited file (BARCODE - X - Y) containing all the barcodes
 :param mismatches: the number of allowed mismatches
 :param kmer: the kmer length
-:param start_positon: the start position of the barcode
 :param over_hang: the number of bases to allow for overhang
 :param taggd_metric: the distance metric algorithm (Subglobal, Levensthein or Hamming)
 :param taggd_multiple_hits_keep_one: when True keep one random hit when multiple candidates
@@ -234,7 +232,6 @@ def barcodeDemultiplexing(reads,
 :type idFile: str
 :type mismatches: int
 :type kmer: int
-:type start_positon: int
 :type over_hang: int
 :type taggd_metric: str
 :type taggd_multiple_hits_keep_one: bool
@@ -271,13 +268,12 @@ def barcodeDemultiplexing(reads,

 args += ["--max-edit-distance", mismatches,
 "--k", kmer,
-"--barcode-tag", "B0", # if input is BAM we tell taggd what tag contains the barcode
-"--start-position", start_positon,
+"--barcode-tag", "B0", # if input is BAM we tell taggd which tag contains the barcode
 "--homopolymer-filter", 0,
 "--subprocesses", cores,
 "--metric", taggd_metric,
-"--overhang", over_hang] # ,
-# '--use-samtools-merge'] # Could be added to merge using samtools instead of pysam WIP on taggd
+"--overhang", over_hang]
+# --use-samtools-merge Could be added to merge using samtools instead of pysam WIP on taggd

 if taggd_multiple_hits_keep_one:
 args.append("--multiple-hits-keep-one")
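
Reviewer note: a minimal, self-contained sketch of the taggd argument list as it looks after this commit, with `--start-position` no longer passed. The flag names and the `B0` tag come from the hunk above. The helper name `build_taggd_args` and the final conversion of every item to `str` are assumptions for illustration, not part of `stpipeline/core/mapping.py`.

```python
def build_taggd_args(mismatches, kmer, over_hang, taggd_metric, cores,
                     taggd_multiple_hits_keep_one=False):
    """Hypothetical helper mirroring the argument list in the diff above."""
    args = ["--max-edit-distance", mismatches,
            "--k", kmer,
            "--barcode-tag", "B0",  # BAM tag that contains the barcode
            "--homopolymer-filter", 0,
            "--subprocesses", cores,
            "--metric", taggd_metric,
            "--overhang", over_hang]
    if taggd_multiple_hits_keep_one:
        args.append("--multiple-hits-keep-one")
    # Assumption: stringify every item so the list can be handed to a
    # subprocess call, which expects string arguments.
    return [str(a) for a in args]


if __name__ == "__main__":
    print(build_taggd_args(2, 6, 0, "Subglobal", 4, True))
```

Callers that still pass a start position would need updating, since the parameter is removed from the `barcodeDemultiplexing` signature in the same commit.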
