Commit ae85e50

v4.1.3
- Fix: when the paired-end files are compressed, read chunks were not resized, which led to excessive copying, and the amount copied could accumulate round by round.
1 parent 283c838 commit ae85e50

4 files changed: +13 -27 lines changed

CHANGELOG.md

Lines changed: 4 additions & 16 deletions
@@ -5,6 +5,10 @@
 - Feature: multiple primer trimming.
 - Feature: UMI trimming.

+## v4.1.3
+
+- Fix: when the paired end files are compressed, read chunks did not resize, which led to excessive copy, and copy number might accumulate round by round.
+
 ## v4.1.2

 - Fix: do not throw error if input paired end files are empty when doing `--detect-adapter`.
@@ -84,13 +88,11 @@
 ## v3.1.0

 - New feature: `--detect-adapter` for adapter determination.
-
 - Fix: when input is an empty compressed fastq, atria exits with error because `read_chunks!(::IO, ...)` should return 4 elements, but returned 2.

 ## v3.0.3

 - Fix v3.0.2: `will_eof` should be true when unknown.
-
 - Do not resize chunk sizes before cycle 1 when inputs are compressed and cannot determine uncompressed sizes. Just assume data are not trimmed before.

 ## v3.0.2
@@ -100,9 +102,7 @@
 ## v3.0.1

 - Avoid to lock `IOStream` when write fastq in thread_output.jl: replace `write(::IOStream, ...)` with `write_no_lock(::IOStream, ...)`. It is slightly faster.
-
 - Speed optimization for consensus calling: overwrite `BioSequences.complement(::DNA)` (1.40X), and define `iscomplement(::DNA, ::DNA)` (1.79X).
-
 - Other minor parallel implementations.

 ## v3.0.0
@@ -112,7 +112,6 @@
 ## v2.1.2

 - Parameter optimization using `atria simulate`: --trim-score-pe 19->10, --tail-length 8->12.
-
 - Development of Atria simulation methods.

 ## v2.1.1
@@ -126,17 +125,11 @@
 ## v2.0.0

 - Supporting low-complexity filtration.
-
 - Supporting polyX tail trimming.
-
 - Supporting single-end fastq.
-
 - Supporting bzip2 compression/decompression.
-
 - Supporting non standardized gzip compression files.
-
 - Optimizing default parameters. (r1-r2-diff 0->0, trim-score-pe 8->10, score-diff removed, kmer-n-match 8->9)
-
 - Robustness optimization: the lower bound of match probability is set to 0.75 because match probability lower than 0.75 is outlier and affect trim score strongly.

 ## v1.1.1
@@ -146,19 +139,14 @@
 ## v1.1.0

 - Performance optimization: adapter and PE trimming: if no adapters were matched, the number of errors of PE match is loosen.
-
 - Performance optimization: consensus calling: new arg `--kmer-tolerance-consensus 2->10`; optimized arg `--min-ratio-mismatch 0.2->0.28`.
-
 - Speed optimization: check `overlap_score > 0` before computing score (`pe_consensus!`).

 ## v1.0.3

 - More detailed error output when encoding a non-nucleotide character (`throw_encode_error(...)`).
-
 - Following symbolic link before checking file size for non-Windows platforms (`check_filesize(::String)`).
-
 - When run in multi-file parallel mode, write stdout and stderr to a 'stdlog' file (`julia_wrapper_atria(...)`).
-
 - Add option `--check-identifier` to check whether the identifiers of r1 and r2 are the same.

 ## v1.0.2

Project.toml

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 name = "Atria"
 uuid = "226cbef3-b485-431c-85c2-d8bd8da14025"
 authors = ["Jiacheng Chuan <jiacheng_chuan@outlook.com>"]
-version = "4.1.2"
+version = "4.1.3"

 [deps]
 ArgParse = "c7e460c6-2fb9-53a9-8c5b-16f535851c63"

docs/1.2.Install_from_source.md

Lines changed: 5 additions & 6 deletions
@@ -2,7 +2,7 @@

 ## Install from source

-Atria is tested in [Julia Language](https://julialang.org/) v1.8 and v1.9.
+Atria is tested in [Julia Language](https://julialang.org/) v1.8 and v1.9.

 It is recommended to build Atria using Julia v1.8.5 because it is 3-20% faster than v1.9.

@@ -36,8 +36,7 @@ brew install pigz
 brew install pbzip2
 ```

-> If you do not use Homebrew, you can also download them from [pigz's official site](https://zlib.net/pigz/) and [pbzip2](https://pkgs.org/download/pbzip2).
-
+> If you do not use Homebrew, you can also download them from [pigz&#39;s official site](https://zlib.net/pigz/) and [pbzip2](https://pkgs.org/download/pbzip2).

 #### Atria

@@ -81,10 +80,10 @@ juliaup default 1.8

 Then, download `pigz` and `pbzip2` (a compression/decompression software used in Atria).

-If you use `apt` package manager (Ubuntu/Debian), try `sudo apt install pigz pbzip2`.
-If you use `yum` package manager (CentOS), try `sudo yum install pigz pbzip2`.
+If you use `apt` package manager (Ubuntu/Debian), try `sudo apt install pigz pbzip2`.
+If you use `yum` package manager (CentOS), try `sudo yum install pigz pbzip2`.

-You can also download them from [pigz's official site](https://zlib.net/pigz/) and [pbzip2](https://pkgs.org/download/pbzip2).
+You can also download them from [pigz&#39;s official site](https://zlib.net/pigz/) and [pbzip2](https://pkgs.org/download/pbzip2).

 #### Atria


src/Trimmer/wrapper_pe.jl

Lines changed: 3 additions & 4 deletions
@@ -582,14 +582,11 @@ function julia_wrapper_atria_pe(ARGS::Vector{String}; exit_after_help = true)
 n_r1_before = length(r1s) - n_reads
 n_r2_before = length(r2s) - n_reads

-if typeof(io1) <: IOStream # not compressed
+if typeof(io1) <: IOStream # not compressed
 length(in1bytes) == chunk_size1 || resize!(in1bytes, chunk_size1)
 length(in2bytes) == chunk_size2 || resize!(in2bytes, chunk_size2)
 (n_r1, n_r2, r1s, r2s, ncopied) = load_fqs_threads!(io1, io2, in1bytes, in2bytes, vr1s, vr2s, r1s, r2s, task_r1s_unbox, task_r2s_unbox; remove_first_n = n_reads, njobs = njobs, quality_offset = quality_offset)

-# it only get the sizes, did not change the sizes. Size changing is done in the "Read" part.
-chunk_size1, chunk_size2 = get_ideal_inbyte_sizes(in1bytes, in2bytes, n_r1, n_r2, n_r1_before, n_r2_before, max_chunk_size, chunk_size1, chunk_size2)
-
 else # gziped
 total_n_bytes_read1 += length(in1bytes) # will read INT in this batch
 total_n_bytes_read2 += length(in2bytes) # will read INT in this batch
@@ -606,6 +603,8 @@ function julia_wrapper_atria_pe(ARGS::Vector{String}; exit_after_help = true)
 njobs = njobs
 );
 end
+# it only get the sizes, did not change the sizes. Size changing is done in the "Read" part.
+chunk_size1, chunk_size2 = get_ideal_inbyte_sizes(in1bytes, in2bytes, n_r1, n_r2, n_r1_before, n_r2_before, max_chunk_size, chunk_size1, chunk_size2)

 n_reads = min(n_r1, n_r2)
 total_read_copied_in_loading += ncopied
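
The control-flow change in `src/Trimmer/wrapper_pe.jl` can be illustrated with a minimal Julia sketch. It is not Atria's code: `ideal_size`, `simulate`, and all numbers are hypothetical, and the real logic of `get_ideal_inbyte_sizes` is not shown in this diff. The sketch only shows why a size update placed inside the uncompressed branch never runs for compressed input, and how moving it after the `if`/`else` lets both branches reach it.

```julia
# Hypothetical stand-in for get_ideal_inbyte_sizes; the real logic is not shown in the diff.
# Assumption: the ideal size grows by the number of leftover bytes, capped at max_size.
ideal_size(current::Int, leftover::Int, max_size::Int) = min(max_size, current + leftover)

# Simulate a few read rounds and return the final chunk size.
# `fixed = false` mimics the old placement (update only in the uncompressed branch);
# `fixed = true` mimics the new placement (update after the if/else).
function simulate(compressed::Bool; fixed::Bool, rounds::Int = 3)
    chunk_size = 100
    leftover = 20  # made-up number of bytes carried over each round
    for _ in 1:rounds
        if !compressed
            # uncompressed branch: the old code updated the size here only
            fixed || (chunk_size = ideal_size(chunk_size, leftover, 1_000))
        else
            # compressed branch: the old code never updated the size
        end
        # new code: the update runs regardless of the branch taken
        fixed && (chunk_size = ideal_size(chunk_size, leftover, 1_000))
    end
    return chunk_size
end

println(simulate(true; fixed = false))  # compressed input, old placement
println(simulate(true; fixed = true))   # compressed input, new placement
```

With these made-up numbers, the old placement leaves the compressed case at its initial size of 100, while the new placement grows it to 160, the same value the uncompressed case reaches under either placement.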
