
Commit a45763b

feat: add aria2c wrapper (#2725)
Add wrapper for aria2c, since it allows (among others):

- downloads over several protocols (e.g. HTTP/HTTPS, FTP, SFTP, BitTorrent and Metalink)
- parallel downloads
- automated checksum verification
- disk space pre-allocation

### QC

* [x] I confirm that:

For all wrappers added by this PR,

* there is a test case which covers any introduced changes,
* `input:` and `output:` file paths in the resulting rule can be changed arbitrarily,
* either the wrapper can only use a single core, or the example rule contains a `threads: x` statement with `x` being a reasonable default,
* rule names in the test case are in [snake_case](https://en.wikipedia.org/wiki/Snake_case) and somehow tell what the rule is about or match the tool's purpose or name (e.g., `map_reads` for a step that maps reads),
* all `environment.yaml` specifications follow [the respective best practices](https://stackoverflow.com/a/64594513/2352071),
* the `environment.yaml` pinning has been updated by running `snakedeploy pin-conda-envs environment.yaml` on a linux machine,
* wherever possible, command line arguments are inferred and set automatically (e.g. based on file extensions in `input:` or `output:`),
* all fields of the example rules in the `Snakefile`s and their entries are explained via comments (`input:`/`output:`/`params:` etc.),
* `stderr` and/or `stdout` are logged correctly (`log:`), depending on the wrapped tool,
* temporary files are either written to a unique hidden folder in the working directory, or (better) stored where the Python function `tempfile.gettempdir()` points to (see [here](https://docs.python.org/3/library/tempfile.html#tempfile.gettempdir); this also means that using any Python `tempfile` default behavior works),
* the `meta.yaml` contains a link to the documentation of the respective tool or command,
* `Snakefile`s pass the linting (`snakemake --lint`),
* `Snakefile`s are formatted with [snakefmt](https://github.com/snakemake/snakefmt),
* Python wrapper scripts are formatted with [black](https://black.readthedocs.io).
* Conda environments use a minimal amount of channels, in recommended ordering. E.g. for bioconda, use conda-forge, bioconda, nodefaults (conda-forge should have highest priority, and the defaults channels are usually not needed because most packages are in conda-forge nowadays).

## Summary by CodeRabbit

- **New Features**
  - Introduced a new wrapper for the aria2c download utility, supporting multiple checksum types (MD5, SHA-1, SHA-224, SHA-256, SHA-384, SHA-512, Adler32) for file integrity verification.
  - Added a comprehensive test suite for the aria2c wrapper, including sample checksum files and Snakemake rules to validate downloads with various hash algorithms.
  - Provided environment and metadata files to ensure reproducible setups and clear tool documentation.
- **Tests**
  - Implemented automated tests to verify the aria2c wrapper's functionality and checksum verification.
- **Chores**
  - Updated GitHub Actions workflow to include an additional storage plugin for Snakemake.
1 parent 63f5e87 commit a45763b

13 files changed: +309 -1 lines changed

.github/workflows/qc.yml

Lines changed: 1 addition & 1 deletion
@@ -52,7 +52,7 @@ jobs:
         shell: bash -el {0}
         run: |
           conda config --set channel_priority strict
-          conda install -n snakemake -y snakemake-minimal snakemake
+          conda install -n snakemake -y snakemake snakemake-minimal snakemake-storage-plugin-http
 
       - name: Fetch master
         run: |

test_wrappers.py

Lines changed: 21 additions & 0 deletions
@@ -134,6 +134,27 @@ def _run(wrapper, cmd, check_log=None, compare_results_with_expected=None):
     return _run
 
 
+def test_aria2c(run):
+    run(
+        "utils/aria2c",
+        [
+            "snakemake",
+            "--cores",
+            "2",
+            "--use-conda",
+            "-F",
+            "results/file.fas.gz",
+            "results/file.md5.fas.gz",
+            "results/file.md5file.fas.gz",
+            "results/file.sha1file.fas.gz",
+            "results/file.sha224file.fas.gz",
+            "results/file.sha256file.fas.gz",
+            "results/file.sha384file.fas.gz",
+            "results/file.sha512file.fas.gz",
+            "results/file.md5fileH.fas.gz",
+        ],
+    )
+
 def test_miller(run):
     run(
         "utils/miller",
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+# This file may be used to create an environment using:
+# $ conda create --name <env> --file <this file>
+# platform: linux-64
+# created-by: conda 25.3.1
+@EXPLICIT
+https://conda.anaconda.org/conda-forge/linux-64/_libgcc_mutex-0.1-conda_forge.tar.bz2#d7c89558ba9fa0495403155b64376d81
+https://conda.anaconda.org/conda-forge/noarch/ca-certificates-2025.4.26-hbd8a1cb_0.conda#95db94f75ba080a22eb623590993167b
+https://conda.anaconda.org/conda-forge/linux-64/libgomp-14.2.0-h767d61c_2.conda#06d02030237f4d5b3d9a7e7d348fe3c6
+https://conda.anaconda.org/conda-forge/linux-64/_openmp_mutex-4.5-2_gnu.tar.bz2#73aaf86a425cc6e73fcf236a5a46396d
+https://conda.anaconda.org/conda-forge/linux-64/libgcc-14.2.0-h767d61c_2.conda#ef504d1acbd74b7cc6849ef8af47dd03
+https://conda.anaconda.org/conda-forge/linux-64/c-ares-1.34.5-hb9d3cd8_0.conda#f7f0d6cc2dc986d42ac2689ec88192be
+https://conda.anaconda.org/conda-forge/linux-64/libgcc-ng-14.2.0-h69a702a_2.conda#a2222a6ada71fb478682efe483ce0f92
+https://conda.anaconda.org/conda-forge/linux-64/libiconv-1.18-h4ce23a2_1.conda#e796ff8ddc598affdf7c173d6145f087
+https://conda.anaconda.org/conda-forge/linux-64/liblzma-5.8.1-hb9d3cd8_0.conda#0e87378639676987af32fee53ba32258
+https://conda.anaconda.org/conda-forge/linux-64/libstdcxx-14.2.0-h8f9b012_2.conda#a78c856b6dc6bf4ea8daeb9beaaa3fb0
+https://conda.anaconda.org/conda-forge/linux-64/libzlib-1.3.1-hb9d3cd8_2.conda#edb0dca6bc32e4f4789199455a1dbeb8
+https://conda.anaconda.org/conda-forge/linux-64/openssl-3.5.0-h7b32b05_0.conda#bb539841f2a3fde210f387d00ed4bb9d
+https://conda.anaconda.org/conda-forge/linux-64/libsqlite-3.49.1-hee588c1_2.conda#962d6ac93c30b1dfc54c9cccafd1003e
+https://conda.anaconda.org/conda-forge/linux-64/libssh2-1.11.1-hcf80075_0.conda#eecce068c7e4eddeb169591baac20ac4
+https://conda.anaconda.org/conda-forge/linux-64/libstdcxx-ng-14.2.0-h4852527_2.conda#c75da67f045c2627f59e6fcb5f4e3a9b
+https://conda.anaconda.org/conda-forge/linux-64/libxml2-2.13.7-h81593ed_1.conda#0619e8fc4c8025a908ea3a3422d3b775
+https://conda.anaconda.org/conda-forge/linux-64/aria2-1.37.0-hbc8128a_2.conda#03b8874fa70df577f3eee53085d025cf

utils/aria2c/environment.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+channels:
+  - conda-forge
+  - nodefaults
+dependencies:
+  - aria2 =1.37.0

utils/aria2c/meta.yaml

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+name: aria2
+url: https://github.com/aria2/aria2/
+description: >
+  aria2 is a lightweight multi-protocol & multi-source, cross platform download utility operated in command-line. It supports HTTP/HTTPS, FTP, SFTP, BitTorrent and Metalink.
+authors:
+  - Filipe G. Vieira
+output:
+  - Path to downloaded file
+params:
+  - url: URL to download from
+  - extra: Optional arguments for `aria2c`
+  - type: type of hash, where `type in ["sha-1", "sha-224", "sha-256", "sha-384", "sha-512", "md5", "adler32"]`
+notes: |
+  * Checksum input file only supported for single-file downloads
+  * Requires `snakemake >=9.3.1`
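
The `params` documented in `meta.yaml` map naturally onto an `aria2c` command line. The following is a minimal sketch of that mapping, not the wrapper's actual code (which is not shown on this page). The helper name `build_aria2c_command` and the example URL are hypothetical; `--dir`, `--out` and `--checksum=<TYPE>=<DIGEST>` are existing `aria2c` options.

```python
import shlex
from pathlib import Path

# Hash type names as listed for the `type` parameter in meta.yaml.
SUPPORTED_TYPES = {
    "sha-1", "sha-224", "sha-256", "sha-384", "sha-512", "md5", "adler32",
}


def build_aria2c_command(url, output, hash_type=None, digest=None, extra=""):
    """Assemble an aria2c argument list (hypothetical helper for illustration)."""
    out = Path(output)
    # aria2c takes the target directory and the file name as separate options.
    cmd = ["aria2c", f"--dir={out.parent}", f"--out={out.name}"]
    if hash_type is not None:
        if hash_type not in SUPPORTED_TYPES:
            raise ValueError(f"unsupported hash type: {hash_type}")
        if not digest:
            raise ValueError("a digest is required when a hash type is given")
        cmd.append(f"--checksum={hash_type}={digest}")
    # `extra` is a free-form string of additional aria2c arguments.
    cmd.extend(shlex.split(extra))
    cmd.append(url)
    return cmd


if __name__ == "__main__":
    # The URL is a placeholder; the digest is the MD5 listed for the genomic FASTA.
    print(build_aria2c_command(
        "https://example.org/genomic.fna.gz",
        "results/file.md5.fas.gz",
        hash_type="md5",
        digest="42aa93c5bfdba6ac09a4822a4407b572",
    ))
```

Returning a list rather than a shell string keeps quoting concerns out of the helper; the caller can pass it to `subprocess.run` directly.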
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+04c1275ff9c9d0fb595b7482a1d54438 ./annotation_hashes.txt
+3413f40db67f8ea3b3a193c2fd663a6e ./GCF_000869925.1_ViralProj17181_assembly_report.txt
+7d45362bb87770fac4716b60055fd72d ./GCF_000869925.1_ViralProj17181_assembly_stats.txt
+3e2e82ee2bd94c18d92891211eafdf18 ./GCF_000869925.1_ViralProj17181_cds_from_genomic.fna.gz
+e673fed3417f2f694b99f9cab1dad83e ./GCF_000869925.1_ViralProj17181_feature_count.txt
+c5a292890d71b35ddd4b2366d06cdeb6 ./GCF_000869925.1_ViralProj17181_feature_table.txt.gz
+42aa93c5bfdba6ac09a4822a4407b572 ./GCF_000869925.1_ViralProj17181_genomic.fna.gz
+a2e1b9686fcbdd4c4059c0ee4c03851a ./GCF_000869925.1_ViralProj17181_genomic.gbff.gz
+4276f72895f3436e6826424d1b908d20 ./GCF_000869925.1_ViralProj17181_genomic.gff.gz
+81499b53906a29cebea4e472e8ffe842 ./GCF_000869925.1_ViralProj17181_genomic.gtf.gz
+a3f486d02206a33e0d17f79d11807f0d ./GCF_000869925.1_ViralProj17181_protein.faa.gz
+7c30a6c03dbc7402ce0872afb0ec9e94 ./GCF_000869925.1_ViralProj17181_protein.gpff.gz
+cdbfa4db0d86580a730f0829b9ca2151 ./GCF_000869925.1_ViralProj17181_translated_cds.faa.gz
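
A checksum file with several entries, like the one above, has to be reduced to the single digest that belongs to the downloaded file. The sketch below shows one way this lookup could work; `parse_checksum_file` is a hypothetical helper and is not taken from the wrapper. It assumes `md5sum`-style lines (`<digest> <path>`) and keys entries by base name, so two entries with the same file name in different directories would collide.

```python
from pathlib import PurePosixPath


def parse_checksum_file(text):
    """Map file name -> digest for md5sum-style lines (hypothetical helper)."""
    digests = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        digest, path = line.split(maxsplit=1)
        # md5sum marks binary mode with a leading "*"; taking .name also
        # drops a leading "./" or any directory component.
        name = PurePosixPath(path.lstrip("*")).name
        digests[name] = digest.lower()
    return digests


if __name__ == "__main__":
    sample = (
        "04c1275ff9c9d0fb595b7482a1d54438 ./annotation_hashes.txt\n"
        "42aa93c5bfdba6ac09a4822a4407b572 "
        "./GCF_000869925.1_ViralProj17181_genomic.fna.gz\n"
    )
    table = parse_checksum_file(sample)
    print(table["GCF_000869925.1_ViralProj17181_genomic.fna.gz"])
```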
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+30004da6fc9f681d59c6c92cc99c9331622fb1f5 GCF_000869925.1_ViralProj17181_genomic.fna.gz
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+ac2d83823e2adc6b7b38e8dda0b7ff9c2536e62d96dec77e68cf0147 GCF_000869925.1_ViralProj17181_genomic.fna.gz
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+337dad2a0047dde05c24d5ae83fe175f762212e2e50a9494e54f43f9ebd508bd GCF_000869925.1_ViralProj17181_genomic.fna.gz
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+0171910ac0f8c881e24ac5054c734eb295fe73c3a6ad0857eab9349446949a96c45095241ae8d63f25c16a4c1e37c30a GCF_000869925.1_ViralProj17181_genomic.fna.gz
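
To regenerate or sanity-check digests such as those in the test files above, outside of `aria2c`, the Python standard library is sufficient. This is an independent sketch and does not reflect how `aria2c` or the wrapper computes checksums; `file_digest` and `HASHLIB_NAMES` are hypothetical names, and the mapping from the `type` values in `meta.yaml` to `hashlib` algorithm names is an assumption of this example.

```python
import hashlib
import zlib

# Assumed mapping from the wrapper's `type` values to hashlib algorithm names.
HASHLIB_NAMES = {
    "md5": "md5",
    "sha-1": "sha1",
    "sha-224": "sha224",
    "sha-256": "sha256",
    "sha-384": "sha384",
    "sha-512": "sha512",
}


def file_digest(path, hash_type, chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks."""
    if hash_type == "adler32":
        # Adler-32 is not part of hashlib; zlib provides it. The checksum
        # starts at 1 and is updated incrementally per chunk.
        value = 1
        with open(path, "rb") as handle:
            while chunk := handle.read(chunk_size):
                value = zlib.adler32(chunk, value)
        return f"{value:08x}"
    hasher = hashlib.new(HASHLIB_NAMES[hash_type])
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            hasher.update(chunk)
    return hasher.hexdigest()
```

Comparing `file_digest(path, "sha-256")` against the digest stored in the matching checksum file gives a quick check that a test fixture is still valid.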
