feat: add pytrf sub-commands as individual Snakemake-Wrappers #4745
Open: rohan-ibn-tariq wants to merge 31 commits into snakemake:master from rohan-ibn-tariq:feat/add-pytrf
Commits
b89759c feat/add_pytrf: add basic working code without complete wrapper just … (rohan-ibn-tariq)
f01dfd2 feat/add_pytrf: delete unified wrapper approach (rohan-ibn-tariq)
4e1e6bb feat/add_pytrf: add findstr basic wrapper (rohan-ibn-tariq)
70294eb feat/add_pytrf: add findgtr basic wrapper (rohan-ibn-tariq)
c2260fa feat/add_pytrf: add in meta discalaimer note (rohan-ibn-tariq)
cfefde7 feat/add_pytrf: fix output docs (rohan-ibn-tariq)
ba3ca6f feat/add_pytrf: add end line (rohan-ibn-tariq)
dcbefdf feat/add_pytrf: add pytrf subcommand findatr (rohan-ibn-tariq)
9daf5f1 feat/add_pytrf: black fmt for test_wrappers.py and basic pytrf tests … (rohan-ibn-tariq)
e4b5aa0 feat/add_pytrf: update with expected results test info and doc-comments (rohan-ibn-tariq)
4c87765 feat/add_pytrf: update with expected results test info and defaults test (rohan-ibn-tariq)
6e357ed feat/add_pytrf: finalize findgtr with expected results very basic min… (rohan-ibn-tariq)
fd0b7f4 feat/add_pytrf: add comparison for findgtr minimal (rohan-ibn-tariq)
0dd30f6 feat/add_pytrf: refactor doc (rohan-ibn-tariq)
0fe268f feat/add_pytrf: refactor doc (rohan-ibn-tariq)
86adfd6 feat/add_pytrf: add expected test for findatr + doc refactor (rohan-ibn-tariq)
653e9c5 feat/add_pytrf: remove python pins not required (rohan-ibn-tariq)
5b505fe feat/add_pytrf: fix extract test (rohan-ibn-tariq)
55b31be feat/add_pytrf: fix url and add additional note (rohan-ibn-tariq)
6cc90bd feat/add_pytrf: black fmt wrapper.py (rohan-ibn-tariq)
8c2eb7d feat/add_pytrf: snakefile fmt findatr findstr (rohan-ibn-tariq)
59e569f feat/add_pytrf: pylint fixes for pytrf findstr (rohan-ibn-tariq)
8bbe890 feat/add_pytrf: pylint fixes for pytrf findgtr (rohan-ibn-tariq)
b29a28a feat/add_pytrf: pylint fixes for pytrf findatr (rohan-ibn-tariq)
19530be feat/add_pytrf: add extract but test failing (rohan-ibn-tariq)
718846c feat/add_pytrf: add extract command issue in pytest skip and meta.yaml (rohan-ibn-tariq)
7b43c71 feat/add_pytrf: refactor meta.yaml's of 4 commands (rohan-ibn-tariq)
04b9263 feat/add_pytrf: refactor meta.yaml for findatr (rohan-ibn-tariq)
75421c4 feat/add_pytrf: pin envoirnments for four subcommands (rohan-ibn-tariq)
e93f085 feat/add-pytrf: merge branch master (rohan-ibn-tariq)
c0f2aaf feat/add-pytrf: refactor meta.yaml (rohan-ibn-tariq)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,37 @@ | ||
| # This file may be used to create an environment using: | ||
| # $ conda create --name <env> --file <this file> | ||
| # platform: linux-64 | ||
| # created-by: conda 25.11.0 | ||
| @EXPLICIT | ||
| https://conda.anaconda.org/conda-forge/linux-64/_libgcc_mutex-0.1-conda_forge.tar.bz2#d7c89558ba9fa0495403155b64376d81 | ||
| https://conda.anaconda.org/conda-forge/noarch/ca-certificates-2025.11.12-hbd8a1cb_0.conda#f0991f0f84902f6b6009b4d2350a83aa | ||
| https://conda.anaconda.org/conda-forge/linux-64/libgomp-15.2.0-he0feb66_14.conda#91349c276f84f590487e4c7f6e90e077 | ||
| https://conda.anaconda.org/conda-forge/noarch/python_abi-3.12-8_cp312.conda#c3efd25ac4d74b1584d2f7a57195ddf1 | ||
| https://conda.anaconda.org/conda-forge/noarch/tzdata-2025b-h78e105d_0.conda#4222072737ccff51314b5ece9c7d6f5a | ||
| https://conda.anaconda.org/conda-forge/linux-64/_openmp_mutex-4.5-2_gnu.tar.bz2#73aaf86a425cc6e73fcf236a5a46396d | ||
| https://conda.anaconda.org/conda-forge/linux-64/libgcc-15.2.0-he0feb66_14.conda#550dceb769d23bcf0e2f97fd4062d720 | ||
| https://conda.anaconda.org/conda-forge/linux-64/bzip2-1.0.8-hda65f42_8.conda#51a19bba1b8ebfb60df25cde030b7ebc | ||
| https://conda.anaconda.org/conda-forge/linux-64/libexpat-2.7.3-hecca717_0.conda#8b09ae86839581147ef2e5c5e229d164 | ||
| https://conda.anaconda.org/conda-forge/linux-64/libffi-3.5.2-h9ec8514_0.conda#35f29eec58405aaf55e01cb470d8c26a | ||
| https://conda.anaconda.org/conda-forge/linux-64/libgcc-ng-15.2.0-h69a702a_14.conda#6c13aaae36d7514f28bd5544da1a7bb8 | ||
| https://conda.anaconda.org/conda-forge/linux-64/liblzma-5.8.1-hb9d3cd8_2.conda#1a580f7796c7bf6393fddb8bbbde58dc | ||
| https://conda.anaconda.org/conda-forge/linux-64/libnsl-2.0.1-hb9d3cd8_1.conda#d864d34357c3b65a4b731f78c0801dc4 | ||
| https://conda.anaconda.org/conda-forge/linux-64/libstdcxx-15.2.0-h934c35e_14.conda#8e96fe9b17d5871b5cf9d312cab832f6 | ||
| https://conda.anaconda.org/conda-forge/linux-64/libuuid-2.41.2-he9a06e4_0.conda#80c07c68d2f6870250959dcc95b209d1 | ||
| https://conda.anaconda.org/conda-forge/linux-64/libzlib-1.3.1-hb9d3cd8_2.conda#edb0dca6bc32e4f4789199455a1dbeb8 | ||
| https://conda.anaconda.org/conda-forge/linux-64/ncurses-6.5-h2d0b736_3.conda#47e340acb35de30501a76c7c799c41d7 | ||
| https://conda.anaconda.org/conda-forge/linux-64/openssl-3.6.0-h26f9b46_0.conda#9ee58d5c534af06558933af3c845a780 | ||
| https://conda.anaconda.org/conda-forge/linux-64/libstdcxx-ng-15.2.0-hdf11a46_14.conda#9531f671a13eec0597941fa19e489b96 | ||
| https://conda.anaconda.org/conda-forge/linux-64/libxcrypt-4.4.36-hd590300_1.conda#5aa797f8787fe7a17d1b0821485b5adc | ||
| https://conda.anaconda.org/conda-forge/linux-64/readline-8.2-h8c095d6_2.conda#283b96675859b20a825f8fa30f311446 | ||
| https://conda.anaconda.org/conda-forge/linux-64/tk-8.6.13-noxft_ha0e22de_103.conda#86bc20552bf46075e3d92b67f089172d | ||
| https://conda.anaconda.org/conda-forge/linux-64/zstd-1.5.7-hb8e6e7a_2.conda#6432cb5d4ac0046c3ac0a8a0f95842f9 | ||
| https://conda.anaconda.org/conda-forge/linux-64/icu-75.1-he02047a_0.conda#8b189310083baabfb622af68fd9d3ae3 | ||
| https://conda.anaconda.org/conda-forge/linux-64/ld_impl_linux-64-2.45-default_hbd61a6d_104.conda#a6abd2796fc332536735f68ba23f7901 | ||
| https://conda.anaconda.org/conda-forge/linux-64/libsqlite-3.51.0-hee844dc_0.conda#729a572a3ebb8c43933b30edcc628ceb | ||
| https://conda.anaconda.org/conda-forge/linux-64/python-3.12.12-hd63d673_1_cpython.conda#5c00c8cea14ee8d02941cab9121dce41 | ||
| https://conda.anaconda.org/bioconda/linux-64/pyfastx-2.2.0-py312h4711d71_1.tar.bz2#0c029565f5abbf1c3349a4abc0b4c63c | ||
| https://conda.anaconda.org/bioconda/linux-64/pytrf-1.4.2-py312h0fa9677_0.tar.bz2#11c47fcb88ad7fe0ab94dcf11b8bebb9 | ||
| https://conda.anaconda.org/conda-forge/noarch/setuptools-80.9.0-pyhff2d567_0.conda#4de79c071274a53dcaf2a8c749d1499e | ||
| https://conda.anaconda.org/conda-forge/noarch/wheel-0.45.1-pyhd8ed1ab_1.conda#75cb7132eb58d97896e173ef12ac9986 | ||
| https://conda.anaconda.org/conda-forge/noarch/pip-25.3-pyh8b19718_0.conda#c55515ca43c6444d2572e0f0d93cb6b9 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,8 @@ | ||
| channels: | ||
| - conda-forge | ||
| - bioconda | ||
| - nodefaults | ||
| | ||
| dependencies: | ||
| - pytrf =1.4 | ||
| - pyfastx =2.2 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,28 @@ | ||
| name: pytrf extract (NOT WORKING, see notes below) | ||
| description: > | ||
|   Extract tandem repeat sequences with flanking regions from DNA sequences. | ||
|   Requires output from pytrf findstr, findgtr, or findatr as input. | ||
| url: https://pytrf.readthedocs.io/en/latest/usage.html#commandline-interface | ||
| authors: | ||
|   - Muhammad Rohan Ali Asmat | ||
| input: | ||
|   - FASTA or FASTQ file (supports gzip compression) | ||
| output: | ||
|   - Output file (default -> stdout, will be redirected to the log file). | ||
| params: | ||
|   repeat_file: > | ||
|     **Required.** Path to TSV or CSV file from pytrf findstr/findgtr/findatr. | ||
|   out_format: > | ||
|     Output format. Options: 'tsv' (default), 'csv', or 'fasta'. | ||
|     Note: Only extract command supports FASTA output. | ||
|   flank_length: > | ||
|     Length of flanking sequence (default: 100). | ||
| notes: > | ||
|   **Bioconda package:** https://bioconda.github.io/recipes/pytrf/README.html |nl| | ||
|   **GitHub repository:** https://github.com/lmdu/pytrf |nl| | ||
|   **License:** MIT License |nl| | ||
|   **Disclaimer:** This is a minimal implementation supporting basic functionality. | ||
|   pytrf is not a Python binding to TRF - it's an independent tool. |nl| | ||
|   **Known issue:** PyTRF 1.4.2 has a bug in the `extract` command (delimiter error). |nl| | ||
|   See: https://github.com/lmdu/pytrf/issues/6 |nl| | ||
|   This wrapper skips extract tests until upstream patch is released. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,17 @@ | ||
| # SAMPLE RULE: Extract tandem repeat sequences with flanking regions | ||
| # The pytrf extract wrapper requires output from findstr, findgtr, or findatr. | ||
| # | ||
| # Output: | ||
| # - If output file is specified, results are written to that file | ||
| # - If output is omitted, pytrf writes to stdout (redirected to log file) | ||
| rule pytrf_extract: | ||
|     input: | ||
|         "demo_data/{sample}.fasta", | ||
|     output: | ||
|         "results/{sample}_extract.tsv", | ||
|     params: | ||
|         repeat_file="demo_data/{sample}.tsv", | ||
|     log: | ||
|         "logs/{sample}.log", | ||
|     wrapper: | ||
|         "master/bio/pytrf/extract" |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| >seq1 | ||
| TCATCGGTCATCGGTCATCGGTCATCGGTCATCGG | ||
| >seq2 | ||
| ACCCCTCAGGGTACCCCTCAGGGTACCCCTCAGGGTACCCCTCAGGGTACCCCTCAGGGTACCCCTCAGGGTACCCCTCAGGGT | ||
| >seq3 | ||
| TGACTATATCCGCAAATGAAGGCTGTTCTCTGACATGACTATATCCGCAAATGAAGGCTGTTCTCTGACATGACTATATCCGCAAATGAAGGCTGTTCTCTGACATGACTATATCCGCAAATGAAGGCTGTTCTCTGACA |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,85 @@ | ||
| seq1 1 3 TCA 3 1 3 | ||
| seq1 4 6 TCG 3 1 3 | ||
| seq1 7 9 GTC 3 1 3 | ||
| seq1 10 12 ATC 3 1 3 | ||
| seq1 13 15 GGT 3 1 3 | ||
| seq1 16 18 CAT 3 1 3 | ||
| seq1 19 21 CGG 3 1 3 | ||
| seq1 22 24 TCA 3 1 3 | ||
| seq1 25 27 TCG 3 1 3 | ||
| seq1 28 30 GTC 3 1 3 | ||
| seq1 31 33 ATC 3 1 3 | ||
| seq1 34 36 GG 3 1 3 | ||
| seq2 1 3 ACC 3 1 3 | ||
| seq2 4 6 CCT 3 1 3 | ||
| seq2 7 9 CAG 3 1 3 | ||
| seq2 10 12 GGT 3 1 3 | ||
| seq2 13 15 ACC 3 1 3 | ||
| seq2 16 18 CCT 3 1 3 | ||
| seq2 19 21 CAG 3 1 3 | ||
| seq2 22 24 GGT 3 1 3 | ||
| seq2 25 27 ACC 3 1 3 | ||
| seq2 28 30 CCT 3 1 3 | ||
| seq2 31 33 CAG 3 1 3 | ||
| seq2 34 36 GGT 3 1 3 | ||
| seq2 37 39 ACC 3 1 3 | ||
| seq2 40 42 CCT 3 1 3 | ||
| seq2 43 45 CAG 3 1 3 | ||
| seq2 46 48 GGT 3 1 3 | ||
| seq2 49 51 ACC 3 1 3 | ||
| seq2 52 54 CCT 3 1 3 | ||
| seq2 55 57 CAG 3 1 3 | ||
| seq2 58 60 GGT 3 1 3 | ||
| seq2 61 63 ACC 3 1 3 | ||
| seq2 64 66 CCT 3 1 3 | ||
| seq2 67 69 CAG 3 1 3 | ||
| seq2 70 72 GGT 3 1 3 | ||
| seq2 73 75 ACC 3 1 3 | ||
| seq2 76 78 CCT 3 1 3 | ||
| seq2 79 81 CAG 3 1 3 | ||
| seq2 82 84 GGT 3 1 3 | ||
| seq3 1 3 TGA 3 1 3 | ||
| seq3 4 6 CTA 3 1 3 | ||
| seq3 7 9 TAT 3 1 3 | ||
| seq3 10 12 CCG 3 1 3 | ||
| seq3 13 15 CAA 3 1 3 | ||
| seq3 16 18 ATG 3 1 3 | ||
| seq3 19 21 AAG 3 1 3 | ||
| seq3 22 24 GCT 3 1 3 | ||
| seq3 25 27 GTT 3 1 3 | ||
| seq3 28 31 CT 2 2 4 | ||
| seq3 32 34 GAC 3 1 3 | ||
| seq3 35 37 ATG 3 1 3 | ||
| seq3 38 40 ACT 3 1 3 | ||
| seq3 41 44 AT 2 2 4 | ||
| seq3 45 47 CCG 3 1 3 | ||
| seq3 48 50 CAA 3 1 3 | ||
| seq3 51 53 ATG 3 1 3 | ||
| seq3 54 56 AAG 3 1 3 | ||
| seq3 57 59 GCT 3 1 3 | ||
| seq3 60 62 GTT 3 1 3 | ||
| seq3 63 66 CT 2 2 4 | ||
| seq3 67 69 GAC 3 1 3 | ||
| seq3 70 72 ATG 3 1 3 | ||
| seq3 73 75 ACT 3 1 3 | ||
| seq3 76 79 AT 2 2 4 | ||
| seq3 80 82 CCG 3 1 3 | ||
| seq3 83 85 CAA 3 1 3 | ||
| seq3 86 88 ATG 3 1 3 | ||
| seq3 89 91 AAG 3 1 3 | ||
| seq3 92 94 GCT 3 1 3 | ||
| seq3 95 97 GTT 3 1 3 | ||
| seq3 98 101 CT 2 2 4 | ||
| seq3 102 104 GAC 3 1 3 | ||
| seq3 105 107 ATG 3 1 3 | ||
| seq3 108 110 ACT 3 1 3 | ||
| seq3 111 114 AT 2 2 4 | ||
| seq3 115 117 CCG 3 1 3 | ||
| seq3 118 120 CAA 3 1 3 | ||
| seq3 121 123 ATG 3 1 3 | ||
| seq3 124 126 AAG 3 1 3 | ||
| seq3 127 129 GCT 3 1 3 | ||
| seq3 130 132 GTT 3 1 3 | ||
| seq3 133 136 CT 2 2 4 | ||
| seq3 137 139 GAC 3 1 3 | ||
| seq3 140 142 A 3 1 3 |
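The diff does not name the columns of this demo repeat file. The rows look like a seven-field layout (sequence name, start, end, motif, motif length, repeat count, total length), and that mapping is an assumption in the sketch below. The `RepeatRecord` class and `parse_repeat_line` function are hypothetical helpers for inspecting such a file, not part of the PR or of pytrf.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepeatRecord:
    """One row of the repeat table; field names are illustrative labels."""

    seq_name: str
    start: int
    end: int
    motif: str
    motif_length: int
    repeats: int
    length: int

    def span_is_consistent(self) -> bool:
        # Assumes 1-based inclusive coordinates, as the demo rows suggest:
        # end - start + 1 should then equal the reported length.
        return self.end - self.start + 1 == self.length


def parse_repeat_line(line: str) -> RepeatRecord:
    """Parse one tab- or space-separated row into a RepeatRecord."""
    name, start, end, motif, motif_len, repeats, length = line.split()
    return RepeatRecord(
        name, int(start), int(end), motif, int(motif_len), int(repeats), int(length)
    )


# Two rows copied from the demo file above.
rows = ["seq1 1 3 TCA 3 1 3", "seq3 28 31 CT 2 2 4"]
records = [parse_repeat_line(r) for r in rows]
```

A check like `span_is_consistent` could be run over the whole file before handing it to `pytrf extract` via `repeat_file`.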
Empty file.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,55 @@ | ||
| """ | ||
| Snakemake Wrapper for PyTRF extract | ||
| ------------------------------------------------------ | ||
| Extract tandem repeat sequences with flanking regions. | ||
| """ | ||
| | ||
| from pathlib import Path | ||
| from snakemake.shell import shell | ||
| | ||
| # Logging | ||
| log = snakemake.log_fmt_shell(stdout=True, stderr=True) | ||
| | ||
| # Get input file | ||
| try: | ||
|     input_file = Path(snakemake.input[0]).resolve() | ||
| except (IndexError, TypeError) as e: | ||
|     raise ValueError(f"Input specification error: {e}") from e | ||
| | ||
| # Get output file if specified | ||
| OUTPUT_FILE = None | ||
| if snakemake.output: | ||
|     OUTPUT_FILE = Path(snakemake.output[0]).resolve() | ||
| | ||
| # Get repeat_file (required) | ||
| try: | ||
|     if not hasattr(snakemake.params, "repeat_file"): | ||
|         raise ValueError("Parameter 'repeat_file' is required for extract") | ||
|     repeat_file = Path(snakemake.params.repeat_file).resolve() | ||
| except (AttributeError, ValueError) as e: | ||
|     raise RuntimeError(f"Parameter validation failed: {e}") from e | ||
| | ||
| # Build parameters | ||
| params = [f"-r {repeat_file}"] | ||
| | ||
| try: | ||
|     if hasattr(snakemake.params, "out_format"): | ||
|         params.append(f"-f {snakemake.params.out_format}") | ||
| | ||
|     if hasattr(snakemake.params, "flank_length"): | ||
|         params.append(f"-l {snakemake.params.flank_length}") | ||
| except (AttributeError, ValueError) as e: | ||
|     raise RuntimeError(f"Parameter processing failed: {e}") from e | ||
| | ||
| # Build command | ||
| CMD = f"pytrf extract {input_file}" | ||
| if params: | ||
|     CMD += " " + " ".join(params) | ||
| if OUTPUT_FILE: | ||
|     CMD += f" -o {OUTPUT_FILE}" | ||
| | ||
| # Execute | ||
| try: | ||
|     shell(f"{CMD} {log}") | ||
| except Exception as e: | ||
|     raise RuntimeError(f"PyTRF extract execution failed: {e}") from e |
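The wrapper interpolates paths into the shell string without quoting, so a path containing spaces or shell metacharacters could break the command. One possible alternative, sketched here under assumptions, is to assemble the arguments in a pure function and quote each token with `shlex.quote`. `build_extract_command` is a hypothetical helper that is not in the PR; the `-r`, `-f`, `-l`, and `-o` flags are the ones the wrapper above already passes.

```python
import shlex
from typing import Optional


def build_extract_command(
    input_file: str,
    repeat_file: str,
    output_file: Optional[str] = None,
    out_format: Optional[str] = None,
    flank_length: Optional[int] = None,
) -> str:
    """Return a shell-safe `pytrf extract` command line (hypothetical helper)."""
    args = ["pytrf", "extract", input_file, "-r", repeat_file]
    # Optional parameters are appended only when the caller supplied them,
    # so pytrf's own defaults apply otherwise.
    if out_format is not None:
        args += ["-f", str(out_format)]
    if flank_length is not None:
        args += ["-l", str(flank_length)]
    if output_file is not None:
        args += ["-o", output_file]
    # shlex.quote leaves plain tokens untouched and single-quotes the rest.
    return " ".join(shlex.quote(a) for a in args)


cmd = build_extract_command("demo data/a.fasta", "demo_data/a.tsv", flank_length=50)
```

In the wrapper, the optional values could be read with `snakemake.params.get("out_format")` and `snakemake.params.get("flank_length")`, and the returned string passed to `shell()` together with the log redirection.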
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,37 @@ | ||
| # This file may be used to create an environment using: | ||
| # $ conda create --name <env> --file <this file> | ||
| # platform: linux-64 | ||
| # created-by: conda 25.11.0 | ||
| @EXPLICIT | ||
| https://conda.anaconda.org/conda-forge/linux-64/_libgcc_mutex-0.1-conda_forge.tar.bz2#d7c89558ba9fa0495403155b64376d81 | ||
| https://conda.anaconda.org/conda-forge/noarch/ca-certificates-2025.11.12-hbd8a1cb_0.conda#f0991f0f84902f6b6009b4d2350a83aa | ||
| https://conda.anaconda.org/conda-forge/linux-64/libgomp-15.2.0-he0feb66_14.conda#91349c276f84f590487e4c7f6e90e077 | ||
| https://conda.anaconda.org/conda-forge/noarch/python_abi-3.12-8_cp312.conda#c3efd25ac4d74b1584d2f7a57195ddf1 | ||
| https://conda.anaconda.org/conda-forge/noarch/tzdata-2025b-h78e105d_0.conda#4222072737ccff51314b5ece9c7d6f5a | ||
| https://conda.anaconda.org/conda-forge/linux-64/_openmp_mutex-4.5-2_gnu.tar.bz2#73aaf86a425cc6e73fcf236a5a46396d | ||
| https://conda.anaconda.org/conda-forge/linux-64/libgcc-15.2.0-he0feb66_14.conda#550dceb769d23bcf0e2f97fd4062d720 | ||
| https://conda.anaconda.org/conda-forge/linux-64/bzip2-1.0.8-hda65f42_8.conda#51a19bba1b8ebfb60df25cde030b7ebc | ||
| https://conda.anaconda.org/conda-forge/linux-64/libexpat-2.7.3-hecca717_0.conda#8b09ae86839581147ef2e5c5e229d164 | ||
| https://conda.anaconda.org/conda-forge/linux-64/libffi-3.5.2-h9ec8514_0.conda#35f29eec58405aaf55e01cb470d8c26a | ||
| https://conda.anaconda.org/conda-forge/linux-64/libgcc-ng-15.2.0-h69a702a_14.conda#6c13aaae36d7514f28bd5544da1a7bb8 | ||
| https://conda.anaconda.org/conda-forge/linux-64/liblzma-5.8.1-hb9d3cd8_2.conda#1a580f7796c7bf6393fddb8bbbde58dc | ||
| https://conda.anaconda.org/conda-forge/linux-64/libnsl-2.0.1-hb9d3cd8_1.conda#d864d34357c3b65a4b731f78c0801dc4 | ||
| https://conda.anaconda.org/conda-forge/linux-64/libstdcxx-15.2.0-h934c35e_14.conda#8e96fe9b17d5871b5cf9d312cab832f6 | ||
| https://conda.anaconda.org/conda-forge/linux-64/libuuid-2.41.2-he9a06e4_0.conda#80c07c68d2f6870250959dcc95b209d1 | ||
| https://conda.anaconda.org/conda-forge/linux-64/libzlib-1.3.1-hb9d3cd8_2.conda#edb0dca6bc32e4f4789199455a1dbeb8 | ||
| https://conda.anaconda.org/conda-forge/linux-64/ncurses-6.5-h2d0b736_3.conda#47e340acb35de30501a76c7c799c41d7 | ||
| https://conda.anaconda.org/conda-forge/linux-64/openssl-3.6.0-h26f9b46_0.conda#9ee58d5c534af06558933af3c845a780 | ||
| https://conda.anaconda.org/conda-forge/linux-64/libstdcxx-ng-15.2.0-hdf11a46_14.conda#9531f671a13eec0597941fa19e489b96 | ||
| https://conda.anaconda.org/conda-forge/linux-64/libxcrypt-4.4.36-hd590300_1.conda#5aa797f8787fe7a17d1b0821485b5adc | ||
| https://conda.anaconda.org/conda-forge/linux-64/readline-8.2-h8c095d6_2.conda#283b96675859b20a825f8fa30f311446 | ||
| https://conda.anaconda.org/conda-forge/linux-64/tk-8.6.13-noxft_ha0e22de_103.conda#86bc20552bf46075e3d92b67f089172d | ||
| https://conda.anaconda.org/conda-forge/linux-64/zstd-1.5.7-hb8e6e7a_2.conda#6432cb5d4ac0046c3ac0a8a0f95842f9 | ||
| https://conda.anaconda.org/conda-forge/linux-64/icu-75.1-he02047a_0.conda#8b189310083baabfb622af68fd9d3ae3 | ||
| https://conda.anaconda.org/conda-forge/linux-64/ld_impl_linux-64-2.45-default_hbd61a6d_104.conda#a6abd2796fc332536735f68ba23f7901 | ||
| https://conda.anaconda.org/conda-forge/linux-64/libsqlite-3.51.0-hee844dc_0.conda#729a572a3ebb8c43933b30edcc628ceb | ||
| https://conda.anaconda.org/conda-forge/linux-64/python-3.12.12-hd63d673_1_cpython.conda#5c00c8cea14ee8d02941cab9121dce41 | ||
| https://conda.anaconda.org/bioconda/linux-64/pyfastx-2.2.0-py312h4711d71_1.tar.bz2#0c029565f5abbf1c3349a4abc0b4c63c | ||
| https://conda.anaconda.org/bioconda/linux-64/pytrf-1.4.2-py312h0fa9677_0.tar.bz2#11c47fcb88ad7fe0ab94dcf11b8bebb9 | ||
| https://conda.anaconda.org/conda-forge/noarch/setuptools-80.9.0-pyhff2d567_0.conda#4de79c071274a53dcaf2a8c749d1499e | ||
| https://conda.anaconda.org/conda-forge/noarch/wheel-0.45.1-pyhd8ed1ab_1.conda#75cb7132eb58d97896e173ef12ac9986 | ||
| https://conda.anaconda.org/conda-forge/noarch/pip-25.3-pyh8b19718_0.conda#c55515ca43c6444d2572e0f0d93cb6b9 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,8 @@ | ||
| channels: | ||
| - conda-forge | ||
| - bioconda | ||
| - nodefaults | ||
| | ||
| dependencies: | ||
| - pytrf =1.4 | ||
| - pyfastx =2.2 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,49 @@ | ||
| name: pytrf findatr | ||
| description: > | ||
|   Find approximate/imperfect tandem repeats from DNA sequences. | ||
| url: https://pytrf.readthedocs.io/en/latest/usage.html#commandline-interface | ||
| authors: | ||
|   - Muhammad Rohan Ali Asmat | ||
| input: | ||
|   - FASTA or FASTQ file (supports gzip compression) | ||
| output: | ||
|   - Output file (default -> stdout, will be redirected to the log file). | ||
| params: | ||
|   out_format: > | ||
|     Output format. Options: 'tsv' (default), 'csv', 'bed', or 'gff'. | ||
|   min_motif: > | ||
|     Minimum motif size in bp (default: 1). | ||
|   max_motif: > | ||
|     Maximum motif size in bp (default: 6). | ||
|   min_seedrep: > | ||
|     Minimum repeat number for seed (default: 3). | ||
|   min_seedlen: > | ||
|     Minimum length for seed (default: 10). | ||
|   max_errors: > | ||
|     Maximum number of continuous alignment errors (default: 3). | ||
|   min_identity: > | ||
|     Minimum identity for extending, 0 to 100 (default: 70). | ||
|   max_extend: > | ||
|     Maximum length allowed to extend (default: 2000). | ||
| notes: > | ||
|   **Output columns (TSV/CSV/BED/GFF):** sequence or chromosome name, start position, | ||
|   end position, motif sequence, motif length, repeat number, repeat length, seed start | ||
|   position, seed end position, seed repeat number, seed length, number of matches, | ||
|   number of substitutions, number of insertions, number of deletions, extend alignment | ||
|   identity between imperfect repeat and its perfect counterpart. | ||
|   |nl| |nl| | ||
|   **Example:** |nl| | ||
|   Example row in record: 0 1 32 T 1 32.0 32 1 1 1 1 10 22 0 0 31.25 |nl| | ||
|   This indicates that in sequence '0', from position 1 to 32, there is a tandem repeat | ||
|   with motif 'T' (length 1) repeated 32 times, resulting in a repeat length of 32 bp. | ||
|   The seed repeat started at position 1 and ended at position 1, with a seed repeat number of 1 | ||
|   and seed length of 1 bp. The alignment of the imperfect repeat to its perfect counterpart | ||
|   has 10 matches, 22 substitutions, 0 insertions, and 0 deletions, yielding an identity of 31.25%. | ||
|   |nl| | ||
|   |nl| | ||
|   **Bioconda package:** https://bioconda.github.io/recipes/pytrf/README.html |nl| | ||
|   **GitHub repository:** https://github.com/lmdu/pytrf |nl| | ||
|   **License:** MIT License |nl| | ||
|   **Disclaimer:** This is a minimal implementation supporting basic functionality. | ||
|   pytrf is not a Python binding to TRF - it's an independent tool. | ||
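The sixteen output columns listed in the notes can be mapped onto a row with a few lines of Python. This is a rough sketch: the column labels are illustrative names chosen here, and the identity formula (matches as a percentage of all aligned positions) is an assumption that happens to reproduce the 31.25 in the example row; it has not been checked against pytrf's source.

```python
# Illustrative labels for the 16 fields listed in the notes (not pytrf's own names).
FINDATR_COLUMNS = [
    "seq_name", "start", "end", "motif", "motif_length", "repeats", "length",
    "seed_start", "seed_end", "seed_repeats", "seed_length",
    "matches", "substitutions", "insertions", "deletions", "identity",
]


def parse_findatr_row(line: str) -> dict:
    """Map one whitespace-separated findatr row onto the labels above."""
    values = line.split()
    if len(values) != len(FINDATR_COLUMNS):
        raise ValueError(f"expected {len(FINDATR_COLUMNS)} fields, got {len(values)}")
    row = dict(zip(FINDATR_COLUMNS, values))
    for key in FINDATR_COLUMNS:
        if key in ("seq_name", "motif"):
            continue  # keep text fields as strings
        # repeats and identity are printed as decimals (e.g. 32.0, 31.25)
        row[key] = float(row[key]) if key in ("repeats", "identity") else int(row[key])
    return row


def alignment_identity(row: dict) -> float:
    """Assumed formula: matches as a percentage of all aligned positions."""
    aligned = (
        row["matches"] + row["substitutions"] + row["insertions"] + row["deletions"]
    )
    return 100.0 * row["matches"] / aligned


# The example row quoted in the notes above.
example = parse_findatr_row("0 1 32 T 1 32.0 32 1 1 1 1 10 22 0 0 31.25")
```

Comparing `alignment_identity(example)` with `example["identity"]` gives a quick sanity check on a parsed row.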
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,20 @@ | ||
| # SAMPLE RULE: Find approximate/imperfect tandem repeats | ||
| # | ||
| # Output: | ||
| # - If output file is specified, results are written to that file | ||
| # - If output is omitted, PyTRF writes to stdout (redirected to log file) | ||
| # | ||
| # This example searches for approximate repeats with motif sizes between 3-10 bp, | ||
| # allowing detection of imperfect short/medium tandem repeats with mismatches. | ||
| rule pytrf_findatr: | ||
|     input: | ||
|         "demo_data/{sample}.fasta", | ||
|     output: | ||
|         "results/{sample}.tsv", | ||
|     log: | ||
|         "logs/{sample}.log", | ||
|     params: | ||
|         min_motif=3, | ||
|         max_motif=10, | ||
|     wrapper: | ||
|         "master/bio/pytrf/findatr" |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,2 @@ | ||
| >seq1 | ||
| TCATCGGTCATCGGTCATCGGTCATCGGTCATCGG |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| seq1 1 35 TCATCGG 7 5.0 35 1 35 5 35 35 0 0 0 100.0 |
Are these parameters mandatory?
Flank length and output format are not mandatory.