Tools to filter, split, subset, and otherwise manipulate VCF files
vcftoolbox
@GINAMO-EBVs thanks a lot! Have you seen that bcftools is already available in Galaxy? Is there any functionality missing? https://github.com/galaxyproject/tools-iuc/tree/main/tools/bcftools There are also a lot of other VCF-related tools already in the Galaxy Tool Shed.
Yoo-hoo! Hi Laura and Ginamo team! THANK you for initiating the PR! AFAIK you already checked existing tools dealing with VCF, but maybe we can dive into it together! Together with @PaulineSGN, we will test your tools and also investigate on our side the potential overlaps with existing tools!
Vcftoolbox
@bgruening I am aware that bcftools is already available on Galaxy. The tools I propose are easier to use and provide a summary of the different filters applied, in addition to retaining the input file name. I have made a few changes so that all the filtering tools are combined into a single tool. I have also removed one of the tools because it overlaps with one of the bcftools tools.
bgruening
left a comment
- please use required_files https://docs.galaxyproject.org/en/latest/dev/schema.html#tool-required-files-include
- please consider using macros. With macros you can save a lot of redundant code. E.g. move the citations, the requirements, etc. all to the macro. A few params can also be moved to macros.
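A minimal sketch of how these two suggestions might look together, assuming a shared `macros.xml` sitting next to the tool file. The tool id, tool name, macro names, token name, and the vcftools version are hypothetical and not taken from this PR; the DOI is that of the VCFtools paper (Danecek et al., 2011).

```xml
<!-- macros.xml (hypothetical file name): shared pieces used by every tool in the suite -->
<macros>
    <!-- assumed version, adjust to whatever the suite actually pins -->
    <token name="@VCFTOOLS_VERSION@">0.1.16</token>
    <xml name="requirements">
        <requirements>
            <requirement type="package" version="@VCFTOOLS_VERSION@">vcftools</requirement>
        </requirements>
    </xml>
    <xml name="citations">
        <citations>
            <citation type="doi">10.1093/bioinformatics/btr330</citation>
        </citations>
    </xml>
</macros>
```

```xml
<!-- tool wrapper (hypothetical id and name) that reuses the macros above -->
<tool id="vcf_keep_remove_ind" name="Keep or remove individuals" version="0.1.0">
    <macros>
        <import>macros.xml</import>
    </macros>
    <expand macro="requirements"/>
    <!-- only the listed files are shipped with the job -->
    <required_files>
        <include path="VCF_keep_remove_ind.R"/>
    </required_files>
    <!-- command, inputs, outputs, tests and help omitted from this sketch -->
    <expand macro="citations"/>
</tool>
```

With this layout, each additional tool in the suite only needs the `<import>` and the two `<expand>` lines instead of repeating the requirements and citation blocks.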
Co-authored-by: Björn Grüning <bjoern@gruenings.eu>
bgruening
left a comment
- please add <required_files>
- bcf is not supported as input?
- <required_files> add
- bcf not supported because vcftools is used for certain parts
- citation type=doi
- macros with citation, requirement, input
- change pattern discover_dataset
- update indexation
- remove count="1"
- correction help VCF_keep_remove
@GINAMO-EBVs not sure if this is relevant for you, but maybe something is useful in this restructured bash file:
@bgruening Thank you for your suggestion. I applied it to VCF_keep_remove.sh and partially to the other scripts.
bgruening
left a comment
Cool, glad it was useful, my bash is a bit rusty :)
<required_files>
    <include path="VCF_keep_remove_ind.R" />
    <include path="VCF_keep_remove_ind_v2.sh" />
the v2 is not called in this tool?
Sorry, that was left over from a backup that didn't work properly; there isn't a v2.
Please look at this page for a summary of the failing test: https://github.com/galaxyecology/tools-ecology/actions/runs/23291952174?pr=226
bgruening
left a comment
@GINAMO-EBVs thanks a lot. I think those are my final comments. Thanks a lot.
Change name of tools adding population_genomics as prefix
Thank you!!!
VCF Toolbox is a suite of tools for filtering and subsetting VCF files used in population genomics. The toolbox allows users to filter SNPs by read depth, genotype quality, missing data, heterozygosity, and minor allele count, extract or remove individuals, split VCFs by population and perform subsampling.