| 1 | +**Core Infrastructure** |
| 2 | +Parsing bioinformatics file types, using async where possible
| 3 | +- [x] GenBank parser
| 4 | +- [x] EMBL parser
| 5 | +- [x] Conversion of formats into gbk/embl/fasta/ffn/faa, and saving as gff, gbk or embl
| 6 | +- [x] CI/CD with GitHub Actions, plus tests (ongoing)
| 7 | +- [x] Documentation with examples (ongoing)
| 8 | + |
| 9 | +**Format expansion** |
| 10 | +Integration of common types and parsers such as: |
| 11 | +- [ ] VCF |
| 12 | +- [ ] BLAST output in various modes: `-outfmt 5` (XML) and `-outfmt 6` (tabular, one line per hit), compatible with Diamond and MMSeqs2
| 13 | +- [ ] GTF |
| 14 | +- [ ] Interoperability with formats from **rust-bio** and **noodles-vcf**
| 15 | +- [ ] Look into metabolomics options (such as KEGG); some are not open source
| 16 | +- [ ] Parsing of gzipped FASTQ files
| 17 | +- [ ] SAM format parser |
| 18 | +- [ ] RPKM output parser |
| 19 | +- [ ] Support for compressed alignment formats such as BAM and CRAM
| 20 | +- [ ] Writer support for these formats
| 21 | + |
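As a rough illustration of the planned one-line-per-hit BLAST parser, the sketch below reads a single line of tabular output, assuming the default 12 columns (`qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore`). The names `BlastHit` and `parse_blast_tab_line` are hypothetical and are not part of this crate's API; custom column layouts are not handled.

```rust
/// One hit from BLAST tabular output, assuming the default 12 columns.
/// Hypothetical type for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub struct BlastHit {
    pub qseqid: String,
    pub sseqid: String,
    pub pident: f64,
    pub length: u64,
    pub mismatch: u64,
    pub gapopen: u64,
    pub qstart: u64,
    pub qend: u64,
    pub sstart: u64,
    pub send: u64,
    pub evalue: f64,
    pub bitscore: f64,
}

/// Parse one tab-separated line into a `BlastHit`.
/// Returns an error string if a field is missing or not numeric.
pub fn parse_blast_tab_line(line: &str) -> Result<BlastHit, String> {
    // Small helper that parses a numeric field and names it in the error.
    fn num<T: std::str::FromStr>(s: &str, name: &str) -> Result<T, String> {
        s.trim()
            .parse::<T>()
            .map_err(|_| format!("invalid {name}: {s:?}"))
    }

    let f: Vec<&str> = line.trim_end().split('\t').collect();
    if f.len() < 12 {
        return Err(format!(
            "expected 12 tab-separated fields, found {}",
            f.len()
        ));
    }

    Ok(BlastHit {
        qseqid: f[0].to_string(),
        sseqid: f[1].to_string(),
        pident: num(f[2], "pident")?,
        length: num(f[3], "length")?,
        mismatch: num(f[4], "mismatch")?,
        gapopen: num(f[5], "gapopen")?,
        qstart: num(f[6], "qstart")?,
        qend: num(f[7], "qend")?,
        sstart: num(f[8], "sstart")?,
        send: num(f[9], "send")?,
        evalue: num(f[10], "evalue")?,
        bitscore: num(f[11], "bitscore")?,
    })
}
```

Returning a `Result` per line lets a caller decide whether to skip malformed lines or abort, which may matter when mixing output from different tools.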
| 22 | +**Advanced features** |
| 23 | +Protein parameters such as: |
| 24 | +- [x] Hydrophobicity |
| 25 | +- [x] Molecular weight |
| 26 | +- [ ] Amino acid counts and percentages
| 27 | +- [x] Aromaticity |
| 28 | +- [ ] Parsing phylogenetic trees |
| 29 | +- [ ] Parsing multiple sequence alignments |
| 30 | +- [ ] Methylation data parsing |
| 31 | +- [ ] Consider pyo3 integration |
| 32 | + |
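To make the protein parameters above concrete, here is a minimal sketch of amino acid counts, percentages, and aromaticity, taking aromaticity as the fraction of residues that are F, W, or Y. The function names are hypothetical and this is not the crate's existing implementation; it assumes a plain one-letter protein string and ignores non-alphabetic characters such as `*` or `-`.

```rust
use std::collections::BTreeMap;

/// Count each residue (case-insensitive), skipping non-alphabetic characters.
/// Hypothetical helper for illustration only.
pub fn amino_acid_counts(seq: &str) -> BTreeMap<char, usize> {
    let mut counts = BTreeMap::new();
    for c in seq.chars().filter(|c| c.is_ascii_alphabetic()) {
        *counts.entry(c.to_ascii_uppercase()).or_insert(0) += 1;
    }
    counts
}

/// Residue composition as a percentage of all counted residues.
/// An empty sequence yields an empty map, so no division by zero occurs.
pub fn amino_acid_percentages(seq: &str) -> BTreeMap<char, f64> {
    let counts = amino_acid_counts(seq);
    let total: usize = counts.values().sum();
    counts
        .into_iter()
        .map(|(aa, n)| (aa, 100.0 * n as f64 / total as f64))
        .collect()
}

/// Fraction of residues that are aromatic (F, W or Y); 0.0 for empty input.
pub fn aromaticity(seq: &str) -> f64 {
    let counts = amino_acid_counts(seq);
    let total: usize = counts.values().sum();
    if total == 0 {
        return 0.0;
    }
    let aromatic: usize = ['F', 'W', 'Y']
        .iter()
        .map(|aa| counts.get(aa).copied().unwrap_or(0))
        .sum();
    aromatic as f64 / total as f64
}
```

A `BTreeMap` keeps residues in alphabetical order, which gives stable output when the composition is printed or written to a file.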
| 33 | +**Testing - bioinformatics** |
| 34 | +- [ ] Testing and updating parsers against edge-case files, such as gbk files with the CONTIG(JOIN) structure
| 35 | + |
| 36 | +**Data Viz** |
| 37 | +- [x] Example heatmap with WASM and JavaScript
| 38 | +- [ ] Simple graph types: bar, line, scatter