hmmalign with --outformat a2m discards padding #345

@althonos

Tested with HMMER 3.4.

Using hmmalign with --outformat a2m discards the padding from the FASTA-style output, so the records come out with different lengths and the file cannot be loaded back as an alignment. Example with the given DNA sequences and HMM:

hmmalign --outformat a2m 16S.hmm 16S.fna

Output:

>NC_016617:538069..539152
ttgaac----CTGAGAGTTTGATCCTGGCTCAGAACGAACGCTGGCGGCATGCCTAACAC
ATGCAAGTCGAACGAGG-----GCTTCGGC------CCTAGTGGCGCACGGGTGAGTAAC
ACGTGGGA-ACCTGCCTTTCGGTTCGGGATAACGTCTGGAAACGGACGCTAACACCGGAT
ACG-------TCCTTCGGGA-------GAAAGTT-----TACGCCGAGAGAGGGGCCCGC
GTCCGATTAGGTAGTTGGTGGGGTAATGGCCCACCAAGCCGACGATCGGTAGCTGGTCTG
AGAGGATGATCAGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAG
TGGGGAATATTGGACAATGGGGGCAACCCTGATCCAGCAATGCCGCGTGAGTGATGAAGG
CCTTAGGGTTGTAAAGCTCTTTCGCACGCGACGATG-A-----------------TGACG
GTAGCGTGAGAAGAAGCCCCGGCTAACTTCGTGCCAGCAGCCGCGGTAATACGAAGGGGG
CGAGCGTTGTTCGGAATTACTGGGCGTAAAGGGCGCGTAGGCGGCCTGTTTAGTCAGAAG
TGAAAGCCCCGGGCTTAACCTGGGAACGGCTTTTGATACTGGCAGGCTTGAGTTCCGGAG
AGGATGGTGGAATTCCCAGTGTAGAGGTGAAATTCGTAGATATTCGGAAGAACACCGGTG
GCGAAGGCGGCCATCTGGACGGACACCTACGACGAGGCGCGAAAGCGTGGGGAGCAAACA
GGATTAGATACCCTGCTAGTGCCCGCGGTAAACGATTAATCCTAAACGATGGGGTGC-AT
GCACTTCGGTGTGGACATTAACCCAGTAAGCATTCCGCCTGGGGAGTACGGCCGCAAGGT
TCAAACTCAAATGAATTGACGGGGGCACGCACAAGCGATCGAGCATGAGGTCTAAGTGGA
AGCAACGTGCAGAACCATACCAACCCAGGACACGCACGAcaCCGGCACCAGAGAGGGAGG
GAaCAGCTCCGGAGGGAGGAACACAGGCGCCCCATGACGGTCGACAGCTGGTGTGGCGAG
AGTCAGGGATAAGTCGAGCATGCAGCCCAAACCCTTCAGCCAGTTTCCATCC--------
------------------------------------------------------------
------------------------------------------------------------
------------------------------------------------------------
------------------------------------------------------------
------------------------------------------------------------
------------------------------------------------------------
------------------------------------------tgcgg
>NC_016594:913476..914972
ttgaac----CTGAGAGTTTGATCCTGGCTCAGAACGAACGCTGGCGGCATGCCTAACAC
ATGCAAGTCGAACGAAG-----GCTTCGGC------CTTAGTGGCGCACGGGTGAGTAAC
ACGTGGGA-ACCTGCCTTTCGGTTCGGGATAACGTCTGGAAACGGACGCTAACACCGGAT
ACG-------TCCTTCGGGA-------GAAAGTT-----TACGCCGAGAGAGGGGCCCGC
GTCCGATTAGGTAGTTGGTGGGGTAATGGCCCACCAAGCCGACGATCGGTAGCTGGTCTG
AGAGGATGATCAGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAG
TGGGGAATATTGGACAATGGGGGCAACCCTGATCCAGCAATGCCGCGTGAGTGATGAAGG
CCTTAGGGTTGTAAAGCTCTTTCGCACGCGACGATG-A-----------------TGACG
GTAGCGTGAGAAGAAGCCCCGGCTAACTTCGTGCCAGCAGCCGCGGTAATACGAAGGGGG
CGAGCGTTGTTCGGAATTACTGGGCGTAAAGGGCGCGTAGGCGGCCTGTTTAGTCAGAAG
TGAAAGCCCCGGGCTTAACCTGGGAACGGCTTTTGATACTGGCAGGCTTGAGTTCCGGAG
AGGATGGTGGAATTCCCAGTGTAGAGGTGAAATTCGTAGATATTGGGAAGAACACCGGTG
GCGAAGGCGGCCATCTGGACGGACACTGACGCTGAGGCGCGAAAGCGTGGGGAGCAAACA
GGATTAGATACCCTGGTAGTCCACGCCGTAAACGATGAATGCTAGACGCTGGGGTGC-AT
GCACTTCGGTGTCGCCGCTAACGCATTAAGCATTCCGCCTGGGGAGTACGGCCGCAAGGT
TAAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTAATTCGA
AGCAACGCGCAGAACCTTACCAACCCTTGACATGTCCACcaCCGGCTCCAGAGATGGAGC
TTTCAGTTcGGCTGGGTGGAACACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTCGTGAG
ATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCCTACCGCCAGTTGCCATCAT-TC-AGT
TGGGCACTCTGGTGGAACTGCCGGTGACAAGCCGGAGGAAGGCGGGGATGACGTCAAGTC
CTCATGGCCCTTATGGGTTGGGCTACACACGTGCTACAATGGCGGTGACAGTGGGATGCG
AAGTCGCAAGATGGAGCCAATCCCC-AAAAGCCGTCTCAGTTCGGATTGCACTCTGCAAC
TCGGGTGCATGAAGTTGGAATCGCTAGTAATCGCGGATCAGCACGCCGCGGTGAATACGT
TCCCGGGCCTTGTACACACCGCCCGTCACACCATGGGAGTTGGCTTTACCCGAAGGTGGT
GCGCTAACCcGCAAgGGAGGCAGCCAACCACGGTCAGGTCAGCGACTGGGGTGAAGTCGT
AACAAGGTAGCCGTAGGGGAACCTGCGGCTGGATCACCTCCTT-tctaaggaaa
>NC_016594:1099329..1100825
tgaac----CTGAGAGTTTGATCCTGGCTCAGAACGAACGCTGGCGGCATGCCTAACACA
TGCAAGTCGAACGAGG-----GCTTCGGC------CCTAGTGGCGCACGGGTGAGTAACA
CGTGGGA-ACCTGCCTTTCGGTTCGGGATAACGTCTGGAAACGGACGCTAACACCGGATA
CG-------TCCTTCGGGA-------GAAAGTT-----TACGCCGAGAGAGGGGCCCGCG
TCCGATTAGGTAGTTGGTGGGGTAATGGCCCACCAAGCCGACGATCGGTAGCTGGTCTGA
GAGGATGATCAGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGT
GGGGAATATTGGACAATGGGGGCAACCCTGATCCAGCAATGCCGCGTGAGTGATGAAGGC
CTTAGGGTTGTAAAGCTCTTTCGCACGCGACGATG-A-----------------TGACGG
TAGCGTGAGAAGAAGCCCCGGCTAACTTCGTGCCAGCAGCCGCGGTAATACGAAGGGGGC
GAGCGTTGTTCGGAATTACTGGGCGTAAAGGGCGCGTAGGCGGCCTGTTTAGTCAGAAGT
GAAAGCCCCGGGCTTAACCTGGGAACGGCTTTTGATACTGGCAGGCTTGAGTTCCGGAGA
GGATGGTGGAATTCCCAGTGTAGAGGTGAAATTCGTAGATATTGGGAAGAACACCGGTGG
CGAAGGCGGCCATCTGGACGGACACTGACGCTGAGGCGCGAAAGCGTGGGGAGCAAACAG
GATTAGATACCCTGGTAGTCCACGCCGTAAACGATGAATGCTAGACGCTGGGGTGC-ATG
CACTTCGGTGTCGCCGCTAACGCATTAAGCATTCCGCCTGGGGAGTACGGCCGCAAGGTT
AAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTAATTCGAA
GCAACGCGCAGAACCTTACCAACCCTTGACATGTCCACcaCCGGCTCCAGAGATGGAGCT
TTCAGTTcGGCTGGGTGGAACACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTCGTGAGA
TGTTGGGTTAAGTCCCGCAACGAGCGCAACCCCTACCGCCAGTTGCCATCAT-TC-AGTT
GGGCACTCTGGTGGAACTGCCGGTGACAAGCCGGAGGAAGGCGGGGATGACGTCAAGTCC
TCATGGCCCTTATGGGTTGGGCTACACACGTGCTACAATGGCGGTGACAGTGGGACGCGA
AGTCGCAAGATGGAGCCAATCCCC-AAAAGCCGTCTCAGTTCGGATTGCACTCTGCAACT
CGGGTGCATGAAGTTGGAATCGCTAGTAATCGCGGATCAGCACGCCGCGGTGAATACGTT
CCCGGGCCTTGTACACACCGCCCGTCACACCATGGGAGTTGGCTTTACCCGAAGGTGGTG
CGCTAACCgGCAAcGGAGGCAGCCAACCACGGTCAGGTCAGCGACTGGGGTGAAGTCGTA
ACAAGGTAGCCGTAGGGGAACCTGCGGCTGGATCACCTCCTT-tctaaggaaaa
>NC_016594:1578551..1580047
tgaac----CTGAGAGTTTGATCCTGGCTCAGAACGAACGCTGGCGGCATGCCTAACACA
TGCAAGTCGAACGAAG-----GCTTCGGC------CTTAGTGGCGCACGGGTGAGTAACA
CGTGGGA-ACCTGCCTTTCGGTTCGGGATAACGTCTGGAAACGGACGCTAACACCGGATA
CG-------TCCTTCGGGA-------GAAAGTT-----TACGCCGAGAGAGGGGCCCGCG
TCCGATTAGGTAGTTGGTGGGGTAATGGCCCACCAAGCCGACGATCGGTAGCTGGTCTGA
GAGGATGATCAGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGT
GGGGAATATTGGACAATGGGGGCAACCCTGATCCAGCAATGCCGCGTGAGTGATGAAGGC
CTTAGGGTTGTAAAGCTCTTTCGCACGCGACGATG-A-----------------TGACGG
TAGCGTGAGAAGAAGCCCCGGCTAACTTCGTGCCAGCAGCCGCGGTAATACGAAGGGGGC
GAGCGTTGTTCGGAATTACTGGGCGTAAAGGGCGCGTAGGCGGCCTGTTTAGTCAGAAGT
GAAAGCCCCGGGCTTAACCTGGGAACGGCTTTTGATACTGGCAGGCTTGAGTTCCGGAGA
GGATGGTGGAATTCCCAGTGTAGAGGTGAAATTCGTAGATATTGGGAAGAACACCGGTGG
CGAAGGCGGCCATCTGGACGGACACTGACGCTGAGGCGCGAAAGCGTGGGGAGCAAACAG
GATTAGATACCCTGGTAGTCCACGCCGTAAACGATGAATGCTAGACGCTGGGGTGC-ATG
CACTTCGGTGTCGCCGCTAACGCATTAAGCATTCCGCCTGGGGAGTACGGCCGCAAGGTT
AAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTAATTCGAA
GCAACGCGCAGAACCTTACCAACCCTTGACATGTCCACtaTCGGCTCGAGAGATCGGGCT
TTCAGTTcGGCTGGGTGGAACACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTCGTGAGA
TGTTGGGTTAAGTCCCGCAACGAGCGCAACCCCTACCGCCAGTTGCCATCAT-TC-AGTT
GGGCACTCTGGTGGAACTGCCGGTGACAAGCCGGAGGAAGGCGGGGATGACGTCAAGTCC
TCATGGCCCTTATGGGTTGGGCTACACACGTGCTACAATGGCGGTGACAGTGGGATGCGA
AGTCGCAAGATGGAGCCAATCCCC-AAAAGCCGTCTCAGTTCGGATTGCACTCTGCAACT
CGGGTGCATGAAGTTGGAATCGCTAGTAATCGCGGATCAGCACGCCGCGGTGAATACGTT
CCCGGGCCTTGTACACACCGCCCGTCACACCATGGGAGTTGGCTTTACCCGAAGGTGGTG
CGCTAACCgGCAAcGGAGGCAGCCAACCACGGTCAGGTCAGCGACTGGGGTGAAGTCGTA
ACAAGGTACCCGTAGGGGAACCTGCGGCTGGATCACCTCCTT-tctaaggaaaa
>NC_016618:450802..452299
tgaac----CTGAGAGTTTGATCCTGGCTCAGAACGAACGCTGGCGGCATGCCTAACACA
TGCAAGTCGAACGAAG-----GCTTCGGC------CTTAGTGGCGCACGGGTGAGTAACA
CGTGGGA-ACCTGCCTTTCGGTTCGGGATAACGTCTGGAAACGGACGCTAACACCGGATA
CG-------TCCTTCGGGA-------GAAAGTT-----TACGCCGAGAGAGGGGCCCGCG
TCCGATTAGGTAGTTGGTGGGGTAATGGCCCACCAAGCCGACGATCGGTAGCTGGTCTGA
GAGGATGATCAGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGT
GGGGAATATTGGACAATGGGGGCAACCCTGATCCAGCAATGCCGCGTGAGTGATGAAGGC
CTTAGGGTTGTAAAGCTCTTTCGCACGCGACGATG-A-----------------TGACGG
TAGCGTGAGAAGAAGCCCCGGCTAACTTCGTGCCAGCAGCCGCGGTAATACGAAGGGGGC
GAGCGTTGTTCGGAATTACTGGGCGTAAAGGGCGCGTAGGCGGCCTGTTTAGTCAGAAGT
GAAAGCCCCGGGCTTAACCTGGGAACGGCTTTTGATACTGGCAGGCTTGAGTTCCGGAGA
GGATGGTGGAATTCCCAGTGTAGAGGTGAAATTCGTAGATATTGGGAAGAACACCGGTGG
CGAAGGCGGCCATCTGGACGGACACTGACGCTGAGGCGCGAAAGCGTGGGGAGCAAACAG
GATTAGATACCCTGGTAGTCCACGCCGTAAACGATGAATGCTAGACGCTGGGGTGC-ATG
CACTTCGGTGTCGCCGCTAACGCATTAAGCATTCCGCCTGGGGAGTACGGCCGCAAGGTT
AAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTAATTCGAA
GCAACGCGCAGAACCTTACCAACCCTTGACATGTCCACcaCCGGCTCGAGAGATCGGGCT
TTCAGTTcGGCTGGGTGGAACACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTCGTGAGA
TGTTGGGTTAAGTCCCGCAACGAGCGCAACCCCTACCGCCAGTTGCCATCAT-TC-AGTT
GGGCACTCTGGTGGAACTGCCGGTGACAAGCCGGAGGAAGGCGGGGATGACGTCAAGTCC
TCATGGCCCTTATGGGTTGGGCTACACACGTGCTACAATGGCGGTGACAGTGGGACGCGA
AGTCGCAAGATGGAGCCAATCCCC-AAAAGCCGTCTCAGTTCGGATTGCACTCTGCAACT
CGGGTGCATGAAGTTGGAATCGCTAGTAATCGCGGATCAGCACGCCGCGGTGAATACGTT
CCCGGGCCTTGTACACACCGCCCGTCACACCATGGGAGTTGGCTTTACCCGAAGGTGGTG
CGCTAACCcgGCAAgGGAGGCAGCCCACCACGGTCAGGTCAGCGACTGGGGTGAAGTCGT
AACAAGGTAGCCGTAGGGGAACCTGCGGCTGGATCACCGCCTT-tttctaaagaa
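
One way to see the problem in output like the above is to compare, per record, the total length with the number of match columns. The following is only a minimal sketch: a2m_record_stats is a hypothetical helper (not part of HMMER), it assumes the usual A2M convention that uppercase residues and '-' occupy match columns while lowercase residues and '.' occupy insert columns, and it runs on toy data rather than on the sequences from this report.

```python
from typing import Dict, Iterable, Tuple


def a2m_record_stats(lines: Iterable[str]) -> Dict[str, Tuple[int, int]]:
    """Map each record name to (total length, number of match columns).

    Hypothetical helper for illustration; assumes record names are unique.
    """
    stats: Dict[str, Tuple[int, int]] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: start a new record.
            name = line[1:].strip()
            stats[name] = (0, 0)
        elif name is not None:
            total, match = stats[name]
            # Assumed A2M convention: uppercase and '-' are match columns,
            # lowercase and '.' are insert columns.
            match += sum(1 for c in line if c.isupper() or c == "-")
            stats[name] = (total + len(line), match)
    return stats


# Toy input (not the data from this report): both records cover the same
# three match columns, but only seq1 carries inserted (lowercase) residues.
toy = """>seq1
ACgtG
>seq2
A-G
""".splitlines()

stats = a2m_record_stats(toy)
for name, (total, match) in stats.items():
    print(name, total, match)
```

If the match-column counts agree while the total lengths differ, the records are consistent with each other but the insert columns are unpadded, which is the situation a parser expecting equal-length rows would reject.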

Test files:
