Duplicated code is dangerous for many of the same reasons magic numbers are (see #5): if something has to change, every copy has to be found and changed, and a missed copy silently diverges from the rest.
Instead, the shared logic can be factored into a function that takes the varying values as parameters.
[Example](https://github.com/d1ngn1gefe1/momatools/blob/main/preproc/taxonomy_parser.py#L16-L56): the repeated blocks differ only in the filename.
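
A minimal sketch of the kind of refactor meant here. It does not reproduce the linked code: `read_labels`, `load_all`, and the one-label-per-line text format are hypothetical, chosen only to show a block that varies by filename being turned into a single parameterized function.

```python
from pathlib import Path


def read_labels(path):
    """Return the non-empty, stripped lines of a text file.

    Hypothetical stand-in for whatever parsing the repeated blocks share.
    """
    with Path(path).open(encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def load_all(directory, filenames):
    """Run the same parsing step on each file; only the filename varies."""
    return {name: read_labels(Path(directory) / name) for name in filenames}
```

With this shape, a call such as `load_all(some_dir, ["a.txt", "b.txt"])` (placeholder names) replaces one copied block per file, and a change to the parsing logic is made in exactly one place.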