Clusty is slow to read long inputs (with named sequences) #2

Open
@apcamargo

I generated a large similarity table from ~1 million genomes using sourmash's branchwater and tried to cluster it with Clusty. After 6 hours Clusty still had not finished reading the input, whereas pyLeiden read the whole table within ~30 min.

Is there a reason Clusty is taking so much time to read the input?

Labels: enhancement (New feature or request)
