Filter MAT before chronumental #9

@Haddox

It looks like errors in estimated dates are leading to errors in mutation counts over time. This appears to be skewing some fitness trajectories.

A few things that could be leading to errors in estimated dates include:

  • leaves that only have an estimated year (e.g., 2020)
  • leaves that have many mutations on the branch leading to them (e.g., from chronic infection)

What if we remove these leaves from the MAT before estimating dates with chronumental? My impression is that this would remove only a small fraction of sequences. The filter would remove leaves that:

  • only have an estimated year (e.g., 2020) and no month or day information
  • have more than four nucleotide mutations on the branch leading to them
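
To make the two criteria concrete, here is a minimal sketch of the filter in Python. It is only an illustration: the `Leaf` record, its field names, and the helper functions are hypothetical, and the sketch assumes that a year-only date appears as a bare four-digit string such as `2020`. It does not read the MAT or call chronumental; it only shows which leaves the proposed rules would flag.

```python
import re
from dataclasses import dataclass


# Hypothetical record for one leaf. In practice these values would come from
# the MAT and its metadata, which this sketch does not parse.
@dataclass
class Leaf:
    name: str
    date: str              # e.g. "2020", "2020-03", or "2020-03-15"
    branch_mutations: int  # nucleotide mutations on the branch leading to the leaf


# Assumption: a year-only date is written as exactly four digits.
YEAR_ONLY = re.compile(r"^\d{4}$")


def is_year_only(date: str) -> bool:
    """Return True if the date has a year but no month or day information."""
    return bool(YEAR_ONLY.match(date.strip()))


def leaves_to_remove(leaves, max_branch_mutations=4):
    """Return the names of leaves that meet either filtering criterion."""
    return [
        leaf.name
        for leaf in leaves
        if is_year_only(leaf.date) or leaf.branch_mutations > max_branch_mutations
    ]


# Made-up example leaves, for illustration only.
leaves = [
    Leaf("A", "2020", 1),        # year only: removed
    Leaf("B", "2020-03-15", 2),  # full date, short branch: kept
    Leaf("C", "2021-07", 7),     # more than four mutations: removed
    Leaf("D", "2021-07-01", 4),  # exactly four mutations: kept
]
print(leaves_to_remove(leaves))  # ['A', 'C']
```

The threshold is a parameter so that the cutoff of four mutations can be adjusted if it turns out to remove too many or too few leaves.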
