|
WIP ... Some thoughts on the lower part of #1792 (comment): these are some standard use cases, and an attempt to produce each of them with ease. While thinking about these it struck me that the

Stabilizing selection

Directional selection
TODO: Discuss how to handle multiple traits here.

Truncation selection
TODO: Discuss how to handle multiple traits here - the above assumes a single trait.

Truncation selection (on a binary trait)
The same as directional selection above, but applied to 0/1 scores?

Direct effects on fitness (previous behavior)
TODO: This is just a special case of directional selection on a trait,

Neutral
Do nothing? |
|
Okay - by my reading, |
|
Okay, so we might also decide that DFE and DME (or whatever we are calling it) are the same thing, but I'm voting not to do this, because:
and so this points towards DFE being a sub-class of DME. |
|
Having consulted with @roshnipatel, I've gone ahead and done those last two things (we can un-do them, however). And, I've moved everything to |
|
Discussing with @gregorgorjanc today, we realized that in practice the "threshold" fitness function should take a fixed quantile rather than a fixed value (e.g., "fitness 1 to the top 20% of the population; fitness 0 to the rest"). |
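A minimal sketch of the quantile idea, in plain Python: rather than comparing each trait value to a fixed threshold, rank the population and give fitness 1 to the top fraction. The function name `truncation_fitness` and its arguments are hypothetical (not stdpopsim's API), and ties at the cutoff are simply broken by position in the list.

```python
def truncation_fitness(trait_values, top_fraction):
    """Return 0/1 fitnesses: 1 for the top `top_fraction` of individuals.

    Hypothetical helper for illustration only; not part of stdpopsim.
    """
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    n = len(trait_values)
    # Number of individuals selected; always keep at least one.
    n_selected = max(1, round(top_fraction * n))
    # Indices sorted from largest to smallest trait value.
    order = sorted(range(n), key=lambda i: trait_values[i], reverse=True)
    fitness = [0.0] * n
    for i in order[:n_selected]:
        fitness[i] = 1.0
    return fitness


traits = [0.3, 1.7, -0.2, 0.9, 2.4, 0.1, 1.1, -1.0, 0.5, 0.0]
fitness = truncation_fitness(traits, top_fraction=0.2)
```

Because the cutoff is recomputed from whatever trait values are passed in, the same fraction is selected even as the population mean shifts, which a fixed threshold would not guarantee.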
|
And consulting with @roshnipatel we realized that basically everything needs IDs, so that when we put together a model we can properly debug it (i.e., check that things went where we thought they should). |
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #1792 +/- ##
========================================
Coverage 99.81% 99.82%
========================================
Files 142 143 +1
Lines 4873 5012 +139
Branches 472 513 +41
========================================
+ Hits 4864 5003 +139
Misses 6 6
Partials 3 3
|
|
I'm thinking that the plan here should be:
Here's some brainstorming on those issues, besides the checklist in the original post here:
So, this might be close, but we need (a) some eyes on it, and (b) to sketch out that higher-level API, to make sure? |
|
I've added @gregorgorjanc's edits from petrelharp#4 and realized I forgot to commit |
|
Okay, consulting with @roshnipatel, here's a proposal for the high-level API: ( Rationale:
If this sounds good, more TODOs:
|
|
seems like a starter list of very-basic use cases might be something like:
|
|
Just to be clear here, this is the same list that @gregorgorjanc started above, right? So, this is a proposal for what the methods and arguments are? |
@roshnipatel: after talking with @petrelharp, I think we should separate two mechanisms that are often conflated in breeding contexts:
We could parameterise directional selection like
So maybe API-wise:
@petrelharp if you recall any canonical references for this formulation, that would help us pick default forms, language etc. Thoughts? |
|
I guess we should also add |
|
I'd have to go look at the papers that are out there using "directional selection", but my memory is that "linear" is most usual? Note that "linear with a multiplicative trait" and "exponential with an additive trait" are the same thing, and "linear with an additive trait" runs the risk of negative fitnesses, so my vote is just to have one of these. |
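To make that equivalence concrete, here is a small sketch under one reading of the terms (an assumption on my part): "multiplicative trait" means the trait is a product of (1 + effect) terms and "linear" means fitness equals the trait, while "additive trait" means effects are summed. All function names are hypothetical.

```python
import math


def linear_fitness_multiplicative_trait(effects):
    # Trait is the product of (1 + effect); fitness is the trait itself.
    return math.prod(1 + a for a in effects)


def exponential_fitness_additive_trait(effects):
    # Put each effect on the log scale so that effects add up,
    # then exponentiate the summed trait. Requires every effect > -1.
    trait = sum(math.log(1 + a) for a in effects)
    return math.exp(trait)


def linear_fitness_additive_trait(effects):
    # Fitness is 1 + (sum of effects): nothing keeps this above zero.
    return 1 + sum(effects)


effects = [0.1, -0.05, 0.02]
w_mult = linear_fitness_multiplicative_trait(effects)
w_exp = exponential_fitness_additive_trait(effects)

# Two strongly deleterious effects push the additive/linear form negative,
# while the multiplicative form stays positive.
deleterious = [-0.6, -0.7]
w_negative = linear_fitness_additive_trait(deleterious)
w_positive = linear_fitness_multiplicative_trait(deleterious)
```

The first two functions agree up to floating-point error, since exponentiating a sum of logs is the same as taking the product.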
@jeffspence and I did a code review together and had some comments:
- There's some ambiguity around whether multivariate fixed effects are allowed in MutationType -- given lines 740-744, it seems they ought to be, and we should provide that functionality/make it explicit in the docstring
- _check_args_list is pretty inflexible and we might eventually want to provide distribution params that do not match the total number of dimensions (e.g. weights for a mixture of Gaussians) -- I think we ought to remove this function and outsource args-checking to the relevant functions (i.e. _check_distribution and the FitnessFunction initialization)
  - Also, the dim argument is not being used at the moment
- A longer conversation for later: MutationType allows sampling from a distribution that could give you negative fitness (which might have been less of an issue when DFEs were largely catalog-provided, but feels like something to address in our new era of user-generated distributions)

Minor notes:
- We ought to check for nonempty trait/environment/fitness IDs
- Re: line 469 I think I agree this should happen in slim_engine.py
- We might want to expand the list of neutral DFEs in MutationType.is_neutral()
- Are we requiring that a multivariate normal distribution has 2+ dimensions? (We don't seem to explicitly check this anywhere...)
- Line 351 has an incomplete docstring (but I think we actually have decided what it's supposed to be?)
- test_traits.py requires that function arguments are provided as lists, but that seems rather strict -- maybe we should just check that things can be cast to a numpy array?
Thanks!
Good point; I adjusted the docstring. However, I wasn't thinking of this as final, as a TODO is to look at making other types multivariate, and resolving how the Gaussians are parameterized (see lists above).
Agree! tidying up the "checking" code is a TODO.
Good catch. I'm leaving that as part of the "tidy the checking code" TODO.
Ah yep. Will add to the TODO list.
Ah, sure. Done.
Great, that's in the TODO list.
Also in the TODO list.
No? I don't see why we would?
Maybe? See the TODO above pointing out that we parameterize the MVN with variance and the standard Gaussian with the SD; maybe this is fine, but it seems awkward.
Agree; this is in the TODO list. |
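On the SD-versus-variance point in the replies above, a small sketch of why the mismatch is awkward. It uses only the standard library's `statistics.NormalDist` (which takes a standard deviation); the two wrapper functions are hypothetical stand-ins for the two parameterizations and are not stdpopsim code.

```python
import math
from statistics import NormalDist


def normal_from_mean_sd(mean, sd):
    # Univariate convention: second argument is a standard deviation.
    return NormalDist(mu=mean, sigma=sd)


def normal_from_mean_variance(mean, variance):
    # One-dimensional analogue of a (mean, covariance) parameterization:
    # second argument is a variance, so take the square root.
    return NormalDist(mu=mean, sigma=math.sqrt(variance))


# The same distribution, written in each convention.
a = normal_from_mean_sd(0.0, 0.1)
b = normal_from_mean_variance(0.0, 0.01)

# Passing an SD where a variance is expected silently widens the
# distribution: the resulting SD is sqrt(0.1), roughly 0.316.
c = normal_from_mean_variance(0.0, 0.1)
```

Nothing fails when the wrong convention is used, which is why the docstrings would need to be very explicit if both parameterizations stay.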
|
I'm going to merge this so it doesn't get in the way of others, but still TODO for me is to make issues for the things above. |




This started with the code from @jeffspence's master branch that he and @roshnipatel got together.

Proposed rough low-level API:

Note: we could alternatively instantiate Environment and FitnessFunctions to pass into the Trait constructor, but these have to refer to "which traits do I work with", so it's cleaner to have the traits set up before we make them.
Notes:
TraitsModel:
Encompasses all the stuff: how they are constructed, how they change with time. (Thus, "model".)
Has
- Traits, each of which has a link function
- Environments (each of which can affect a collection of the traits)
- FitnessFunctions (each of which operates on a collection of the traits)

FitnessFunction
Has:

Environment
This affects how the genetic value maps to trait phenotypic values, including the part that changes with time/space. Environments can affect more than one trait (so you have correlated noise). Possibly not public?
Adds "noise" to the genetic value. Has:

Trait
This describes how to produce the observed biological trait value (i.e., the phenotype): for instance, it describes how disease liability translates to disease incidence, so it should not change with time or location. Link functions map only a single trait, independently of others.
GeneticValueTransform? PhenotypeTransform? Well, this applies to "genetic value plus environment" so it's not a "genetic value transform"? Something else?
This transforms the underlying (genetic value plus environment) to the "observed trait". Has:
Examples:
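As an illustration of the kind of link function described here (not the examples from the original list), a sketch of three single-trait transforms applied to "genetic value plus environment". The function names and the specific numbers are hypothetical.

```python
import math


def identity_link(underlying):
    # Observed trait is the underlying value itself.
    return underlying


def threshold_link(underlying, threshold=0.0):
    # Liability-threshold style: affected (1) if liability exceeds the
    # threshold, unaffected (0) otherwise.
    return 1 if underlying > threshold else 0


def logistic_link(underlying):
    # Maps the underlying value to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-underlying))


genetic_value = 0.8
environmental_noise = -0.3
underlying = genetic_value + environmental_noise

observed_binary = threshold_link(underlying, threshold=0.0)
incidence_probability = logistic_link(underlying)
```

Each function takes one underlying value and returns one observed value, matching the note that link functions map a single trait independently of the others.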
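
DistributionOfMutationEffects
Maps mutations to traits, so has
- MutationTypes

MutationType
A (biologically motivated, hopefully) class of mutations. As before, but now there's more than one trait, so also needs to know:
"fitness")

List of standard use cases:
See below
We should make it very easy to produce one of each of these, something like stdpopsim.traits.StabilizingSelectionOnTrait(n, sigma)

Miscellaneous TODOs that may require getting into some weeds, and should turn into issues when we merge this:
- convert_to_substitution edit: I don't think we need an issue for this
- _check_distribution and related code)
- Q) only for fitness-affecting mutation effects: disallow rescaling for traits #1818
- slim_engine.py: move code that changes lognormal and uniform to Eidos to slim_engine.py #1819
- mvn distribution type since we already have Gaussian; but the existing distribution_type="n" uses a (mean, sd) parameterization, while the mvn type uses (mean, (co)variance); maybe this is just how things have to be but this needs to be super clear if so
- do we need to keep the stdpopsim.dfe module? edit: probably not but let's leave it for backwards compatibility
- contig.is_neutral() method; this previously just looked at MutationTypes but now needs to include the TraitsModel. revisit contig.is_neutral() method #1820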
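The constructor named above is only a proposal. As an illustration of what a one-call stabilizing-selection helper might compute, here is a sketch of a Gaussian fitness function, a common form of stabilizing selection. The class name, the `optimum` argument, and the reading of `sigma` as the width of the fitness peak are all assumptions, not the stdpopsim API.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class GaussianStabilizingSelection:
    """Hypothetical sketch: fitness falls off with distance from an optimum."""

    optimum: float
    sigma: float  # assumed to be the width (SD) of the Gaussian fitness peak

    def __post_init__(self):
        if self.sigma <= 0:
            raise ValueError("sigma must be positive")

    def fitness(self, trait_value):
        # w(z) = exp(-(z - optimum)^2 / (2 * sigma^2)); equals 1 at the optimum.
        deviation = trait_value - self.optimum
        return math.exp(-(deviation ** 2) / (2 * self.sigma ** 2))


selection = GaussianStabilizingSelection(optimum=0.0, sigma=1.0)
w_at_optimum = selection.fitness(0.0)
w_off_optimum = selection.fitness(1.0)
```

A smaller `sigma` gives a narrower peak and therefore stronger selection; fitness is symmetric about the optimum and never negative.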
convert_to_substitutionedit: I don't think we need an issue for this_check_distributionand related code)Q) only for fitness-affecting mutation effects disallow rescaling for traits #1818slim_engine.pymove code that changes lognormal and uniform to Eidos toslim_engine.py#1819mvndistribution type since we already have Gaussian; but the existingdistribution_type="n"uses a (mean, sd) parameterization, while themvntype uses (mean, (co)variance); maybe this is just how things has to be but this needs to be super clear if sodo we need to keep theedit: probably not but let's leave it for backwards compatibilitystdpopsim.dfemodule?contig.is_neutral()method; this previously just looked atMutationTypes but now needs to include theTraitsModel. revsiitcontig.is_neutral()method #1820