Optimisation interface rework #2747
base: breaking
Conversation
Turing.jl documentation for PR #2747 is available at: [...]
Force-pushed fee1d69 to fb026db
We are in business!

```julia
julia> using Turing; @model f() = x ~ Normal()
f (generic function with 2 methods)

julia> maximum_a_posteriori(f())
ModeResult
├ estimator : MAP
├ lp        : -0.9189385332046728
├ params    : OrderedDict{AbstractPPL.VarName, Any} with 1 entry
│ └ x => 0.0
└ linked    : true

julia> maximum_a_posteriori(f(); lb=(x=0.1,))
ModeResult
├ estimator : MAP
├ lp        : -0.923938533566544
├ params    : OrderedDict{AbstractPPL.VarName, Any} with 1 entry
│ └ x => 0.10000000361871234
└ linked    : true

julia> maximum_a_posteriori(f(); lb=(x=0.1,), initial_params=InitFromParams((x=-0.1,)))
ERROR: Could not initialise variable x within constraints after 100 attempts
[...]
```
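Presumably (this isn't shown in the thread, so treat it as a sketch) an initialisation that already lies inside the bounds succeeds on the first attempt, since the retry loop only kicks in when the strategy proposes values outside the constraints:

```julia
# Hypothetical follow-up to the session above: initial parameters that
# satisfy lb=(x=0.1,) should be accepted without any retries.
maximum_a_posteriori(f(); lb=(x=0.1,), initial_params=InitFromParams((x=0.5,)))
```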
Runtime constraint checks

This requires TuringLang/DynamicPPL.jl#1196. Also, the same error happens with ForwardDiff, it's just a pain to read the error message because it's full of [...]

```julia
using Turing
import Mooncake

@model function f()
    x ~ Uniform(-5, 5)
    y ~ truncated(Normal(); lower=x)
end
```
```julia
julia> maximum_a_posteriori(f(), lb=(; y=2.0); adtype=AutoMooncake())
ERROR: DomainError with 1.9729323974885409:
The value for variable y (1.9729323974885409) went outside of constraints (lb=2.0, ub=nothing) during optimisation.
This can happen when using constraints on a variable that has a dynamic support, e.g., `y ~ truncated(Normal(); lower=x)` where `x` is another variable in the model.
To avoid this, consider either running the optimisation in unlinked space (`link=false`) or removing the constraints.
If you are sure that this does not matter and you want to suppress this error, you can also set `check_constraints_at_runtime=false`.
Stacktrace:
  [1] rrule!!(::Mooncake.CoDual{typeof(throw), Mooncake.NoFData}, args::Mooncake.CoDual{DomainError, Mooncake.FData{…}})
    @ Mooncake ~/.julia/packages/Mooncake/oZKdl/src/rules/builtins.jl:1031
  [2] accumulate_assume!!
    @ ~/ppl/lib/src/optimisation/Optimisation.jl:166 [inlined]
[...]

julia> maximum_a_posteriori(f(), lb=(; y=2.0); adtype=AutoMooncake(), link=false)
ModeResult
├ estimator : MAP
├ lp        : -0.9741722044604852
├ params    : OrderedDict{AbstractPPL.VarName, Any} with 2 entries
│ ├ x => 3.5250762576156616
│ └ y => 3.5250762576156616
└ linked    : false
```
Codecov Report: ❌ Patch coverage is [...]

Additional details and impacted files:

```
@@            Coverage Diff             @@
##           breaking    #2747      +/-  ##
============================================
- Coverage     87.80%   84.88%    -2.92%
============================================
  Files            21       23        +2
  Lines          1451     1383       -68
============================================
- Hits           1274     1174      -100
- Misses          177      209       +32
```
Force-pushed ab78aea to a09f8ac
```julia
using AbstractPPL: VarName
using DynamicPPL: DynamicPPL
```

```julia
# This function is shared by both MCMC and optimisation, so has to exist outside of both.
```
This is copied over wholesale from the mcmc folder
Force-pushed b38bff1 to fc52126
mhauru left a comment:
Not my most thorough review, but I read the implementation and skimmed the tests. I probably won't be here to approve, but I'm happy for this to be merged once the comments I've left are addressed or dismissed.
```julia
)
satisfies_lb = lb === nothing || all(proposed_val .>= lb)
satisfies_ub = ub === nothing || all(proposed_val .<= ub)
return any(isnan, proposed_val) || (satisfies_lb && satisfies_ub)
```
Is it okay that this is true even if one element flat-out fails the constraint, so long as another element is NaN?
Ahh, good catch.
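For what it's worth, one way to make the check elementwise, so that a single NaN can't mask another element's violation; this is just a sketch (the function name is hypothetical, and it assumes `lb`/`ub` are `nothing` or iterables aligned with `proposed_val`), not necessarily the fix that ends up in the PR:

```julia
# Sketch only: accept an element if it is NaN (to be handled elsewhere) OR
# within its own bounds, instead of short-circuiting on any NaN anywhere.
function within_bounds(proposed_val, lb, ub)
    ok(v, l, u) = isnan(v) || ((l === nothing || v >= l) && (u === nothing || v <= u))
    lbs = lb === nothing ? Iterators.repeated(nothing) : lb
    ubs = ub === nothing ? Iterators.repeated(nothing) : ub
    return all(ok(v, l, u) for (v, l, u) in zip(proposed_val, lbs, ubs))
end
```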
```julia
Get the default initialization strategy for a given sampler `spl`, i.e. how initial
parameters for sampling are chosen if not specified by the user. By default, this is
`InitFromPrior()`, which samples initial parameters from the prior distribution.
"""
```
Could this PR have a HISTORY.md entry?
No. I mean, yes.
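Roughly along these lines, I'd guess (the wording below is a sketch based on this PR's description, not the entry that was actually committed):

```markdown
## Breaking changes: optimisation interface (#2747)

- `ModeResult` no longer stores a `NamedArray`; use `result.params`
  (a `Dict{VarName}`) or `StatsBase.coef(result)` instead.
- Linking is now user-controlled via the `link` keyword.
- Box constraints are passed as `lb`/`ub`, either a `NamedTuple` or a
  `Dict{VarName}`, always in unlinked space.
- Initial parameters are supplied as an `AbstractInitStrategy`.
```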
```julia
lb::NTOrVNDict
ub::NTOrVNDict
init_vecs::Dict{VarName,AbstractVector}
lb_vecs::Dict{VarName,AbstractVector}
ub_vecs::Dict{VarName,AbstractVector}
```
Could lb, ub, lb_vecs, and ub_vecs get docstrings? The first two are the untransformed constraints, the last two the transformed ones?
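Something like this, if I've understood the semantics right (the docstring text below is my guess, phrased per the question above):

```julia
"User-supplied lower bounds, in the original (unlinked) space."
lb::NTOrVNDict
"User-supplied upper bounds, in the original (unlinked) space."
ub::NTOrVNDict
"Initial values, vectorised per variable."
init_vecs::Dict{VarName,AbstractVector}
"Lower bounds after transformation and vectorisation (linked space if applicable)."
lb_vecs::Dict{VarName,AbstractVector}
"Upper bounds after transformation and vectorisation (linked space if applicable)."
ub_vecs::Dict{VarName,AbstractVector}
```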
```julia
    return ConstraintAccumulator(acc.link, acc.lb, acc.ub)
end
function Base.copy(acc::ConstraintAccumulator)
    return ConstraintAccumulator(acc.link, deepcopy(acc.lb), deepcopy(acc.ub))
```
Does a normal copy not suffice? deepcopy is kinda evil. Also, should this not copy the other fields too?
acc.lb and acc.ub are dictionaries of varname -> values, and without deepcopy it's not OK:

```julia
julia> d = Dict(:x => [1, 2])
Dict{Symbol, Vector{Int64}} with 1 entry:
  :x => [1, 2]

julia> d2 = copy(d)
Dict{Symbol, Vector{Int64}} with 1 entry:
  :x => [1, 2]

julia> d2[:x][2] = 3
3

julia> d
Dict{Symbol, Vector{Int64}} with 1 entry:
  :x => [1, 3]
```
Technically, it should not be a mutable dictionary; there's no reason the dict would ever be mutated. The one way it could be mutated is if it references an array that a user wrote at top level and then later mutates. I think we can get around this by putting the deepcopy inside the constructor, and then Base.copy can just alias the dict.
I think I looked into immutable dicts, but the issue is that you have mutability in the values themselves, so even that wouldn't help.
Conceptually it's just easiest to deepcopy here though.
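For reference, a minimal sketch of the constructor-deepcopy idea (hypothetical struct and constructor names, not the PR's code):

```julia
struct BoundsStore
    link::Bool
    lb::Dict{Symbol,Any}
    ub::Dict{Symbol,Any}
end

# deepcopy once, at the public entry point, so arrays the user still holds
# can't mutate the stored state afterwards...
make_bounds_store(link, lb, ub) = BoundsStore(link, deepcopy(lb), deepcopy(ub))

# ...after which copies can simply alias the dicts, since nothing mutates them.
Base.copy(s::BoundsStore) = BoundsStore(s.link, s.lb, s.ub)
```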
```julia
even if the optimisation was run in linked space."
params::P
"The final log likelihood or log joint, depending on whether `MAP` or `MLE` was run.
Note that this is a *positive* log probability."
```
Is it positive, or is it just the actual log probability rather than its negation? It's only defined up to an additive constant anyway.
```julia
- `lb::Union{NamedTuple,AbstractDict{<:VarName,<:Any}}=(;)`: a mapping from variable names
  to lower bounds for the optimisation. The bounds should be provided in the original
  (unlinked) space.
- `ub::Union{NamedTuple,AbstractDict{<:VarName,<:Any}}=(;)`: a mapping from variable names
  to upper bounds for the optimisation. The bounds should be provided in the original
  (unlinked) space.
```
Might this docstring warrant a paragraph about how constraints interact with linking and with distributions that have different types of support?
I think I was going to write it in the proper Turing docs itself, but yeah, it will all have to be documented.
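To illustrate what such documentation might cover (my own example, using only the interface shown elsewhere in this PR): user constraints are layered on top of the distribution's natural support, and are always given in unlinked space.

```julia
# Hypothetical example: p already has support (0, 1) from its prior;
# lb=(p=0.5,) narrows the search to [0.5, 1) on top of that.
using Turing

@model function g()
    p ~ Beta(2, 2)
end

maximum_a_posteriori(g(); lb=(p=0.5,))
```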
```julia
- `check_model::Bool=true`: if true, the model is checked for potential errors before
  optimisation begins.
```
`check_constraints_at_runtime` is missing.
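Presumably something along these lines (wording guessed from the runtime error message above, not the actual docstring):

```julia
- `check_constraints_at_runtime::Bool=true`: if true, an error is thrown if a
  constrained variable's value moves outside its bounds during optimisation,
  which can happen for variables with dynamic support.
```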
```toml
Bijectors = "76274a88-744f-5084-9051-94815aaf08c4"
Clustering = "aaaa29a8-35af-508c-8bc3-b662a17a0fe5"
Combinatorics = "861a8166-3701-5b0c-9a16-15d98fcdc6aa"
DifferentiationInterface = "a0c0ee7d-e4b9-4e03-894e-1c5f64a51d63"
```
Add a compat bound?
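i.e. an entry in the `[compat]` section; the version number below is purely illustrative, not a recommendation:

```toml
[compat]
DifferentiationInterface = "0.7"
```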
```julia
# TODO(penelopeysm): This assumes that tovec and varname_leaves return the same
# number of sub-elements in the same order --- which is NOT true for things like
# Cholesky factors --- so this is a bug!!!!!! We COULD use varname_and_value_leaves
# but that would just be a different kind of bug because LDF uses tovec for
# vectorisation!!!!!!
```
I think you had an issue about this mismatch, maybe link to it here.
```julia
Find the mode of the probability distribution of a model.

Under the hood this function calls `Optimization.solve`.
```
Maybe worth mentioning somewhere in the docstring that we only support box constraints, even though Optimization can do more general constraints.
Closes #2232
Closes #2363
Closes #2745
Closes #2732
Closes #1775
Closes #2735
Closes #2734
Closes #2634
This PR redesigns Turing's optimisation interface as discussed in #2634.
How to review?
The key things that are changed here are:
- The fields of the `ModeResult` struct are slightly different. We get rid of the old `NamedArray` that was stored in there. Instead, users who really, really need it can generate it themselves via `StatsBase.coef(result)`. The primary interface for getting parameters will be `result.params`, which is a `Dict{VarName}` (obviously can be changed to VNT later).
- Linking is now user-controlled and depends on the `link` parameter passed in.
- Constraints are now passed in as either a `NamedTuple` or a `Dict{VarName}` (again pending VNT). Values are always provided in unlinked space.
- Initial parameters are now supplied as an `AbstractInitStrategy`. The code in `src/optimisation/init.jl` will attempt to sample initial parameters using that strategy, but will also make sure that the sampled value obeys the constraints.

The most meaningful section of code is the definition of `estimate_mode`, plus `src/optimisation/init.jl`, which handles everything to do with initialisation and constraints. The rest of the changes are just tweaks to fit the new interface. A rough sketch of how the pieces fit together is shown below.
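This sketch is composed from the examples earlier in the thread, not a quoted session:

```julia
using Turing
import StatsBase

@model f() = x ~ Normal()

result = maximum_a_posteriori(
    f();
    lb=(x=0.1,),                              # box constraints, in unlinked space
    link=false,                               # linking is user-controlled
    initial_params=InitFromParams((x=0.5,)),  # an AbstractInitStrategy
)

result.params            # Dict{VarName} of final parameter values
StatsBase.coef(result)   # the old NamedArray-style output, generated on demand
```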