
Conversation

@penelopeysm (Member) commented Jan 7, 2026

Closes #2232
Closes #2363
Closes #2745
Closes #2732
Closes #1775
Closes #2735
Closes #2734
Closes #2634

This PR redesigns Turing's optimisation interface as discussed in #2634.

How to review?

The key changes here are:

  • The fields of the ModeResult struct are slightly different. We get rid of the old NamedArray that was stored in there. Instead, users who really, really need it can generate it themselves via StatsBase.coef(result). The primary interface for getting parameters will be result.params, which is a Dict{VarName} (obviously can be changed to VNT later).

  • Linking is now user-controlled and depends on the link parameter passed in.

  • Constraints are now passed in as either a NamedTuple or a Dict{VarName} (again pending VNT). Values are always provided in unlinked space.

  • Initial parameters are now supplied as an AbstractInitStrategy. The code in src/optimisation/init.jl will attempt to sample initial parameters using that strategy, while also making sure that the sampled value obeys the constraints (a minimal sketch of this follows the list).
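A minimal sketch of that retry-based initialisation, assuming simple box constraints (sample_within_bounds is a hypothetical helper, not the actual src/optimisation/init.jl code):

# Draw a value for one variable and retry until it satisfies the box
# constraints; error out after a fixed number of attempts (the actual code
# gives up after 100 attempts, as the REPL examples below show).
function sample_within_bounds(draw, lb, ub; max_attempts=100)
    for _ in 1:max_attempts
        x = draw()
        satisfies_lb = lb === nothing || all(x .>= lb)
        satisfies_ub = ub === nothing || all(x .<= ub)
        (satisfies_lb && satisfies_ub) && return x
    end
    return error("Could not initialise variable within constraints after $max_attempts attempts")
end

# Usage sketch: keep drawing from the prior until the draw exceeds the lower bound.
# sample_within_bounds(() -> rand(Normal()), 0.1, nothing)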

The most meaningful code is the definition of estimate_mode, plus src/optimisation/init.jl, which handles everything to do with initialisation and constraints. The rest of the changes are just tweaks to fit the new interface.
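For example, with the model f from the REPL session below (a minimal sketch, assuming the interface as described above):

using Turing, StatsBase
@model f() = x ~ Normal()
result = maximum_a_posteriori(f())
result.params    # primary interface: an ordered Dict keyed by VarName
coef(result)     # array-style coefficients, generated on demand for those who need them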

@github-actions bot (Contributor) commented Jan 7, 2026

Turing.jl documentation for PR #2747 is available at:
https://TuringLang.github.io/Turing.jl/previews/PR2747/

@penelopeysm penelopeysm changed the base branch from main to breaking on January 7, 2026 17:12
@penelopeysm penelopeysm force-pushed the py/opt branch 2 times, most recently from fee1d69 to fb026db, on January 8, 2026 10:58
@penelopeysm (Member Author) commented Jan 8, 2026

We are in business!

julia> using Turing; @model f() = x ~ Normal()
f (generic function with 2 methods)

julia> maximum_a_posteriori(f())
ModeResult
  ├ estimator : MAP
  ├ lp        : -0.9189385332046728
  ├ params    : OrderedDict{AbstractPPL.VarName, Any} with 1 entry
  │             └ x => 0.0
  └ linked    : true


julia> maximum_a_posteriori(f(); lb=(x=0.1,))
ModeResult
  ├ estimator : MAP
  ├ lp        : -0.923938533566544
  ├ params    : OrderedDict{AbstractPPL.VarName, Any} with 1 entry
  │             └ x => 0.10000000361871234
  └ linked    : true


julia> maximum_a_posteriori(f(); lb=(x=0.1,), initial_params=InitFromParams((x=-0.1,)))
ERROR: Could not initialise variable x within constraints after 100 attempts
[...]

@penelopeysm (Member Author) commented Jan 8, 2026

Runtime constraint checks

This requires TuringLang/DynamicPPL.jl#1196. The same error also happens with ForwardDiff; it's just a pain to read because the message is full of Dual{...} noise.

using Turing
import Mooncake
@model function f()
    x ~ Uniform(-5, 5)
    y ~ truncated(Normal(); lower=x)
end

julia> maximum_a_posteriori(f(), lb=(; y=2.0); adtype=AutoMooncake())
ERROR: DomainError with 1.9729323974885409:

The value for variable y (1.9729323974885409) went outside of constraints (lb=2.0, ub=nothing) during optimisation.

This can happen when using constraints on a variable that has a dynamic support, e.g., `y ~ truncated(Normal(); lower=x)` where `x` is another variable in the model.

To avoid this, consider either running the optimisation in unlinked space (`link=false`) or removing the constraints.

If you are sure that this does not matter and you want to suppress this error, you can also set `check_constraints_at_runtime=false`.
Stacktrace:
  [1] rrule!!(::Mooncake.CoDual{typeof(throw), Mooncake.NoFData}, args::Mooncake.CoDual{DomainError, Mooncake.FData{…}})
    @ Mooncake ~/.julia/packages/Mooncake/oZKdl/src/rules/builtins.jl:1031
  [2] accumulate_assume!!
    @ ~/ppl/lib/src/optimisation/Optimisation.jl:166 [inlined]
[...]

julia> maximum_a_posteriori(f(), lb=(; y=2.0); adtype=AutoMooncake(), link=false)
ModeResult
  ├ estimator : MAP
  ├ lp        : -0.9741722044604852
  ├ params    : OrderedDict{AbstractPPL.VarName, Any} with 2 entries
  │             ├ x => 3.5250762576156616
  │             └ y => 3.5250762576156616
  └ linked    : false

@codecov bot commented Jan 8, 2026

Codecov Report

❌ Patch coverage is 78.47534% with 48 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.88%. Comparing base (31256e3) to head (9ae9db6).
⚠️ Report is 17 commits behind head on breaking.

Files with missing lines         | Patch %  | Lines
src/optimisation/init.jl         | 74.16%   | 31 Missing ⚠️
src/optimisation/Optimisation.jl | 74.60%   | 16 Missing ⚠️
src/optimisation/stats.jl        | 96.15%   | 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##           breaking    #2747      +/-   ##
============================================
- Coverage     87.80%   84.88%   -2.92%     
============================================
  Files            21       23       +2     
  Lines          1451     1383      -68     
============================================
- Hits           1274     1174     -100     
- Misses          177      209      +32     


using AbstractPPL: VarName
using DynamicPPL: DynamicPPL

# This function is shared by both MCMC and optimisation, so has to exist outside of both.
penelopeysm (Member Author):

This is copied over wholesale from the mcmc folder.

@penelopeysm penelopeysm marked this pull request as ready for review on January 8, 2026 18:26
@penelopeysm penelopeysm force-pushed the py/opt branch 6 times, most recently from b38bff1 to fc52126, on January 8, 2026 19:07
@penelopeysm penelopeysm requested a review from mhauru on January 8, 2026 20:21
@mhauru (Member) left a comment:

Not my most thorough review, but I read the implementation and skimmed the tests. I probably won't be around to approve, but I'm happy for this to be merged once the comments I've left are addressed or dismissed.

)
satisfies_lb = lb === nothing || all(proposed_val .>= lb)
satisfies_ub = ub === nothing || all(proposed_val .<= ub)
return any(isnan, proposed_val) || (satisfies_lb && satisfies_ub)
mhauru (Member):

Is it okay that this returns true even if one element flat-out fails the constraint while another element is NaN?

penelopeysm (Member Author):

Ahh, good catch.
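One possible per-element fix (a hypothetical sketch, not the merged code): treat NaN as "unknown" for each element individually, so that a NaN in one element no longer masks a genuine violation in another:

satisfies_lb = lb === nothing || all(@. isnan(proposed_val) | (proposed_val >= lb))
satisfies_ub = ub === nothing || all(@. isnan(proposed_val) | (proposed_val <= ub))
return satisfies_lb && satisfies_ub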

Get the default initialization strategy for a given sampler `spl`, i.e. how initial
parameters for sampling are chosen if not specified by the user. By default, this is
`InitFromPrior()`, which samples initial parameters from the prior distribution.
"""
mhauru (Member):

Could this PR have a HISTORY.md entry?

penelopeysm (Member Author):

No. I mean, yes.

Comment on lines +156 to +160
lb::NTOrVNDict
ub::NTOrVNDict
init_vecs::Dict{VarName,AbstractVector}
lb_vecs::Dict{VarName,AbstractVector}
ub_vecs::Dict{VarName,AbstractVector}
mhauru (Member):

Could lb, ub, lb_vecs, and ub_vecs get docstrings? The first two are the untransformed constraints, the last two the transformed ones?
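Something along these lines, perhaps (hypothetical wording, assuming this reading of the fields is right):

"Lower bounds in the original (unlinked) space, as supplied by the user."
lb::NTOrVNDict
"Upper bounds in the original (unlinked) space, as supplied by the user."
ub::NTOrVNDict
"Vectorised initial values for each variable."
init_vecs::Dict{VarName,AbstractVector}
"Vectorised lower bounds, transformed into the space the optimisation runs in."
lb_vecs::Dict{VarName,AbstractVector}
"Vectorised upper bounds, transformed into the space the optimisation runs in."
ub_vecs::Dict{VarName,AbstractVector}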

return ConstraintAccumulator(acc.link, acc.lb, acc.ub)
end
function Base.copy(acc::ConstraintAccumulator)
return ConstraintAccumulator(acc.link, deepcopy(acc.lb), deepcopy(acc.ub))
mhauru (Member):

Does a normal copy not suffice? deepcopy is kinda evil. Also, should this not copy the other fields too?

penelopeysm (Member Author):

acc.lb and acc.ub are dictionaries mapping varnames to values, and without deepcopy that's not OK:

julia> d = Dict(:x => [1, 2])
Dict{Symbol, Vector{Int64}} with 1 entry:
  :x => [1, 2]

julia> d2 = copy(d)
Dict{Symbol, Vector{Int64}} with 1 entry:
  :x => [1, 2]

julia> d2[:x][2] = 3
3

julia> d
Dict{Symbol, Vector{Int64}} with 1 entry:
  :x => [1, 3]

penelopeysm (Member Author) commented Jan 15, 2026:

Technically, it should not be a mutable dictionary; there's no reason why the dict would ever be mutated. The one way it could be mutated is if it references an array that the user created at top level and later mutates. I think we can get around this by putting the deepcopy inside the constructor, so that Base.copy can just alias the dict.

I think I looked into immutable dicts, but the issue is you have mutability in the values themselves, so even that wouldn't help.
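A sketch of that idea (hypothetical code with a made-up constructor name, not the merged implementation): deepcopy once at construction, so Base.copy can safely alias the never-mutated fields:

struct ConstraintAccumulator{L,U}
    link::Bool
    lb::L
    ub::U
end

# Hypothetical public entry point: defensively copy the user-supplied dicts
# once, so later mutation of the user's arrays cannot leak in.
make_constraint_accumulator(link, lb, ub) =
    ConstraintAccumulator(link, deepcopy(lb), deepcopy(ub))

# The fields are never mutated afterwards, so copy can simply alias them.
Base.copy(acc::ConstraintAccumulator) =
    ConstraintAccumulator(acc.link, acc.lb, acc.ub)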

penelopeysm (Member Author):

Conceptually it's just easiest to deepcopy here, though.

even if the optimisation was run in linked space."
params::P
"The final log likelihood or log joint, depending on whether `MAP` or `MLE` was run.
Note that this is a *positive* log probability."
mhauru (Member):

Is it positive, or is it just the actual log probability rather than its negation? It's only defined up to an additive constant anyway.
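For what it's worth, the REPL example earlier in the thread supports the latter reading: the reported lp for the MAP of x ~ Normal() is the actual (negative) log density at the mode, not its negation:

julia> using Distributions

julia> logpdf(Normal(), 0.0)   # matches the lp printed in the ModeResult above
-0.9189385332046728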

Comment on lines +235 to +241
- `lb::Union{NamedTuple,AbstractDict{<:VarName,<:Any}}=(;)`: a mapping from variable names
to lower bounds for the optimisation. The bounds should be provided in the original
(unlinked) space.
- `ub::Union{NamedTuple,AbstractDict{<:VarName,<:Any}}=(;)`: a mapping from variable names
to upper bounds for the optimisation. The bounds should be provided in the original
(unlinked) space.
mhauru (Member):

Might this docstring warrant a paragraph about how constraints work for distributions with different types of supports, and how they interact with linking?

penelopeysm (Member Author):

I was going to write this up in the main Turing docs, but yes, it will all have to be documented.
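In the meantime, a hypothetical illustration of the interaction in question: a variable with bounded support plus a user-supplied box constraint. Per the PR description, the bounds are given in the original (unlinked) space whichever link setting is used.

using Turing

@model function g()
    s ~ Exponential(1)   # bounded support: (0, Inf)
    x ~ Normal(0, s)
end

# The box constraint on s is supplied in unlinked space in both cases:
maximum_a_posteriori(g(); lb=(s=0.5,), link=true)   # optimise in linked (unconstrained) space
maximum_a_posteriori(g(); lb=(s=0.5,), link=false)  # optimise in the original space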

- `check_model::Bool=true`: if true, the model is checked for potential errors before
optimisation begins.
mhauru (Member):

check_constraints_at_runtime is missing.
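Something like the following could slot in next to check_model (hypothetical wording, inferred from the runtime-check error message demonstrated above):

- `check_constraints_at_runtime::Bool=true`: if true, a `DomainError` is raised when a
  constrained variable goes outside its box constraints during optimisation (which can
  happen for variables with dynamic support); set to false to suppress the check.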

Bijectors = "76274a88-744f-5084-9051-94815aaf08c4"
Clustering = "aaaa29a8-35af-508c-8bc3-b662a17a0fe5"
Combinatorics = "861a8166-3701-5b0c-9a16-15d98fcdc6aa"
DifferentiationInterface = "a0c0ee7d-e4b9-4e03-894e-1c5f64a51d63"
mhauru (Member):

Add a compat bound?

Comment on lines +38 to +42
# TODO(penelopeysm): This assumes that tovec and varname_leaves return the same
# number of sub-elements in the same order --- which is NOT true for things like
# Cholesky factors --- so this is a bug!!!!!! We COULD use varname_and_value_leaves
# but that would just be a different kind of bug because LDF uses tovec for
# vectorisation!!!!!!
mhauru (Member):

I think you had an issue about this mismatch; maybe link to it here.

Find the mode of the probability distribution of a model.
Under the hood this function calls `Optimization.solve`.
mhauru (Member):

Maybe worth mentioning somewhere in the docstring that we only support box constraints, even though Optimization can do more general constraints.
