* print output
* fix
* reenable
* add more lines to guide the eye
* reorder table
* print tgrad / trel as well
* forgot this type
… and also `bundle_samples` (#1129)
* Implement `ParamsWithStats` for `FastLDF`
* Add comments
* Implement `bundle_samples` for ParamsWithStats -> MCMCChains
* Remove redundant comment
* don't need Statistics?
* Make FastLDF the default
* Add miscellaneous LogDensityProblems tests
* Use `init!!` instead of `fast_evaluate!!`
* Rename files, rebalance tests
…o` (#1130)
* Use OnlyAccsVarInfo for many re-evaluation functions
* drop `fast_` prefix
* Add a changelog
…nked (#1141)
* Improve type stability when all parameters are linked or unlinked
* fix a merge conflict
* fix enzyme gc crash (locally at least)
* Fixes from review
…o a single struct, to avoid repetition) (#1238)

As title. Instead of

```julia
struct ConditionContext{V<:Union{NamedTuple,AbstractDict},C<:AbstractContext}
    values::V
    childcontext::C
end

struct FixedContext ... end # the same
```

and then defining the same methods on each of them, we now have

```julia
abstract type ConditionOrFix end
struct Condition <: ConditionOrFix end
struct Fix <: ConditionOrFix end

struct CondFixContext{CF<:ConditionOrFix,V<:VarNamedTuple,C<:AbstractContext}
    values::V
    childcontext::C
end
```

and we only need to define things once.

Note that the contexts themselves are internal to DPPL, so the above change doesn't break anything (it just requires twiddling with all the tests), and the external-facing API for conditioning and fixing is entirely preserved. The only API difference introduced by this PR is that you can use a VNT when conditioning or fixing.

Closes #894
Closes #1234
Closes #1237

Not done yet, as the tests need to be rewritten: essentially anywhere we tested ConditionContext and FixedContext separately, we should unify the tests, except where we check things like logp, which obviously differ. Right now there's just a bunch of unnecessary duplication. We also need to add tests for the new deconditioning behaviour, plus a test for it inside the submodels file.
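To make the "define things once" point concrete, here is a minimal hedged sketch: shared behaviour now needs a single method, while anything that genuinely differs between conditioning and fixing can still dispatch on the type parameter. The method names below (`values_of`, `is_fixing`) are illustrative placeholders, not DPPL's actual internals:

```julia
# Illustrative only: one definition now covers both conditioning and fixing,
# where previously ConditionContext and FixedContext each needed their own.
values_of(ctx::CondFixContext) = ctx.values

# Behaviour that genuinely differs can still dispatch on the type parameter.
is_fixing(::CondFixContext{Fix}) = true
is_fixing(::CondFixContext{Condition}) = false
```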
This PR tweaks the way we think of linked or unlinked variables. In particular, it attempts to address the question "should logjac be incremented for a variable `vn`?" in a principled manner, by specifying this with an `AbstractLinkStrategy`. I know, I know, another abstract type!!! To see why this is desirable, we need to rewind a bit...

<sup>(I _do_ think abstract types are good: it means we're making a formal interface for all the black magic that used to be contained inside VarInfo)</sup>

## DefaultContext, VarInfo

In the traditional DynamicPPL mindset, all of this information was contained inside a VarInfo. It would have a little flag saying, for every variable, "is this variable transformed or not". This is all on current `main`:

https://github.com/TuringLang/DynamicPPL.jl/blob/412b2a9f264dca0e4d6634c8b5470050b84bcd1d/src/varinfo.jl#L810-L811

When calling `tilde_assume!!`, whether or not logjac was incremented depended on what `f` is here:

https://github.com/TuringLang/DynamicPPL.jl/blob/412b2a9f264dca0e4d6634c8b5470050b84bcd1d/src/contexts/default.jl#L29-L37

That function, `from_maybe_linked_internal_transform`, would basically check `is_transformed`; if true, it would treat the variable as linked, and if not, it would treat it as unlinked.

## InitContext

Later on, we introduced InitContext. One of the things we did was to say: well, `DynamicPPL.init()` can generate either a linked value or an unlinked value, and we'd calculate the logjac appropriately. This is also on current `main`:

https://github.com/TuringLang/DynamicPPL.jl/blob/412b2a9f264dca0e4d6634c8b5470050b84bcd1d/src/contexts/init.jl#L317-L322

The problem with this comes when you try to combine InitContext with a VarInfo. Now they can clash: what happens if `init()` returns a linked value, but inside the VarInfo it's unlinked? Up until now, we haven't actually run into this problem because of some happy coincidences (basically, there is no code path in DPPL where both of these happen at the same time). However, if we want to fully modularise and separate all these bits of code, we must not have two different competing sources of truth.

## This PR...

**This PR (almost) completely externalises the question of whether `vn` is linked, and defers it to an `AbstractLinkStrategy`.** That is to say, if `generate_linked_value(ls::AbstractLinkStrategy, vn::VarName, tval)` returns `true`, then `vn` is treated as being in linked space, and logjac is accumulated.
For example, here:

- `UnlinkAll()` means: treat everything as being in unlinked space
- `LinkAll()` means: treat everything as being in linked space

```julia
julia> using DynamicPPL, Distributions, Random

julia> @model f() = x ~ LogNormal()
f (generic function with 2 methods)

julia> last(DynamicPPL.init!!(Xoshiro(468), f(), VarInfo(), InitFromPrior(), UnlinkAll()))
VarInfo {linked=false}
├─ values
│  VarNamedTuple
│  └─ x => VectorValue{Vector{Float64}, DynamicPPL.UnwrapSingletonTransform{Tuple{Int64}}, Tuple{}}([1.0746648736094493], DynamicPPL.UnwrapSingletonTransform{Tuple{Int64}}((1,)), ())
└─ accs
   AccumulatorTuple with 3 accumulators
   ├─ LogPrior => LogPriorAccumulator(-0.9935400392011169)
   ├─ LogJacobian => LogJacobianAccumulator(0.0)
   └─ LogLikelihood => LogLikelihoodAccumulator(0.0)

julia> last(DynamicPPL.init!!(Xoshiro(468), f(), VarInfo(), InitFromPrior(), LinkAll()))
VarInfo {linked=true}
├─ values
│  VarNamedTuple
│  └─ x => LinkedVectorValue{Vector{Float64}, ComposedFunction{DynamicPPL.UnwrapSingletonTransform{Tuple{}}, ComposedFunction{Base.Fix1{typeof(broadcast), typeof(exp)}, DynamicPPL.ReshapeTransform{Tuple{Int64}, Tuple{}}}}, Tuple{}}([0.07200886749732066], DynamicPPL.UnwrapSingletonTransform{Tuple{}}(()) ∘ (Base.Fix1{typeof(broadcast), typeof(exp)}(broadcast, exp) ∘ DynamicPPL.ReshapeTransform{Tuple{Int64}, Tuple{}}((1,), ())), ())
└─ accs
   AccumulatorTuple with 3 accumulators
   ├─ LogPrior => LogPriorAccumulator(-0.9935400392011169)
   ├─ LogJacobian => LogJacobianAccumulator(-0.07200886749732066)
   └─ LogLikelihood => LogLikelihoodAccumulator(0.0)
```

Notice that because we seeded the rng, the actual value sampled is the same; but the link strategy is what controls whether the VarInfo gets a `LinkedVectorValue` or a `VectorValue`. It also controls whether logjac is accumulated.

This seems more complex! (Probably because it's a new way of thinking.) I'd like to argue, though, that it's (1) less complex, and (2) will lead to better performance in some cases.

**Composability**

Firstly, the main purpose of this is to start shifting bits of information out from VarInfo. VarInfo is a struct that essentially contains all of the state when running a model. When you run a model, you check the state and ask: is the state linked? If the state is linked, then we modify the state in this particular way. It's a much more imperative way of looking at model evaluation.

In contrast, taking this info out of VarInfo and making it something that's specified upfront, at the beginning of evaluation, gives us a more declarative way of evaluating models. We can correctly communicate things like: "I want to use these variable values (`InitFromParams`), but I also want the resulting logp and varinfo to contain linked values (`LinkAll`)" ==> `vi = DynamicPPL.init!!(model, VarInfo(), InitFromParams(p), LinkAll())`.

Previously, it had to be: "I want to use these variable values" ==> `vi = DynamicPPL.init!!(model, VarInfo(), InitFromParams(p))`, but there was no way to say that those parameters were also linked. So you had to do an extra linking step afterwards.

**Samplers**

This actually has an impact on Turing's Gibbs sampler. When a subsampler runs a model, it has to use values that are provided by other subsamplers, but it has to operate in its own space, which may be differently linked/unlinked compared to the previous sampler.
Traditionally, this has been accomplished using a `match_linking!!` function: https://github.com/TuringLang/Turing.jl/blob/90e367c99636a025643bf6a4f14a3056f73b00f6/src/mcmc/gibbs.jl#L538-L573. Essentially, before handing over to the next subsampler, we had to make sure that the VarInfo was linked according to what it expected. This has so far not really been an issue, but from DPPL 0.40 onwards `match_linking!!` requires an extra model evaluation, and for Gibbs that would quite quickly add up!

By separating the questions of "how are my input values provided" and "should I calculate logjac", you don't have to do a match_linking step at all: each subsampler is just responsible for remembering how it calculates logjac, and you can simply pass values around without worrying about whether they are linked or unlinked. (That's the idea, at least; Gibbs needs a fair bit more surgery before it can do that.)

## Benchmarks

I was dreading running the benchmarks, but there's basically zero difference in performance compared to `breaking`, which is honestly a huge relief.

<details><summary>Benchmark results</summary>

```
Smorgasbord                              main          breaking      py/linkstep
Constructor                        =>    4.694 ms      9.292 µs      9.208 µs
keys                               =>  411.067 ns    652.810 ns    633.300 ns
subset                             =>  496.209 µs    197.876 µs    192.458 µs
merge                              =>   15.584 µs    291.660 ns    286.197 ns
evaluate!! InitFromPrior           =>   12.708 µs      2.754 µs      2.688 µs
evaluate!! DefaultContext          =>   11.094 µs      1.080 µs      1.066 µs
unflatten!!                        =>  180.970 ns      1.014 µs    883.789 ns
link!!                             =>  151.937 µs     31.792 µs     31.834 µs
evaluate!! InitFromPrior, linked   =>   29.375 µs      6.823 µs      6.469 µs
evaluate!! DefaultContext, linked  =>    9.208 µs      2.644 µs      2.583 µs
unflatten!!, linked                =>  188.691 ns    963.235 ns    931.353 ns

Loop univariate 1k                       main          breaking      py/linkstep
Constructor                        =>  793.780 ms    173.958 µs    172.125 µs
keys                               =>  649.529 ns      7.344 µs      7.406 µs
subset                             =>  234.584 µs     18.175 ms     17.824 ms
merge                              =>  311.084 µs      2.986 µs      2.910 µs
evaluate!! InitFromPrior           =>   58.709 µs     22.375 µs     22.750 µs
evaluate!! DefaultContext          =>   58.083 µs      7.292 µs      7.063 µs
unflatten!!                        =>  821.429 ns      6.469 µs      8.396 µs
link!!                             =>  229.250 µs      1.035 ms      1.013 ms
evaluate!! InitFromPrior, linked   =>  246.459 µs     22.333 µs     22.583 µs
evaluate!! DefaultContext, linked  =>   70.458 µs     11.459 µs     11.771 µs
unflatten!!, linked                =>  887.235 ns      6.583 µs      6.188 µs

Multivariate 1k                          main          breaking      py/linkstep
Constructor                        =>   42.667 µs     24.459 µs     24.000 µs
keys                               =>   31.569 ns     43.880 ns     42.857 ns
subset                             =>    2.153 µs    426.186 ns    413.778 ns
merge                              =>    1.976 µs      2.254 ns      2.238 ns
evaluate!! InitFromPrior           =>   13.792 µs     12.625 µs     12.084 µs
evaluate!! DefaultContext          =>    8.334 µs      6.417 µs      5.959 µs
unflatten!!                        =>  827.381 ns      2.638 ns      2.559 ns
link!!                             =>   48.542 µs      8.250 µs      7.625 µs
evaluate!! InitFromPrior, linked   =>   14.250 µs     13.625 µs     13.250 µs
evaluate!! DefaultContext, linked  =>    7.979 µs      7.188 µs      6.583 µs
unflatten!!, linked                =>  791.679 ns      2.635 ns      2.560 ns

Dynamic                                  main          breaking      py/linkstep
Constructor                        =>   35.292 µs      3.306 µs      3.222 µs
keys                               =>   48.659 ns     46.282 ns     46.114 ns
subset                             =>   11.709 µs      2.871 µs      2.800 µs
merge                              =>    1.566 µs      4.118 ns      3.975 ns
evaluate!! InitFromPrior           =>    3.375 µs      2.264 µs      2.326 µs
evaluate!! DefaultContext          =>    1.234 µs    878.788 ns    861.091 ns
unflatten!!                        =>  114.380 ns      5.768 ns      5.537 ns
link!!                             =>  149.000 µs      3.307 µs      3.385 µs
evaluate!! InitFromPrior, linked   =>    6.448 µs      4.500 µs      3.469 µs
evaluate!! DefaultContext, linked  =>    2.608 µs      2.003 µs      1.925 µs
unflatten!!, linked                =>  116.129 ns      5.397 ns      5.163 ns

Parent                                   main          breaking      py/linkstep
Constructor                        =>   12.021 µs    386.364 ns    377.114 ns
keys                               =>   31.518 ns     43.715 ns     42.645 ns
subset                             =>  715.625 ns     35.938 ns     37.096 ns
merge                              =>  484.717 ns      2.305 ns      2.239 ns
evaluate!! InitFromPrior           =>   95.930 ns     30.588 ns     30.078 ns
evaluate!! DefaultContext          =>  100.410 ns      3.327 ns      3.226 ns
unflatten!!                        =>   41.185 ns      2.641 ns      2.556 ns
link!!                             =>   49.208 µs     93.254 ns     81.717 ns
evaluate!! InitFromPrior, linked   =>  297.680 ns     33.088 ns     32.052 ns
evaluate!! DefaultContext, linked  =>  119.073 ns     11.435 ns     11.029 ns
unflatten!!, linked                =>   41.075 ns      2.638 ns      2.557 ns

LDA                                      main          breaking      py/linkstep
Constructor                        =>  122.959 µs     19.458 µs     19.375 µs
keys                               =>  117.287 ns    134.014 ns    132.316 ns
subset                             =>   57.083 µs      3.042 µs      2.981 µs
merge                              =>    2.504 µs    166.436 ns    164.959 ns
evaluate!! InitFromPrior           =>    9.403 µs      7.583 µs      7.514 µs
evaluate!! DefaultContext          =>    8.292 µs      6.438 µs      6.261 µs
unflatten!!                        =>  127.765 ns    511.500 ns    499.305 ns
link!!                             =>  150.562 µs     11.834 µs     11.855 µs
evaluate!! InitFromPrior, linked   =>   11.375 µs      8.139 µs      7.625 µs
evaluate!! DefaultContext, linked  =>    7.986 µs      6.771 µs      6.781 µs
unflatten!!, linked                =>  123.391 ns    515.086 ns    495.763 ns
```

</details>
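As a purely hypothetical sketch of what the abstraction could buy: a custom strategy might treat only selected variables as linked. The predicate `generate_linked_value` and its Bool return are taken from the description above; everything else here (the struct name, its field, and whether user-defined strategies are supported at all) is an assumption, not DPPL's actual API:

```julia
# Hypothetical sketch only: a strategy that treats just the listed variables
# as being in linked space, following the per-VarName interface described
# above. The exact method to overload may differ in the real code.
struct LinkOnly{V<:Tuple} <: DynamicPPL.AbstractLinkStrategy
    linked_vns::V  # e.g. (@varname(x),)
end

# `vn` is treated as linked (and logjac accumulated) iff it is in the list.
function DynamicPPL.generate_linked_value(ls::LinkOnly, vn::VarName, tval)
    return vn in ls.linked_vns
end
```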
…ting a GrowableArray (#1253)

Closes #1241
Closes #1251

With this PR:

```julia
julia> using DynamicPPL

julia> vnt = @vnt begin
           x[1] := 1.0
       end
┌ Warning: Creating a growable `Base.Array` of dimension 1 to store values. This may not match the actual type or size of the actual `AbstractArray` that will be used inside the DynamicPPL model.
│
│ If this is not the type or size that you expect, please see: https://turinglang.org/docs/uri/growablearray
└ @ DynamicPPL.VarNamedTuples ~/ppl/dppl/src/varnamedtuple/partial_array.jl:823
VarNamedTuple
└─ x => PartialArray size=(1,) data::DynamicPPL.VarNamedTuples.GrowableArray{Float64, 1}
   └─ (1,) => 1.0

julia> vnt[@varname(x[end])]
┌ Warning: Returning a `Base.Array` with a presumed size based on the indices used to set values; but this may not be the actual type or size of the actual `AbstractArray` that was inside the DynamicPPL model. You should inspect the returned result to make sure that it has the correct value.
│
│ To find out how to avoid this warning, please see: https://turinglang.org/docs/uri/growablearray
└ @ DynamicPPL.VarNamedTuples ~/ppl/dppl/src/varnamedtuple/partial_array.jl:812
1.0
```

Previously, neither of these would warn.
…e_values!!` (#1258)

`update_values!!` just replaces the values in the VarInfo with values from a NamedTuple. This is trivially replaced by `InitFromParams()`.

`maybe_invlink_before_eval!!` isn't needed anymore: the intention behind it was to avoid doing transforms more than once, and the current transform strategy machinery already accomplishes that. See https://turinglang.org/DynamicPPL.jl/previews/PR1164/transforms/. See also #1249 for details on how this can be extended to support what is currently StaticTransformation.

Those two functions, along with a bunch of tests, were the only callers of `setindex!!(::VarInfo, value, vn::VarName)`. Since they're gone, that function no longer needs to exist either, so it can go too. It has been recognised as rather unsafe anyway, so just as well.

https://github.com/TuringLang/DynamicPPL.jl/blob/27b7a6de6623620336a1f6c013446ce43ade3a6a/src/varinfo.jl#L283-L288
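As a hedged sketch of the migration (the exact shape of the removed `update_values!!` call is reconstructed from the description above, and `p` is a placeholder NamedTuple):

```julia
# Before (removed in this PR): overwrite values in an existing VarInfo, e.g.
#   vi = update_values!!(vi, p)
# After: express the same intent as an init strategy, where p is a NamedTuple
# of values such as (; x = 1.0).
vi = last(DynamicPPL.init!!(model, vi, InitFromParams(p)))
```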
…API to avoid reevaluating (#1260)

Closes #1167
Closes #986

This PR renames `ValuesAsInModelAccumulator` to `RawValueAccumulator`. Reasons:

1. It's shorter.
2. I think it is a nice contrast to `VectorValueAccumulator`, which stores the (possibly linked) vectorised forms (i.e., it is *essentially* the replacement for VarInfo).

Maybe more importantly, it also gets rid of

```julia
values_as_in_model(model, include_colon_eq, varinfo)
```

and replaces it with

```julia
get_raw_values(varinfo)
```

The rationale for this is that the former, `values_as_in_model`, would *reevaluate* the model with the accumulator and then return the values in that accumulator. That's a bit wasteful, and for the most part in Turing we never did this, instead preferring to add the accumulator manually to a VarInfo and then extract it. See e.g. https://github.com/TuringLang/Turing.jl/blob/main/src/mcmc/prior.jl#L16.

The latter, `get_raw_values(varinfo)`, will error if the VarInfo doesn't have a `RawValueAccumulator`. That's a bit more annoying, but the main point here is to force developers to think more about which accumulators they should use and when, instead of just calling `values_as_in_model` and accidentally triggering a reevaluation.
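For illustration, a sketch of the pattern this PR pushes towards: attach the accumulator before the one evaluation you were going to do anyway, then read it back. The use of `setaccs!!` to attach accumulators and the zero-argument `RawValueAccumulator()` constructor are both assumptions here, not confirmed API:

```julia
# Attach the accumulator up front (setaccs!! and the zero-argument
# constructor are assumptions), evaluate once, then extract the raw values.
vi = DynamicPPL.setaccs!!(VarInfo(), (DynamicPPL.RawValueAccumulator(),))
vi = last(DynamicPPL.init!!(model, vi, InitFromPrior()))
raw = DynamicPPL.get_raw_values(vi)  # errors if the accumulator is absent
```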
This PR implements what I described in #1225.

## Cholesky

As I also said in that PR, this breaks MCMCChains on Cholesky variables. This is because the individual elements of a Cholesky are stored in the chain as `x.L[1,1]`, and when calling

```julia
DynamicPPL.templated_setindex!!(vnt, val, @varname(x.L[1,1]), template)
```

the current VNT code will attempt to create a NamedTuple for x, then a 2D GrowableArray for x.L, then stick `val` into the first element of that. I am kind of repeating myself here, but the true solution is to stop using MCMCChains. However, for the purposes of this PR, I can put in a hacky overload of `templated_setindex!!` for when `template isa LinearAlgebra.Cholesky` that will make this work in the expected way.

## Varying dimensionality

This will also break models that look like this:

```julia
@model function f()
    N ~ Poisson(2.0)
    x = Vector{Float64}(undef, N)
    for i in 1:N
        x[i] ~ Normal()
    end
end
```

The reason is that the template is picked up by running the model once. If `N` is not constant, then the template for `x` may be (for example) a length-2 vector. If you then attempt to use this template on a dataset that has `x[3]`, it will error.
…rm strategy (#1264)

See #1184. This PR is basically me trying to get people to stop using `evaluate!!(model, vi)`, which is pretty hard to reason about. It depends on a lot of things:

- the accs are taken from `vi`;
- the transform strategy is taken from `model.context` if it's an InitContext, and otherwise it's inferred from `vi`;
- the init strategy is taken from `model.context` if it's an InitContext, and otherwise it's inferred from `vi`.

(Yeah, that's quite nasty.) Instead, the intention is to push people towards using `init!!([rng,] model, ::OnlyAccsVarInfo, init_strategy, transform_strategy)` (see the sketch below), because the end goal for DPPL (at least, my end goal) is to have a *single* evaluation method, whose name is not yet determined, but which should take exactly those five arguments (perhaps in a different order). I'll write docs on this soon.

----

Unfortunately, we still need to keep the old two-argument `evaluate!!` method around in DynamicPPL, because things like `init!!` are defined in terms of it. It is possible to switch it round and make `evaluate!!` depend on `init!!`, but that's a bit harder to do, since it relies on getting rid of DefaultContext.
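As a rough sketch of the encouraged call, following the five-argument shape named above. The zero-argument `OnlyAccsVarInfo()` constructor form is an assumption, and `UnlinkAll()` is just one possible transform strategy:

```julia
using DynamicPPL, Distributions, Random

@model f() = x ~ LogNormal()

# Five arguments: rng, model, varinfo, init strategy, transform strategy.
# OnlyAccsVarInfo carries only accumulators, so no values are stored.
retval, vi = DynamicPPL.init!!(
    Xoshiro(468), f(), DynamicPPL.OnlyAccsVarInfo(), InitFromPrior(), UnlinkAll()
)
```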
Release 0.40