Conversation

@shravanngoswamii

No description provided.

@github-actions

Turing.jl documentation for PR #2755 is available at:
https://TuringLang.github.io/Turing.jl/previews/PR2755/

@codecov

codecov bot commented Jan 17, 2026

Codecov Report

❌ Patch coverage is 61.44578% with 32 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.09%. Comparing base (8762fa3) to head (efb745d).

Files with missing lines    Patch %    Lines
src/mcmc/callbacks.jl       61.44%     32 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2755      +/-   ##
==========================================
- Coverage   86.65%   85.09%   -1.56%     
==========================================
  Files          20       21       +1     
  Lines        1259     1342      +83     
==========================================
+ Hits         1091     1142      +51     
- Misses        168      200      +32     


@penelopeysm penelopeysm left a comment

As far as I can tell, the main reason for all of this code is to satisfy names_and_values, which is only used in the TensorBoard callback.

Firstly, I think the AbstractMCMC part should be changed. names_and_values should be moved into the TensorBoard extension. It's not used anywhere else in AbstractMCMC/src, and it's not meant to be overloaded. On the bright side, you did declare it internal so you can do this without calling it a breaking change in semver.

Once that's done, we "only" need to overload the following methods:

  • getparams and getstats -- okay (modulo my comment below about the return type of getparams)
  • hyperparam_metrics and _hyperparams_impl -- not so okay. These functions are not exported, and are not obviously part of the AbstractMCMC API. The latter has a comment above it saying "internal". That means that if AbstractMCMC changes them in a patch or minor release, which could conceivably be semver-compliant from the AbstractMCMC side, then Turing code will suddenly stop working. This is actually quite a common occurrence: just a week ago I had to fix a case where we overloaded a DynamicPPL internal function (#2748).

Before this can be merged into Turing, I would at least like to see the hyperparams_... functions in AbstractMCMC be made part of a public API.

I'm less fussed about this, but I also think it should be made really explicit, by naming, that their existence is solely for TensorBoard logging. That is, in src you can do:

function tb_hyperparam_metric end

and then the default definition can live in AbstractMCMCTensorBoardExt.
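
For concreteness, a hypothetical sketch of the extension side (module name, signature, and default return value are assumptions, not existing API):

module AbstractMCMCTensorBoardExt

using AbstractMCMC: AbstractMCMC
using TensorBoardLogger: TensorBoardLogger

# Default method for the stub declared in AbstractMCMC/src; returns no
# hyperparameter metrics unless a sampler package opts in.
AbstractMCMC.tb_hyperparam_metric(model, sampler) = String[]

end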

Comment on lines +11 to +21
function _varinfo_params(vi::DynamicPPL.AbstractVarInfo)
    vns = keys(vi)
    return Iterators.flatmap(vns) do vn
        val = DynamicPPL.getindex_internal(vi, vn)
        if val isa AbstractArray
            [string(vn, "[", i, "]") => v for (i, v) in enumerate(val)]
        else
            [string(vn) => val]
        end
    end
end
Member

You can use AbstractPPL.varname_and_value_leaves to do this sort of "split the VarNames up" operation:

iters = map(
    AbstractPPL.varname_and_value_leaves, keys(vi), values(vi),
)
vns_and_vals = mapreduce(collect, vcat, iters)

and then convert the VarNames into Strings.
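
For example, assuming each element of vns_and_vals is a (VarName, value) pair, the conversion could be as simple as:

name_value_pairs = [string(vn) => val for (vn, val) in vns_and_vals]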

But personally I think it's a mistake to do this.

Firstly, any interface that requires breaking up VarNames into sub-VarNames (individual real elements) is inherently dodgy. If you absolutely have to do so (e.g. for plotting purposes), it should be done as late as possible, i.e. immediately before you need to use it.

Secondly, the use of getindex_internal means that the results are prone to misinterpretation. VarInfo can store transformed values, that may be different from what a user thinks x represents, and getindex_internal will always return the transformed value. For example:

julia> using DynamicPPL, Distributions

julia> @model function f()
           x ~ Beta(2, 2)
           return (x, DynamicPPL.getindex_internal(__varinfo__, @varname(x)))
       end
f (generic function with 2 methods)

julia> model = f(); vi = VarInfo(model);

julia> first(DynamicPPL.evaluate!!(model, vi))
(0.5759135088785118, [0.5759135088785118])

julia> linked = DynamicPPL.link!!(vi, model);

julia> first(DynamicPPL.evaluate!!(model, linked))
(0.5759135088785118, [0.30602006830078243])

Sometimes it not only has a different value, but also a completely different structure. This can happen with Dirichlet (the size will be different), or with things like ProductNamedTupleDistribution (x will be a NamedTuple, but getindex_internal will always return a Vector). In such cases, claiming that the variable is called x[1] etc. can be misleading.
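
A hedged illustration of the Dirichlet case (not run here; the exact linked length depends on the simplex bijector DynamicPPL uses):

using DynamicPPL, Distributions

@model function g()
    x ~ Dirichlet(3, 1.0)  # x is a length-3 simplex
end

model = g()
vi = DynamicPPL.link!!(VarInfo(model), model)

# The user-facing x is a length-3 simplex, but the linked internal
# representation is expected to be a shorter unconstrained vector, so
# names like x[1], x[2], x[3] would not line up with its entries.
DynamicPPL.getindex_internal(vi, @varname(x))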

Thirdly, shouldn't getparams be returning a vector of Reals anyway (without the names) -- see below?

Comment on lines +3 to +9
function _get_lp(vi::DynamicPPL.AbstractVarInfo)
    lp = DynamicPPL.getlogp(vi)
    if lp isa NamedTuple
        return sum(values(lp))
    end
    return lp
end
Member

lp is always a NamedTuple in the current version of DynamicPPL.
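
So the helper could presumably collapse to a one-liner (a sketch that assumes the NamedTuple-returning getlogp):

_get_lp(vi::DynamicPPL.AbstractVarInfo) = sum(values(DynamicPPL.getlogp(vi)))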

Comment on lines +23 to +25
###
### getparams - Extract named parameters from sampler states
###
Member

This is inconsistent with the docstring of getparams, which says

Retrieve the values of parameters from the sampler's state as a Vector{<:Real}.

https://github.com/TuringLang/AbstractMCMC.jl/blob/31db8941ab589101314345476915514d3e5b3716/src/AbstractMCMC.jl#L149-L154

Isn't it also inconsistent with what names_and_values expects here anyway?

https://github.com/TuringLang/AbstractMCMC.jl/blob/31db8941ab589101314345476915514d3e5b3716/src/callbacks.jl#L160-L161

Overall, I would suggest just returning a vector of parameters here and letting the variable names be theta[1], theta[2], ... Sure, it's not very informative; but (1) it can be improved upon later, and (2) under-promising is better than overcommitting by giving variable names that aren't necessarily correct.

If you do this, please put the implementations of getparams and getstats inside the individual samplers' files (i.e. put the methods for HMCState near the definition of the HMCState struct, etc.)
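
Roughly this, as a sketch (the vi field name and the vi[:] indexing are assumptions about the state internals):

# Placed next to the HMCState definition: return a flat Vector{<:Real};
# the callback can then label entries theta[1], theta[2], ... itself.
AbstractMCMC.getparams(state::HMCState) = state.vi[:]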

Member

Correct naming is important for visualization purposes. @penelopeysm, do you think it would make sense to introduce a two-argument version of getparams, for example:

AbstractMCMC.getparams(state, ::Val{true}) = ...  # return both names and values as Pairs

AbstractMCMC.getparams(state, ::Val{false}) = AbstractMCMC.getparams(state)

This would allow callers to explicitly request parameter names when needed, while preserving the existing behavior by default.

@penelopeysm penelopeysm Jan 19, 2026

Sure, I agree the interface should be extended. I find Val{true} and Val{false} not very descriptive though, and generally I prefer to have functions with different names. I had half an interface worked out for this kind of thing before -- I would suggest the following:

# Default implementation, can be overloaded if needed. Sometimes it doesn't really
# make sense to provide names e.g. if you are using AdvancedHMC directly.
function getnamedparams(state)
    params = getparams(state)
    default_names = ["x[$i]" for i in 1:length(params)]
    return zip(default_names, params)
end
# Can use the model as well if needed (but in most cases you shouldn't need to since
# the state should generally contain the necessary info).
function getnamedparams(model::AbstractModel, state)
    params = getparams(model, state)
    default_names = ["x[$i]" for i in 1:length(params)]
    return zip(default_names, params)
end

# Samplers must implement either getparams(model, state) or getparams(state).
# This is already the behaviour in current AbstractMCMC, changing this would be
# breaking, so please don't change it
function getparams(model::AbstractModel, state)
    return getparams(state)
end

Downstream functions should only use the two-argument versions (which allows the sampler to overload whichever method is most relevant for them).
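
Downstream usage in the callback would then look roughly like this (illustrative only):

# Always go through the named two-argument form downstream:
for (name, value) in getnamedparams(model, state)
    # log `value` under `name` in TensorBoard ...
end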

@yebai yebai Jan 19, 2026

In that case, it may be preferable to make a breaking release so that AbstractMCMC.getparams(model::AbstractModel, state) and AbstractMCMC.getparams(state) always return both names and values. If names are missing, we can default to "θ[$i]". This would also make the interface consistent with AbstractMCMC.getstats.

However, this would likely be too much of a distraction for @shravanngoswamii at the moment, as it would require updating all existing getparams implementations across MCMC packages. As an interim solution, could we instead adopt the following convention:

AbstractMCMC.getparams(state, model::AbstractModel)  # return names and values
AbstractMCMC.getparams(state)                        # return values as a vector of floats 

We could then revisit this and update AbstractMCMC.getparams to always return both names and values in the next breaking release.
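
As a sketch of that interim convention (default naming as above; purely illustrative):

# Names-and-values variant, falling back to the existing vector-returning method:
function AbstractMCMC.getparams(state, model::AbstractMCMC.AbstractModel)
    vals = AbstractMCMC.getparams(state)
    return ["θ[$i]" => v for (i, v) in enumerate(vals)]
end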

Member

Instead of switching the arguments around, isn't it just cleaner to have a separate function?

This feels like an easy way to get accidental bugs by writing the order of the arguments wrongly and having it do something weird instead of throwing a MethodError.

@yebai yebai Jan 19, 2026

Okay, the most friendly option for @shravanngoswamii is probably to tweak _names_and_values and export it. For example:

names_and_values(
    model,
    sampler,
    transition,
    state;
    param = true,
    stats = false,
    hyperparams = false,
    extra = false,
)

At some point, we should unify getparams and getstats into names_and_values in some form.
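
A caller requesting stats as well would then look something like (keywords as proposed above; hypothetical):

pairs = names_and_values(
    model, sampler, transition, state;
    param=true, stats=true, hyperparams=false, extra=false,
)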

Comment on lines +133 to +135
###
### getparams/getstats from transitions (ParamsWithStats)
###
Member

Why are these methods needed -- when do we need to call these functions on transitions?

Comment on lines +13 to +21
@testset "HMCState getparams" begin
Random.seed!(42)
chain = sample(gdemo_default, NUTS(100, 0.65), 10; progress=false)

rng = Random.default_rng()
transition, state = AbstractMCMC.step(
rng,
gdemo_default,
NUTS(100, 0.65);
Member

  1. The call to seed! is not needed since you aren't checking the values the sampler produces. If you don't need the RNG to be predictable, don't seed it in the test suite, as otherwise it can imply that it needs to be predictable.

  2. I think you can loop over the samplers and use the same model @model f() = x ~ Normal() for all of them, to avoid the code repetition; a sketch follows below.
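
A sketch of what that loop could look like (sampler constructors and the direct step call follow the pattern already used in this PR's tests):

using Turing, AbstractMCMC, Random, Test

@model f() = x ~ Normal()
model = f()

@testset "getparams from states" begin
    for spl in (NUTS(100, 0.65), HMC(0.1, 10), MH())
        _, state = AbstractMCMC.step(Random.default_rng(), model, spl)
        # Assumes getparams returns a flat vector, per the suggestion above.
        @test AbstractMCMC.getparams(state) isa AbstractVector{<:Real}
    end
end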

Comment on lines +146 to +148
###
### hyperparam_metrics - Define TensorBoard hyperparam metrics
###
@penelopeysm penelopeysm Jan 17, 2026

If these overloads are only needed for TensorBoard, they can be moved into a Turing TensorBoard extension; there's no need to keep them in src.

In general, code should be given 'as little power as possible': if there's no need for anybody to import and use this code outside of the TensorBoard extension, then it should live in the extension.
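
A hypothetical skeleton of such a Turing extension (module name and layout are assumptions):

# ext/TuringTensorBoardExt.jl -- only loaded when TensorBoardLogger is present.
module TuringTensorBoardExt

using Turing: Turing
using TensorBoardLogger: TensorBoardLogger

# The hyperparam-metric overloads from src/mcmc/callbacks.jl would move here,
# so they only exist when TensorBoard logging is actually in use.

end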

@testset "getparams from states" begin
@testset "HMCState getparams" begin
Random.seed!(42)
chain = sample(gdemo_default, NUTS(100, 0.65), 10; progress=false)
Member

Suggested change
chain = sample(gdemo_default, NUTS(100, 0.65), 10; progress=false)

This sample call seems unused?

Comment on lines +150 to +174
function AbstractMCMC.hyperparam_metrics(model::DynamicPPL.Model, sampler::NUTS)
    return [
        "extras/acceptance_rate/stat/Mean",
        "extras/max_hamiltonian_energy_error/stat/Mean",
        "extras/lp/stat/Mean",
        "extras/n_steps/stat/Mean",
        "extras/tree_depth/stat/Mean",
    ]
end

function AbstractMCMC.hyperparam_metrics(model::DynamicPPL.Model, sampler::Hamiltonian)
    return [
        "extras/acceptance_rate/stat/Mean",
        "extras/lp/stat/Mean",
        "extras/n_steps/stat/Mean",
    ]
end

function AbstractMCMC.hyperparam_metrics(model::DynamicPPL.Model, sampler::MH)
    return ["extras/lp/stat/Mean"]
end

function AbstractMCMC.hyperparam_metrics(model::DynamicPPL.Model, sampler::PG)
    return ["extras/lp/stat/Mean", "extras/logevidence/stat/Mean"]
end
Member

@shravanngoswamii I suggest removing AbstractMCMC.hyperparam_metrics entirely. For this purpose, hyperparameters can be treated in the same manner as model parameters, with stats_options used to specify which statistics should be computed. This avoids maintaining a separate mechanism for hyperparameter metrics and simplifies the interface.

Member

Agree, I'm happy to bundle it into stats.

### _hyperparams_impl - Extract sampler hyperparameters
###

function AbstractMCMC._hyperparams_impl(
@yebai yebai Jan 19, 2026

EDIT: See #2755 (comment)

I propose unifying AbstractMCMC._hyperparams_impl with AbstractMCMC.getstats (see the references linked above). At present, getstats is defined only for a single argument. A natural extension would be to introduce

AbstractMCMC.getstats(state, sampler)

and to omit the model argument from AbstractMCMC._hyperparams_impl, as it is not required for this functionality.

Note that some hyperparameters may be constant across MCMC steps. This is compatible with the proposed interface: such hyperparameters can simply return the same value at each step without special handling. Hyperparameters that are adapted during the warmup phase may vary across warmup iterations, but are expected to remain fixed thereafter.
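
A minimal sketch of that extension (the fallback behaviour is an assumption):

# Proposed two-argument form; by default, ignore the sampler and defer to the
# existing single-argument method.
AbstractMCMC.getstats(state, sampler) = AbstractMCMC.getstats(state)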

@yebai

yebai commented Jan 19, 2026

Thanks @shravanngoswamii — I’ve added a few refactoring comments above. The functionality is already in place, but it could benefit from some minor cleanup, which I agree with Penny on.
