
v1.1.2


github-actions released this 26 Sep 12:23

Changes:

Highlights

  • Different smooth functions can share the same penalty. See commit 38e43da
  • Added the tensor smooth basis and penalty construction method of Wood, S. N. (2006; https://doi.org/10.1111/j.1541-0420.2006.00574.x). See commit 588c991
  • Improved model selection behavior for general smooth models for which coefficients have been deemed unidentifiable during estimation. See commit 2c51e69
  • Reworked the random slope class rs. It now behaves exactly like s(fact,cov,bs='re',by=fact2) in mgcv. See commit 7a14309

API changes:

  • The list of model matrices passed to the llk/grad/hess functions of the GSMMFamily now contains None for matrices excluded from building (via the build_mat argument of the GSMM.fit() method). Previously, only the matrices marked for building were passed along, which made keeping track of their indices cumbersome in practice.
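     A minimal, self-contained sketch of a llk callback that guards against None entries; the signature and the Gaussian log-likelihood used here are illustrative assumptions, not the exact GSMMFamily interface:

     ```python
     import numpy as np

     def llk(coef, coefs_split_idx, ys, Xs):
         # Xs may now contain None for model matrices excluded via
         # build_mat, so each entry has to be checked before use.
         total = 0.0
         coefs = np.split(coef, coefs_split_idx)
         for y, X, b in zip(ys, Xs, coefs):
             if X is None:  # matrix was excluded from building
                 continue
             eta = X @ b  # linear predictor for this parameter
             total += -0.5 * np.sum((y - eta) ** 2)  # illustrative Gaussian llk
         return total

     # Toy check: the second model matrix is excluded (None) and simply skipped.
     X0 = np.ones((3, 1))
     y0 = np.ones(3)
     val = llk(np.array([1.0, 2.0]), [1], [y0, y0], [X0, None])
     ```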
  • Extra penalties and a callback function can now be passed to GSMM.fit(). See commit 7bed558.
  • Different smooth functions in the models of different parameters can now share the same smoothing penalty parameter. This required reworking the behavior of the id keyword; the changes are documented in the docstrings of the f, ri, and rs classes that support it. A simple example is presented below, in which the same smoothing penalty is estimated for the functions of "x0" and "x2", included in the models of the mean and scale parameter of the Gamma distribution respectively:
     # Assumes the relevant mssm classes (Formula, lhs, i, f, fs, GSMM,
     # GAMLSSGSMMFamily, GAMMALS, LOG, LOGb, sim12) are imported.
     sim_dat = sim12(5000, c=0, seed=0, family=GAMMALS([LOG(), LOG()]),
                     n_ranef=20)

     # id=1 ties the smoothing penalty of f(["x0"]) in the mean model
     # to that of f(["x2"]) in the scale model.
     sim_formula_m = Formula(lhs("y"),
                             [i(), f(["x0"], id=1), f(["x1"]), fs(["x0"], rf="x4")],
                             data=sim_dat)

     sim_formula_scale = Formula(lhs("y"),
                                 [i(), f(["x2"], id=1), f(["x3"])],
                                 data=sim_dat)

     family = GAMMALS([LOG(), LOGb(-0.0001)])
     gsmm_fam = GAMLSSGSMMFamily(2, family)
     model = GSMM([sim_formula_m, sim_formula_scale], gsmm_fam)
     model.fit()
    
  • Implemented the tensor smooth basis and penalty construction method of Wood, S. N. (2006). This yields much better estimates in simulation studies, especially when relying on the L-qEFS update to estimate smoothing penalty parameters. It required some changes and additions to the keywords passed to the f class constructor (specifically the rp keyword and the new scale_te keyword). The new construction is not yet the default, but this might change with the next release, pending more extensive simulations. The example below shows how to enable it:
     # Assumes the relevant mssm classes (Formula, lhs, i, f, GSMM,
     # GAMLSSGSMMFamily, GAMMALS, Gamma, LOG, LOGb, sim15) are imported.
     sim_dat = sim15(500, 2, c=0, seed=0, family=Gamma())

     # te=True requests a tensor smooth; rp=2 and scale_te=True enable
     # the Wood (2006) basis and penalty construction.
     sim_formula_m = Formula(lhs("y"),
                             [i(), f(["x1", "x2"], te=True, rp=2, scale_te=True)],
                             data=sim_dat)

     sim_formula_scale = Formula(lhs("y"), [i()], data=sim_dat)

     family = GAMMALS([LOG(), LOGb(-0.0001)])
     gsmm_fam = GAMLSSGSMMFamily(2, family)
     model = GSMM([sim_formula_m, sim_formula_scale], gsmm_fam)
     model.fit(method='qEFS')
    
  • The rs class has been reworked completely. It now works exactly like random effects in mgcv, which is much more flexible than the original implementation. The changes are documented in the docstring of the rs class; in summary:
    • s(fact,bs='re') in mgcv is rs(["fact"]) in mssm
    • s(cov,bs='re') in mgcv is rs(["cov"]) in mssm
    • s(fact,cov,bs='re') in mgcv is rs(["fact","cov"]) in mssm
    • s(fact,cov,bs='re',by=fact2) in mgcv is rs(["fact","cov"],by="fact2") in mssm
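    To see what the factor-covariate terms encode: the model-matrix columns for rs(["fact","cov"]) correspond to the factor's one-hot columns multiplied element-wise by the covariate, just as for s(fact,cov,bs='re') in mgcv. A hand-rolled numpy sketch (not mssm code, hypothetical data):

     ```python
     import numpy as np

     # Hypothetical data: a factor with 3 levels and a covariate.
     fact = np.array([0, 1, 2, 0, 1])
     cov = np.array([0.5, 1.0, 2.0, 1.5, 2.5])

     # One-hot columns of the factor: these are the random-intercept
     # columns produced by a term like rs(["fact"]).
     Z_int = np.eye(3)[fact]

     # Random-slope columns for rs(["fact", "cov"]): each one-hot column
     # is scaled by the covariate, giving one slope per factor level.
     Z_slope = Z_int * cov[:, None]
     ```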

Functionality changes:

  • The model selection code for more general models now handles models for which coefficients were deemed unidentifiable during estimation more robustly

Bug fixes

  • Fixed a bug in the computation of the expected partial derivatives of the GAMMALS distribution that resulted in poor scaling. See commit b64b534
  • Fixed a bug in how higher-order interactions of linear terms are coded when relying on the l class and li function. See commit b9133d9
  • Fixed a bug that caused segmentation faults on some systems and for some models. See commit 6e7e5b9

Efficiency

  • Model selection code can now rely on the multiprocess package to parallelize computations for more general models. This requires installing mssm with the extra [mp] dependency
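    Installing the extra follows the usual pip extras syntax:

     ```shell
     # Install mssm together with the optional multiprocess dependency
     pip install "mssm[mp]"
     ```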

Convenience updates

  • All Python code has been reformatted with Black and is now linted with Flake8
  • The documentation and tutorial have been updated to reflect the changes introduced by this release