Releases: JoKra1/mssm

v1.1.2

26 Sep 12:23

Changes:

Highlights

  • Different smooth functions can share the same penalty. See commit 38e43da
  • Added the tensor smooth basis and penalty construction method of Wood, S. N. (2006; https://doi.org/10.1111/j.1541-0420.2006.00574.x). See commit 588c991
  • Improved model selection behavior for general smooth models for which coefficients have been deemed unidentifiable during estimation. See commit 2c51e69
  • Reworked the random slope class rs, which now behaves exactly like s(fact, cov, bs='re', by=fact2) in mgcv. See commit 7a14309

API changes:

  • The list of model matrices passed to the llk/grad/hess functions of the GSMMFamily now contains None for matrices excluded from building (via the build_mat argument of the GSMM.fit() method). Previously, only matrices marked for building were passed along, which made it cumbersome to keep track of their indices.
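A custom likelihood function has to cope with these None placeholders. The following is a minimal, dependency-free sketch of one way to do that; the helper linear_predictors and its argument layout are assumptions made for illustration and are not part of the mssm API:

```python
def linear_predictors(coef_blocks, Xs):
    """Compute one linear predictor per parameter, skipping unbuilt matrices.

    coef_blocks: list of coefficient lists, one per distribution parameter.
    Xs: list of model matrices (lists of rows); an entry is None when the
        matrix was excluded from building.
    Both names are hypothetical and only mirror the positional layout
    described above.
    """
    etas = []
    for coef, X in zip(coef_blocks, Xs):
        if X is None:
            # The position is preserved, so index i still refers to parameter i.
            etas.append(None)
            continue
        etas.append([sum(x * b for x, b in zip(row, coef)) for row in X])
    return etas


# The second matrix was not built, so its slot holds None.
Xs = [[[1.0, 2.0], [1.0, 3.0]], None]
coefs = [[0.5, 1.0], [0.1]]
etas = linear_predictors(coefs, Xs)
```

Because the positions in the list never shift, the index of a matrix always matches the index of the parameter it belongs to.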
  • Extra penalties and a callback function can now be passed to GSMM.fit(). See commit 7bed558.
  • Smooth functions in the models of different parameters can now share the same smoothing penalty parameter. This required reworking the behavior of the id keyword; the changes are documented in the docstrings of the classes that support this keyword (f, ri, and rs). The example below estimates a single smoothing penalty for the functions of "x0" and "x2", which are included in the models of the mean and scale parameter of the Gamma distribution, respectively:
     sim_dat = sim12(5000, c=0, seed=0, family=GAMMALS([LOG(), LOG()]),
     	n_ranef=20)
    
     sim_formula_m = Formula(lhs("y"),
     	[i(), f(["x0"], id=1), f(["x1"]), fs(["x0"], rf="x4")],
     	data=sim_dat)
     
     sim_formula_scale = Formula(lhs("y"),
     	[i(), f(["x2"], id=1), f(["x3"])],
     	data=sim_dat)
     
     family = GAMMALS([LOG(), LOGb(-0.0001)])
     gsmm_fam = GAMLSSGSMMFamily(2, family)
     model = GSMM([sim_formula_m, sim_formula_scale], gsmm_fam)
     model.fit()
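Conceptually, sharing a penalty parameter means that one smoothing parameter scales the penalties of several terms. The sketch below illustrates this idea only, assuming quadratic penalties of the form lambda * beta' S beta; the helper names and the ids list are made up for this example and do not correspond to mssm internals:

```python
def quad_form(beta, S):
    """Return beta' S beta for a coefficient list and a square matrix."""
    n = len(beta)
    return sum(beta[i] * S[i][j] * beta[j] for i in range(n) for j in range(n))


def total_penalty(betas, Ss, lambdas, ids):
    """Sum the penalties of all terms.

    ids[k] selects the entry of lambdas that scales term k, so terms with
    the same id share one smoothing parameter.
    """
    return sum(
        lambdas[ids[k]] * quad_form(beta, S)
        for k, (beta, S) in enumerate(zip(betas, Ss))
    )


# Second-difference penalty for three coefficients.
S = [[1, -2, 1], [-2, 4, -2], [1, -2, 1]]
betas = [[0.0, 1.0, 0.0], [1.0, 1.0, 1.0], [0.0, 0.0, 1.0]]
# Terms 0 and 2 share lambdas[0]; term 1 has its own parameter.
penalty = total_penalty(betas, [S, S, S], lambdas=[2.0, 10.0], ids=[0, 1, 0])
```

Only two smoothing parameters have to be estimated here, even though three terms are penalized.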
    
  • I have implemented the tensor smooth basis and penalty construction method of Wood, S. N. (2006). In simulation studies this yields much better estimates, especially when the L-qEFS update is used to estimate smoothing penalty parameters. This required some changes and additions to the keywords passed to the f class constructor (specifically the rp keyword and the new scale_te keyword). The new construction is not yet the default, but this might change with the next release, pending more extensive simulations. The example below shows how to enable the new construction:
     sim_dat = sim15(500, 2, c=0, seed=0, family=Gamma())
     
     sim_formula_m = Formula(lhs("y"),
     	[i(),f(["x1", "x2"],te=True,rp=2,scale_te=True)],
     	data=sim_dat)
     	
     sim_formula_scale = Formula(lhs("y"), [i()], data=sim_dat)
     
     family = GAMMALS([LOG(), LOGb(-0.0001)])
     gsmm_fam = GAMLSSGSMMFamily(2, family)
     model = GSMM([sim_formula_m, sim_formula_scale], gsmm_fam)
     model.fit(method='qEFS')
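For orientation, the generic structure of a tensor product smooth can be sketched without any dependencies: the model matrix is the row-wise Kronecker product of the marginal model matrices, and each margin contributes one Kronecker-structured penalty. This is an illustration under simplifying assumptions; it does not reproduce the marginal reparameterization of Wood (2006) or the way mssm builds these matrices, and all helper names are hypothetical:

```python
def row_kron(A, B):
    """Row-wise Kronecker product: row i of the result is kron(A[i], B[i])."""
    return [[a * b for a in ra for b in rb] for ra, rb in zip(A, B)]


def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]


def identity(n):
    """Return the n x n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


# Two marginal model matrices, evaluated at the same two observations.
X1 = [[1.0, 0.0], [0.5, 0.5]]
X2 = [[0.2, 0.8], [1.0, 0.0]]
X_te = row_kron(X1, X2)

# Marginal first-difference penalties.
S1 = [[1, -1], [-1, 1]]
S2 = [[1, -1], [-1, 1]]
# One penalty per margin, each with its own smoothing parameter.
P1 = kron(S1, identity(2))
P2 = kron(identity(2), S2)
```

Each of P1 and P2 penalizes wiggliness along one margin only, which is what allows a separate smoothing parameter per covariate.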
    
  • The rs class has been reworked completely. It now works exactly like random effects in mgcv, which is much more flexible than my original implementation. The changes have been documented in the docstring of the rs class. In summary:
    • s(fact,bs='re') in mgcv is rs(["fact"]) in mssm
    • s(cov,bs='re') in mgcv is rs(["cov"]) in mssm
    • s(fact,cov,bs='re') in mgcv is rs(["fact","cov"]) in mssm
    • s(fact,cov,bs='re',by=fact2) in mgcv is rs(["fact","cov"],by="fact2") in mssm
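For readers unfamiliar with the mgcv terms listed above, the sketch below shows the kind of model matrix that a factor-based random effect implies: one column per level of the factor, holding a 1 (random intercept) or the covariate value (random slope) in the rows that belong to that level. It is a dependency-free illustration covering only the fact and fact-plus-cov cases; the helper name re_model_matrix is hypothetical, and this is not how mssm builds the matrix internally:

```python
def re_model_matrix(fact, cov=None):
    """Build a random-effect model matrix with one column per factor level.

    fact: list of level labels, one per observation.
    cov:  optional list of covariate values; if given, each row holds the
          covariate value in the column of its level (random slope),
          otherwise a 1 (random intercept).
    """
    levels = sorted(set(fact))
    column = {level: j for j, level in enumerate(levels)}
    X = []
    for i, level in enumerate(fact):
        row = [0.0] * len(levels)
        row[column[level]] = 1.0 if cov is None else float(cov[i])
        X.append(row)
    return levels, X


fact = ["a", "b", "a"]
cov = [2.0, 3.0, 5.0]
levels, X_intercept = re_model_matrix(fact)
_, X_slope = re_model_matrix(fact, cov)
```

Terms of this kind are usually paired with an identity penalty, so that the coefficients behave like independent random effects with a common variance.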

Functionality changes:

  • The model selection code for more general models now handles models in which some coefficients were deemed unidentifiable during estimation more reliably

Bug fixes

  • Fixed a bug in the computation of the expected partial derivatives of the GAMMALS distribution that resulted in poor scaling. See commit b64b534
  • Fixed a bug in how higher-order interactions of linear terms are coded when relying on the l class and li function. See commit b9133d9
  • Fixed a bug that caused segmentation faults on some systems and for some models. See commit 6e7e5b9

Efficiency

  • Model selection code can now rely on the multiprocess package to parallelize computations for more general models. This requires installing mssm with the optional [mp] extra (for example, pip install "mssm[mp]")

Convenience updates

  • All Python code has been reformatted with "black" and is now linted with Flake8
  • The documentation and tutorial have been updated to reflect the changes introduced by this release

v1.1.1

04 Aug 15:00

Changes:

Bug fixes:

  • Fixed a bug in the estimation code for GAMMs that raised an error in the rare case that not all Fisher weights were valid when the outer algorithm, which estimates the smoothing penalty parameters, converged.

Test cases:

  • Added more test cases for drop handling and "ar1" models of the residuals.

v1.1.0

31 Jul 17:00
5b9e5cf

Changes:

Convenience and API changes:

  • GAMM now inherits from GAMMLSS, which inherits from GSMM. As a result, all models now share a largely unified set of instance variables. For example, model.overall_penalties now points to a list of estimated penalties for all models. Similarly, model.coef, model.preds, and model.mus correspond to the overall coefficient vector, a list holding the linear predictors, and a list holding the estimated parameters (e.g., means) for all models. Finally, model.formulas holds all formulas passed to the constructor of any model; for a GAMM, this list always holds exactly one formula.
  • Terms now handle model matrix and penalty building, absorb constraints, and compute info about their coefficients. This drastically simplified the Formula class. Functions to build all penalties and model matrices are also no longer methods of the Formula class.
  • Introduced a new Penalty class, which makes it much easier for users to add their own implementations to mssm. The new Programmer's Guide section of the documentation describes how to do this and also describes how users can add their own marginal smooth basis function.
  • The .get_resid method of the GAMMLSS and GSMM classes now supports keyword arguments.
  • All fit methods, as well as the compare_CDL function, now have much better defaults for all arguments and adapt to the specific model.
  • The documentation has been fully reworked to reflect all these changes. In addition, all functions not prefixed as private now correctly specify the expected type of each argument and return value and come with descriptions of what they do. Non-internal functions come with a list of code examples.

Functionality changes:

  • p-value computation for smooth terms now relies on the method by Davies to evaluate the generalized chi-square distribution. In simulations, coverage is now nearly nominal for univariate terms and only slightly worse for tensor terms.
  • By default the Formula class now checks for nested tensor terms.
  • The GAMM class now supports "ar1" models of the residuals for both Gaussian and non-Gaussian models.
  • The compare_CDL function can now omit the computation of the second correction term by Wood, Pya, & Säfken (2016), which can drastically speed up computations.
  • Martingale and deviance residuals have been implemented for the PropHaz family.
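As background for the "ar1" residual model mentioned above: under a first-order autoregressive model with lag-1 correlation rho, a residual series can be decorrelated by subtracting rho times the previous residual and rescaling. The sketch below shows this standard transformation for a single series; it is an illustration under that assumption, the function name ar1_whiten is hypothetical, and it is not necessarily how mssm handles the "ar1" model internally:

```python
import math


def ar1_whiten(resid, rho):
    """Decorrelate a residual series under an assumed AR(1) model.

    resid: list of residuals in time order (a single series).
    rho:   assumed lag-1 autocorrelation, strictly between -1 and 1.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    if not resid:
        return []
    scale = math.sqrt(1.0 - rho ** 2)
    # The first residual has no predecessor and is kept unchanged.
    whitened = [resid[0]]
    for previous, current in zip(resid, resid[1:]):
        # Remove the part predicted by the previous residual, then rescale
        # so that the variance matches that of the original series.
        whitened.append((current - rho * previous) / scale)
    return whitened


whitened = ar1_whiten([1.0, 1.0, 0.4], rho=0.6)
```

With rho set to 0 the series is returned unchanged, which corresponds to the usual independence assumption.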

Efficiency

  • Model comparison code has been moved to C++.
  • The efficiency of the L-qEFS update has also been improved.

v1.0.1

13 Jun 16:02

v1.0.0

13 Jun 15:50

v0.9.4

24 Feb 08:44

v0.9.3

20 Jan 17:36

v0.9.2

09 Jan 17:51

v0.9.1

05 Jan 17:27

v0.9.0

05 Jan 16:07
