Releases: JoKra1/mssm
v1.1.2
Changes:
Highlights
- Different smooth functions can share the same penalty. See commit 38e43da
- Added the tensor smooth basis and penalty construction method of Wood, S. N. (2006; https://doi.org/10.1111/j.1541-0420.2006.00574.x). See commit 588c991
- Improved model selection behavior for general smooth models for which coefficients have been deemed unidentifiable during estimation. See commit 2c51e69
- Rework of the random slope class `rs`. This class now behaves exactly like `s(fact, cov, bs='re', by=fact2)` in `mgcv`. See commit 7a14309
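For readers unfamiliar with the `mgcv` convention referenced in the last highlight, the sketch below illustrates the kind of model matrix and penalty that a `bs='re'` term corresponds to there: one column per factor level (an indicator for random intercepts, the covariate value for random slopes), an identity penalty, and, with a factor `by` variable, one copy of the term per `by` level with all other rows set to zero. This is a plain-Python illustration of the documented `mgcv` construction, not the `mssm` implementation; the helper names `re_model_matrix`, `identity_penalty`, and `split_by` are hypothetical.

```python
def re_model_matrix(fact, cov=None):
    """Dense random-effect model matrix with one column per level of `fact`.

    Without `cov` each column is a 0/1 indicator (random intercepts);
    with `cov` the indicator is multiplied by the covariate (random slopes).
    Hypothetical helper for illustration only.
    """
    levels = sorted(set(fact))
    col_of = {level: j for j, level in enumerate(levels)}
    X = [[0.0] * len(levels) for _ in fact]
    for row, level in enumerate(fact):
        X[row][col_of[level]] = 1.0 if cov is None else float(cov[row])
    return levels, X


def identity_penalty(n_coef):
    """Identity penalty: shrinks every coefficient of the term equally."""
    return [[1.0 if r == c else 0.0 for c in range(n_coef)] for r in range(n_coef)]


def split_by(X, by):
    """One copy of X per level of `by`; rows belonging to other levels are zeroed."""
    return {
        level: [list(row) if b == level else [0.0] * len(row) for row, b in zip(X, by)]
        for level in sorted(set(by))
    }


fact = ["a", "b", "a", "c"]
cov = [0.5, 1.0, -2.0, 3.0]
fact2 = ["u", "v", "u", "v"]

levels, X_int = re_model_matrix(fact)        # analogue of s(fact, bs='re')
_, X_slope = re_model_matrix(fact, cov)      # analogue of s(fact, cov, bs='re')
S = identity_penalty(len(levels))
X_by = split_by(X_slope, fact2)              # analogue of adding by=fact2
```

In `mgcv` each `by` level receives its own smoothing parameter by default, which is why the copies are kept as separate matrices here.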
API changes:
- The list of model matrices passed to the llk/grad/hess functions of the `GSMMFamily` now contains `None` for matrices excluded from building (via the `build_mat` argument of the `GSMM.fit()` method). Previously, only matrices marked for building were passed along, but keeping track of their indices was annoying in practice.
- Extra penalties and a callback function can now be passed to `GSMM.fit()`. See commit 7bed558.
- Different smooth functions in the models of different parameters can now share the same smoothing penalty parameter. This required reworking the behavior of the `id` key-word, the changes have been documented in the doc strings of the `f`, `ri`, and `rs` classes that support this keyword. A simple example is presented below, in which the same smoothing penalty is estimated for the functions of "x0" and "x2", included in the models of the mean and scale parameter of the Gamma distribution respectively:

      sim_dat = sim12(5000, c=0, seed=0, family=GAMMALS([LOG(), LOG()]), n_ranef=20)
      sim_formula_m = Formula(lhs("y"), [i(), f(["x0"], id=1), f(["x1"]), fs(["x0"], rf="x4")], data=sim_dat)
      sim_formula_scale = Formula(lhs("y"), [i(), f(["x2"], id=1), f(["x3"])], data=sim_dat)
      family = GAMMALS([LOG(), LOGb(-0.0001)])
      gsmm_fam = GAMLSSGSMMFamily(2, family)
      model = GSMM([sim_formula_m, sim_formula_scale], gsmm_fam)
      model.fit()

- I have implemented the tensor smooth basis and penalty construction method of Wood, S. N. (2006). This results in much better estimates in simulation studies, especially when relying on the `L-qEFS` update to estimate smoothing penalty parameters. This required some changes and additions to the keywords passed to the `f` class constructor (specifically the `rp` key-word and the new `scale_te` keyword). The new construction is not currently the new default, but this might change with the next release - pending more extensive simulations. The example below shows how to enable the new construction:

      sim_dat = sim15(500, 2, c=0, seed=0, family=Gamma())
      sim_formula_m = Formula(lhs("y"), [i(), f(["x1", "x2"], te=True, rp=2, scale_te=True)], data=sim_dat)
      sim_formula_scale = Formula(lhs("y"), [i()], data=sim_dat)
      family = GAMMALS([LOG(), LOGb(-0.0001)])
      gsmm_fam = GAMLSSGSMMFamily(2, family)
      model = GSMM([sim_formula_m, sim_formula_scale], gsmm_fam)
      model.fit(method='qEFS')

- The `rs` class has been reworked completely. It now works exactly like random effects in `mgcv`, which is much more flexible than my original implementation. The changes have been documented in the docstring of the `rs` class. But to summarise:
  - `s(fact,bs='re')` in `mgcv` is `rs(["fact"])` in `mssm`
  - `s(cov,bs='re')` in `mgcv` is `rs(["cov"])` in `mssm`
  - `s(fact,cov,bs='re')` in `mgcv` is `rs(["fact","cov"])` in `mssm`
  - `s(fact,cov,bs='re',by=fact2)` in `mgcv` is `rs(["fact","cov"],by="fact2")` in `mssm`
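The first change in the list above means that custom llk/grad/hess code has to tolerate `None` entries in the list of model matrices. A minimal sketch of one way to do so, assuming the matrices arrive as a list aligned with per-parameter coefficient blocks; the helper `linear_predictors` and its argument names are hypothetical and not part of `mssm`, and plain nested lists stand in for the actual matrix type:

```python
def linear_predictors(coef_blocks, Xs):
    """Compute X @ coef for every parameter whose model matrix was built.

    Entries of `Xs` that are None (matrix excluded from building) yield None,
    so the position in the result still identifies the parameter.
    Hypothetical helper for illustration only.
    """
    etas = []
    for coef, X in zip(coef_blocks, Xs):
        if X is None:
            # Matrix was not built: keep the slot so indices stay aligned.
            etas.append(None)
            continue
        etas.append([sum(x * b for x, b in zip(row, coef)) for row in X])
    return etas


# Two parameters; the matrix of the second one was excluded from building.
Xs = [[[1.0, 2.0], [1.0, 3.0]], None]
coef_blocks = [[0.5, 1.0], [2.0]]
etas = linear_predictors(coef_blocks, Xs)
```

Because the list keeps one slot per parameter, the index of a matrix always equals the index of its parameter, which removes the bookkeeping the previous behavior required.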
Functionality changes:
- The model selection code for more general models now works (better) for models for which coefficients were deemed unidentifiable during estimation
Bug fixes
- Fixed a bug in the computation of the expected partial derivatives of the `GAMMALS` distribution that resulted in poor scaling. See commit b64b534
- Fixed a bug in how higher-order interactions of linear terms are coded when relying on the `l` class and `li` function. See commit b9133d9
- Fixed a bug that caused segmentation faults on some systems and for some models. See commit 6e7e5b9
Efficiency
- Model selection code can now rely on the `multiprocess` package to parallelize computations for more general models. This requires installing `mssm` with the extra `[mp]` dependency
Convenience updates
- All Python code has been re-formatted to the "black" format and is now linted by Flake8
- The documentation and tutorial have been updated to reflect the changes introduced by this release
v1.1.1
Changes:
Bug fixes:
- Fixed a bug in the estimation code for GAMMs, which resulted in an error in the rare case that not all Fisher weights were valid at convergence of the outer algorithm to estimate the smoothing penalty parameters.
Test cases:
- Some more test cases for drop handling and ar1 models of the residuals.
v1.1.0
Changes:
Convenience and API changes:
- `GAMM` now inherits from `GAMMLSS`, which inherits from `GSMM`. Because of this, all models now share a largely unified set of instance variables. For example, `model.overall_penalties` now points to a list of estimated penalties for all models. Similarly, `model.coef`, `model.preds`, and `model.mus` correspond to the overall coefficient vector, a list holding the linear predictors, and a list holding the estimated parameters (e.g., means) for all models. Finally, `model.formulas` holds all formulas passed to the constructor of any model. For a `GAMM` this list always holds exactly one formula.
- Terms now handle model matrix and penalty building, absorb constraints, and compute info about their coefficients. This drastically simplified the `Formula` class. Functions to build all penalties and model matrices are also no longer methods of the `Formula` class.
- Introduced a new `Penalty` class, which makes it much easier for users to add their own implementations to `mssm`. The new Programmer's Guide section of the documentation describes how to do this and also describes how users can add their own marginal smooth basis function.
- The `.get_resid` method of the `GAMMLSS` and `GSMM` classes now supports key-word arguments.
- All `fit` methods as well as the `compare_CDL` function now receive much better defaults for all arguments and adapt to the specific model.
- The documentation has been fully re-worked to reflect all these changes. In addition, all functions not prefixed as private now correctly specify the expected type for each argument and return value and come with descriptions of what they do. Non-internal functions come with a list of code examples.
Functionality changes:
- p-value computation for smooth terms now relies on the method by Davies to evaluate the generalized chi-square distribution. In simulations, coverage is now nearly nominal for univariate terms and only slightly worse for tensor terms.
- By default the `Formula` class now checks for nested tensor terms.
- The `GAMM` class now supports "ar1" models of the residuals for both Gaussian and non-Gaussian models.
- The `compare_CDL` function can now omit the computation of the second correction term by Wood, Pya, & Säfken (2016), which can drastically speed up computations.
- Martingale and Deviance residuals have been implemented for the `PropHaz` family.
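For reference, the two residual types named in the last item follow standard textbook definitions for survival models: the martingale residual is the event indicator minus the estimated cumulative hazard at the observed time, and the deviance residual is a sign-preserving transformation of it. The sketch below implements those definitions only; it takes the cumulative hazard as given and makes no claim about how the `PropHaz` family estimates it. The function names are hypothetical.

```python
import math


def martingale_residual(event, cum_hazard):
    """Event indicator (0 or 1) minus the estimated cumulative hazard."""
    return event - cum_hazard


def deviance_residual(event, cum_hazard):
    """sign(m) * sqrt(-2 * (m + event * log(event - m))), m = martingale residual.

    Since event - m equals the cumulative hazard, the log term is
    log(cum_hazard) for events and vanishes for censored observations.
    For events the cumulative hazard must be strictly positive.
    """
    m = martingale_residual(event, cum_hazard)
    log_term = math.log(cum_hazard) if event == 1 else 0.0
    # Guard against tiny negative values caused by floating-point rounding.
    inner = max(0.0, -2.0 * (m + log_term))
    return math.copysign(math.sqrt(inner), m)


# Censored observation with cumulative hazard 0.5, and an event with the same hazard.
r_censored = deviance_residual(0, 0.5)
r_event = deviance_residual(1, 0.5)
```

Martingale residuals are bounded above by 1 but unbounded below; the deviance transformation makes their distribution more symmetric, which is the usual motivation for reporting both.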
Efficiency
- Model comparison code has been moved to cpp.
- Efficiency of the L-qEFS update has been improved as well
v1.0.1
v1.0.0
v0.9.4
v0.9.3
v0.9.2
v0.9.1
v0.9.0