Dropping Python3.6 support.
- Fix late entry in `add_at_risk_counts`.
- `add_at_risk_counts` has a new flag to determine whether to use start or end-of-period at-risk counts.
- new column in fitter's `summary` that displays the number the parameter is being compared against.
- `plot_lifetimes`'s `duration` arg has the interpretation of "relative time the subject died (since birth)", instead of the old "time observed for". These interpretations are different when there is late entry.
- adding `weights` to log rank functions
- Fix using formulas with `CoxPHFitter.score`
Error in v0.26.1 deployment
- `t_0` in `logrank_test` now will not remove data, but will instead censor all subjects that experience the event afterwards.
- update `status` column in `lifelines.datasets.load_lung` to be more standard coding: 0 is censored, 1 is event.
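The `t_0` change noted above can be pictured with a small, library-free sketch. It assumes that "censoring" a subject whose event falls after `t_0` means capping the duration at `t_0` and marking the event as unobserved; whether lifelines caps the duration in exactly this way is an assumption, and `censor_after` is a hypothetical helper, not part of the library.

```python
def censor_after(durations, events, t_0):
    """Censor (rather than drop) every subject whose event happens after t_0.

    Assumed semantics: the duration is capped at t_0 and the event flag is
    switched off for anything that occurs later than t_0.
    """
    new_durations = [min(t, t_0) for t in durations]
    new_events = [bool(e) and t <= t_0 for t, e in zip(durations, events)]
    return new_durations, new_events


# Toy data, made up for illustration only.
durations = [2.0, 5.0, 8.0, 12.0]
events = [True, False, True, True]

censored_durations, censored_events = censor_after(durations, events, t_0=6.0)
```

The point of the sketch is that all four subjects are still present afterwards, so they keep contributing to the at-risk counts up to `t_0`.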
- Fix using formulas with `AalenAdditiveFitter.predict_cumulative_hazard`
- Fix using formulas with `CoxPHFitter.score`
- `.BIC_` is now present on fitted models.
- `CoxPHFitter` with spline baseline can accept pre-computed knot locations.
- Left censoring fitting in KaplanMeierFitter is now "expected". That is, `predict` always predicts the survival function (as does every other model), `confidence_interval_` is always the CI for the survival function (as does every other model), and so on. In summary: the API for estimates doesn't change depending on how your dataset is censored.
- Fixed an annoying bug where at-risk table labels were not aligning properly when data spanned large ranges. See merging PR for details.
- Fixed a bug in `find_best_parametric_model` where the wrong BIC value was being computed.
- Fixed regression bug when using an array as a penalizer in Cox models.
- Fix integer-valued categorical variables in regression model predictions.
- numpy > 1.20 is allowed.
- Bug fix in the elastic-net penalty for Cox models that wasn't weighting the terms correctly.
- Better appearance when using a single row to show in `add_at_risk_table`.
Small bump in dependencies.
Important: we dropped Patsy as our formula framework, and adopted Formulaic. While the latter is less mature than Patsy, we feel the core capabilities are satisfactory and it provides new opportunities.
- Parametric models with formulas are able to be serialized now.
- a `_scipy_callback` function is available to use in fitting algorithms.
- Adding `cumulative_hazard_at_times` to NelsonAalenFitter
- Fixed error in `CoxPHFitter` when entry time == event time.
- Fixed formulas in AFT interval censoring regression.
- Fixed `concordance_index_` when no events observed
- Fixed label being overwritten in ParametricUnivariate models
- Parametric Cox models can now handle left and interval censoring datasets.
- "improved" the output of `add_at_risk_counts` by removing a call to `plt.tight_layout()` - this works better when you are calling `add_at_risk_counts` on multiple axes, but it is recommended you call `plt.tight_layout()` at the very end of your script.
- Fix bug in `KaplanMeierFitter`'s interval censoring where max(lower bound) < min(upper bound).
- `check_assumptions` now returns a list of list of axes that can be manipulated
- fixed error when using `plot_partial_effects` with categorical data in AFT models
- improved warning when Hessian matrix contains NaNs.
- fixed performance regression in interval censoring fitting in parametric models
- `weights` wasn't being applied properly in NPMLE
- New baseline estimator for Cox models: `piecewise`
- Performance improvements for parametric models `log_likelihood_ratio_test()` and `print_summary()`
- Better step-size defaults for Cox model -> more robust convergence.
- fix `check_assumptions` when using formulas.
- `survival_difference_at_fixed_point_in_time_test` now accepts fitters instead of raw data, meaning that you can use this function on left, right or interval censored data.
- See note on `survival_difference_at_fixed_point_in_time_test` above.
- fix `StatisticalResult` printing in notebooks
- fix Python error when calling `plot_covariate_groups`
- fix dtype mismatches in `plot_partial_effects_on_outcome`.
- Spline `CoxPHFitter` can now use `strata`.
- a small parameterization change of the spline `CoxPHFitter`. The linear term in the spline part was moved to a new `Intercept` term in the `beta_`.
- `n_baseline_knots` in the spline `CoxPHFitter` now refers to all knots, and not just interior knots (this was confusing to me, the author.). So add 2 to `n_baseline_knots` to recover the identical model as previously.
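The knot-count arithmetic in the note above is small enough to write down. This is only a sketch of the conversion; `new_n_baseline_knots` is a hypothetical helper name, and the reading that the two extra knots are the boundary knots is inferred from the "interior knots" wording rather than stated outright.

```python
def new_n_baseline_knots(old_n_baseline_knots: int) -> int:
    """Translate an old (interior-knots-only) count into the new (all-knots) count."""
    # The old value counted interior knots only; the new value counts all knots,
    # so the changelog says to add 2 to recover the same model.
    return old_n_baseline_knots + 2


# A model previously specified with 1 interior knot would now be specified with 3.
old_value = 1
new_value = new_n_baseline_knots(old_value)
```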
- fix splines `CoxPHFitter` when `predict_hazard` was called.
- fix some exception imports I missed.
- fix log-likelihood p-value in splines `CoxPHFitter`
- ok actually ship the out-of-sample calibration code
- fix `labels=False` in `add_at_risk_counts`
- allow for specific rows to be shown in `add_at_risk_counts`
- put `patsy` as a proper dependency.
- suppress some Pandas 1.1 warnings.
- Formulas! lifelines now supports R-like formulas in regression models. See docs here.
- `plot_covariate_group` now can plot other y-values like hazards and cumulative hazards (default: survival function).
- `CoxPHFitter` now accepts late entries via `entry_col`.
- `calibration.survival_probability_calibration` now works with out-of-sample data.
- `print_summary` now accepts a `column` argument to filter down the displayed values. This helps with clutter in notebooks, latex, or on the terminal.
- `add_at_risk_counts` now follows the cool new KMunicate suggestions
- With the introduction of formulas, all models can be using formulas under the hood.
- For both custom regression models or non-AFT regression models, this means that you no longer need to add a constant column to your DataFrame (instead add a `1` as a formula string in the `regressors` dict). You may also need to remove the T and E columns from `regressors`. I've updated the models in the `\examples` folder with examples of this new model building.
- Unfortunately, if using formulas, your model will not be able to be pickled. This is a problem with an upstream library, and I hope to have it resolved in the near future.
- `plot_covariate_groups` has been deprecated in favour of `plot_partial_effects_on_outcome`.
- The baseline in `plot_covariate_groups` has changed from the mean observation (including dummy-encoded categorical variables) to median for ordinal (including continuous) and mode for categorical.
- Previously, lifelines used the label `"_intercept"` when it added a constant column in regressions. To align with Patsy, we are now using `"Intercept"`.
- In AFT models, `ancillary_df` kwarg has been renamed to `ancillary`. This reflects the more general use of the kwarg (not always a DataFrame, but could be a boolean or string now, too).
- Some column names in datasets shipped with lifelines have changed.
- The never used "lifelines.metrics" is deleted.
- With the introduction of formulas, `plot_covariate_groups` (now called `plot_partial_effects_on_outcome`) behaves differently for transformed variables. Users no longer need to add "derivatives" features, and encoding is done implicitly. See docs here.
- all exceptions and warnings have moved to `lifelines.exceptions`
- The p-value of the log-likelihood ratio test for the CoxPHFitter with splines was returning the wrong result because the degrees of freedom was incorrect.
- better `print_summary` logic in IDEs and Jupyter exports. Previously it should not be displayed.
- p-values have been corrected in the `SplineFitter`. Previously, the "null hypothesis" was not coefficient=0, but coefficient=0.01. This is now set to the former.
- fixed NaN bug in `survival_table_from_events` with intervals when no events would occur in an interval.
- improved algorithm choice for large DataFrames for Cox models. Should see a significant performance boost.
- fixed `utils.median_survival_time` not accepting Pandas Series.
- fixed an edge case in `KaplanMeierFitter` where a really late entry would occur after all other population had died.
- fixed `plot` in `BreslowFlemingHarringtonFitter`
- fixed bug where using `conditional_after` and `times` in `CoxPHFitter("spline")` prediction methods would be ignored.
- fixed a bug where using `conditional_after` and `times` in prediction methods would result in a shape error
- fixed a bug where `score` was not able to be used in splined `CoxPHFitter`
- fixed a bug where some columns would not be displayed in `print_summary`
- fixed a bug where `CoxPHFitter` would ignore inputted `alpha` levels for confidence intervals
- fixed a bug where `CoxPHFitter` would fail when working with `sklearn_adapter`
- improved convergence of `GeneralizedGamma(Regression)Fitter`.
- new spline regression model `CRCSplineFitter` based on the paper "A flexible parametric accelerated failure time model" by Michael J. Crowther, Patrick Royston, Mark Clements.
- new survival probability calibration tool `lifelines.calibration.survival_probability_calibration` to help validate regression models. Based on “Graphical calibration curves and the integrated calibration index (ICI) for survival models” by P. Austin, F. Harrell, and D. van Klaveren.
- (and bug fix) scalar parameters in regression models were not being penalized by `penalizer` - we are now penalizing everything except intercept terms in linear relationships.
- New improvements when using splines model in CoxPHFitter - it should offer much better prediction and baseline-hazard estimation, including extrapolation and interpolation.
- Related to above: the fitted spline parameters are now available in the `.summary` and `.print_summary` methods.
- fixed a bug in initialization of some interval-censoring models -> better convergence.
- Faster NPMLE for interval censored data
- New weightings available in the `logrank_test`: `wilcoxon`, `tarone-ware`, `peto`, `fleming-harrington`. Thanks @sean-reed
- new interval censored dataset: `lifelines.datasets.load_mice`
- Cleared up some mislabeling in `plot_loglogs`. Thanks @sean-reed!
- tuples are now able to be used as input in univariate models.
- Non parametric interval censoring is now available, experimentally. Not all edge cases are fully checked, and some features are missing. Try it under `KaplanMeierFitter.fit_interval_censoring`
- `find_best_parametric_model` can handle left and interval censoring. Also allows for more fitting options.
- `AIC_` is a property on parametric models, and `AIC_partial_` is a property on Cox models.
- `penalizer` in all regression models can now be an array instead of a float. This enables new functionality and better control over penalization. This is similar (but not identical) to `penalty.factors` in glmnet in R.
- some convergence tweaks which should help recent performance regressions.
- At the cost of some performance, convergence is improved in many models.
- New `lifelines.plotting.plot_interval_censored_lifetimes` for plotting interval censored data - thanks @sean-reed!
- fixed bug where `cdf_plot` and `qq_plot` were not factoring in the weights correctly.
- `plot_lifetimes` accepts pandas Series.
- Fixed important bug in interval censoring models. Users using interval censoring are strongly advised to upgrade.
- Improved `at_risk_counts` for subplots.
- More data validation checks for `CoxTimeVaryingFitter`
- Improved stability of interval censoring in parametric models.
- setting a dataframe in `ancillary_df` works for interval censoring
- `.score` works for interval censored models
- new `logx` kwarg in plotting curves
- PH models have `compute_followup_hazard_ratios` for simulating what the hazard ratio would be at previous times. This is useful because the final hazard ratio is some weighted average of these.
- Fixed error in HTML printer that was hiding concordance index information.
- Fixed bug when no covariates were passed into `CoxPHFitter`. See #975
- Fixed error in `StatisticalResult` where the test name was not displayed correctly.
- Fixed a keyword bug in `plot_covariate_groups` for parametric models.
- Stability improvements for GeneralizedGammaRegressionFitter and CoxPHFitter with spline estimation.
- Fixed bug with plotting hazards in NelsonAalenFitter.
This version and future versions of lifelines no longer support py35. Pandas 1.0 is fully supported, along with previous versions. Minimum Scipy has been bumped to 1.2.0.
- `CoxPHFitter` and `CoxTimeVaryingFitter` have support for an elastic net penalty, which includes L1 and L2 regression.
- `CoxPHFitter` has new baseline survival estimation methods. Specifically, `spline` now estimates the coefficients and baseline survival using splines. The traditional method, `breslow`, is still the default however.
- Regression models have a new `score` method that will score your model against a dataset (ex: a testing or validation dataset). The default is to evaluate the log-likelihood, but the concordance index can also be chosen.
- New `MixtureCureFitter` for quickly creating univariate mixture models.
- Univariate parametric models have a `plot_density`, `density_at_times`, and property `density_` that computes the probability density function estimates.
- new dataset for interval regression involving C. Botulinum.
- new `lifelines.fitters.mixins.ProportionalHazardMixin` that implements proportional hazard checks.
- Models' prediction methods that return a single array now return a Series (they used to return a DataFrame). This includes `predict_median`, `predict_percentile`, `predict_expectation`, `predict_log_partial_hazard`, and possibly others.
- The penalty in Cox models is now scaled by the number of observations. This makes it invariant to changing sample sizes. This change also makes the penalty magnitude behave the same as any parametric regression model.
- `score_` on models has been renamed `concordance_index_`
- models' `.variance_matrix_` is now a DataFrame.
- `CoxTimeVaryingFitter` no longer requires an `id_col`. It's optional, and some checks may be done for integrity if provided.
- Significant changes to `utils.k_fold_cross_validation`.
- removed automatically adding `inf` from `PiecewiseExponentialRegressionFitter.breakpoints` and `PiecewiseExponentialFitter.breakpoints`
- `tie_method` was dropped from Cox models (it was always Efron anyways...)
- Mixins are moved to `lifelines.fitters.mixins`
- `find_best_parametric_model` `evaluation` kwarg has been changed to `scoring_method`.
- removed `_score_` and `path` from Cox model.
- Fixed `show_censors` with `KaplanMeierFitter.plot_cumulative_density` see issue #940.
- Fixed error in `"BIC"` code path in `find_best_parametric_model`
- Fixed a bug where left censoring in AFT models was not converging well
- Cox models now incorporate any penalizers in their `log_likelihood_`
- fixed important error when a parametric regression model would not assign the correct labels to fitted parameters' variances. See more here: CamDavidsonPilon#931. Users of `GeneralizedGammaRegressionFitter` and any custom regression models should update their code as soon as possible.
- fixed important error when a parametric regression model would not assign the correct labels to fitted parameters. See more here: CamDavidsonPilon#931. Users of `GeneralizedGammaRegressionFitter` and any custom regression models should update their code as soon as possible.
Bug fixes for py3.5.
- New univariate model, `SplineFitter`, that uses cubic splines to model the cumulative hazard.
- To aid users with selecting the best parametric model, there is a new `lifelines.utils.find_best_parametric_model` function that will iterate through the models and return the model with the lowest AIC (by default).
- custom parametric regression models can now do left and interval censoring.
- New `predict_hazard` for parametric regression models.
- New lymph node cancer dataset, originally from H.F. for the German Breast Cancer Study Group (GBSG) (1994)
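The "lowest AIC" selection rule mentioned above can be sketched without the library. The snippet uses the standard definition AIC = 2k - 2 log L; the candidate names, log-likelihoods and parameter counts are made-up illustration values, not output from lifelines, and the loop is not how `find_best_parametric_model` is actually implemented.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical fitted models: name -> (log-likelihood, number of parameters).
candidates = {
    "Exponential": (-120.4, 1),
    "Weibull": (-115.9, 2),
    "GeneralizedGamma": (-115.6, 3),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
```

Here the three-parameter model has the highest likelihood, but its extra parameter costs more than the small gain in fit, so the two-parameter model is selected.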
- fixes error thrown when convergence of regression models fails.
- `kwargs` is now used in `plot_covariate_groups`
- fixed bug where large exponential numbers in `print_summary` were not being suppressed correctly.
- Bug fix for PyPI
- `StatisticalResult.print_summary` supports html output.
- fix import in `printer.py`
- fix html printing with Univariate models.
- new `lifelines.plotting.rmst_plot` for pretty figures of survival curves and RMSTs.
- new variance calculations for `lifelines.utils.restricted_mean_survival_time`
- performance improvements on regression models' preprocessing. Should make datasets with high number of columns more performant.
- fixed `print_summary` for AAF class.
- fixed repr for `sklearn_adapter` classes.
- fixed `conditional_after` in Cox model when strata was used.
- new `print_summary` option `style` to print HTML, LaTeX or ASCII output
- performance improvements for `CoxPHFitter` - up to 30% performance improvements for some datasets.
- fixed bug where computed statistics were not being shown in `print_summary` for HTML output.
- fixed bug where "None" was displayed in models' `__repr__`
- fixed bug in `StatisticalResult.print_summary`
- fixed bug when using `print_summary` with left censored models.
- lots of minor bug fixes.
- new `print_summary` abstraction that allows HTML printing in Jupyter notebooks!
- silenced some warnings.
- The "comparison" value of some parametric univariate models wasn't standard, so the null hypothesis p-value may have been wrong. This is now fixed.
- fixed a NaN error in confidence intervals for KaplanMeierFitter
- To align values across models, the column names for the confidence intervals in parametric univariate models `summary` have changed.
- Fixed typo in `ParametricUnivariateFitter` name.
- `median_` has been removed in favour of `median_survival_time_`.
- `left_censorship` in `fit` has been removed in favour of `fit_left_censoring`.
The tests were re-factored to be shipped with the package. Let me know if this causes problems.
- fixed error in plotting models when "lower" or "upper" was in the label name.
- fixed bug in plot_covariate_groups for AFT models when >1d arrays were used for values arg.
- fixed `predict_` methods in AFT models when `timeline` was not specified.
- fixed error in `qq_plot`
- fixed error when submitting a model in `qth_survival_time`
- `CoxPHFitter` now displays correct columns values when changing alpha param.
- Serializing lifelines is better supported. Packages like joblib and pickle are now supported. Thanks @AbdealiJK!
- `conditional_after` now available in `CoxPHFitter.predict_median`
- Suppressed some unimportant warnings.
- fixed initial_point being ignored in AFT models.
- new `ApproximationWarning` to tell you if the package is making a potentially misleading approximation.
- fixed a bug in parametric prediction for interval censored data.
- realigned values in `print_summary`.
- fixed bug in `survival_difference_at_fixed_point_in_time_test`
- `utils.qth_survival_time` no longer takes a `cdf` argument - users should take the complement (1-cdf).
- Some previous `StatisticalWarnings` have been replaced by `ApproximationWarning`
- `conditional_after` works for `CoxPHFitter` prediction models 😅
- `CoxPHFitter.baseline_cumulative_hazard_`'s column is renamed `"baseline cumulative hazard"` - previously it was `"baseline hazard"`. (Only applies if the model has no strata.)
- `utils.dataframe_interpolate_at_times` renamed to `utils.interpolate_at_times_and_return_pandas`.
- Improvements to the repr of models that takes weights into account.
- Better support for predicting on Pandas Series
- Fixed issue where `fit_interval_censoring` wouldn't accept lists.
- Fixed an issue with `AalenJohansenFitter` failing to plot confidence intervals.
- `_get_initial_value` in parametric univariate models is renamed `_create_initial_point`
- Some performance improvements to regression models.
- lifelines will avoid penalizing the intercept (aka bias) variables in regression models.
- new `utils.restricted_mean_survival_time` that approximates the RMST using numerical integration against survival functions.
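As a rough illustration of what "numerical integration against survival functions" means, the restricted mean survival time up to a horizon is the area under S(u) from 0 to that horizon. The sketch below uses a plain trapezoidal rule on an exponential survival function and compares it with the closed form; the integration scheme lifelines actually uses may differ, and `rmst` here is a hypothetical stand-alone function, not the library's.

```python
import math


def rmst(survival, t_max, n_steps=10_000):
    """Trapezoidal approximation of the integral of survival(u) for u in [0, t_max]."""
    h = t_max / n_steps
    total = 0.5 * (survival(0.0) + survival(t_max))
    for i in range(1, n_steps):
        total += survival(i * h)
    return total * h


lam = 10.0  # scale of an exponential model, S(u) = exp(-u / lam)
approx = rmst(lambda u: math.exp(-u / lam), t_max=5.0)

# Closed form for the exponential case: lam * (1 - exp(-t_max / lam)).
exact = lam * (1 - math.exp(-5.0 / lam))
```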
- `KaplanMeierFitter.survival_function_`'s index is no longer given the name "timeline".
- Fixed issue where `concordance_index` would never exit if NaNs in dataset.
- models now expose a `log_likelihood_` property.
- new `conditional_after` argument on `predict_*` methods that make prediction on censored subjects easier.
- new `lifelines.utils.safe_exp` to make `exp` overflows easier to handle.
- smarter initial conditions for parametric regression models.
- New regression model: `GeneralizedGammaRegressionFitter`
- removed `lifelines.utils.gamma` - use `autograd_gamma` library instead.
- removed bottleneck as a dependency. It offered slight performance gains only in Cox models, and only a small fraction of the API was being used.
- AFT log-likelihood ratio test was not using weights correctly.
- corrected (by bumping) scipy and autograd dependencies
- convergence is improved for most models, and many `exp` overflow warnings have been eliminated.
- Fixed an error in the `predict_percentile` of `LogLogisticAFTFitter`. New tests have been added around this.
- lifelines is now compatible with scipy>=1.3.0
- fixed printing error when using robust=True in regression models
- `GeneralizedGammaFitter` is more stable, maybe.
- lifelines was allowing old version of numpy (1.6), but this caused errors when using the library. The correct numpy version has been pinned (to 1.14.0+)
- New univariate model, `GeneralizedGammaFitter`. This model contains many sub-models, so it is a good model to check fits.
- added a warning when a time-varying dataset had instantaneous deaths.
- added an `initial_point` option in univariate parametric fitters.
- `initial_point` kwarg is present in parametric univariate fitters `.fit`
- `event_table` is now an attribute on all univariate fitters (if right censoring)
- improvements to `lifelines.utils.gamma`
- In AFT models, the column names in `confidence_intervals_` have changed to include the alpha value.
- In AFT models, some column names in `.summary` and `.print_summary` have changed to include the alpha value.
- In AFT models, some column names in `.summary` and `.print_summary` include confidence intervals for the exponential of the value.
- when using `censors_show` in plotting functions, the censor ticks are now reactive to the estimate being shown.
- fixed an overflow bug in `KaplanMeierFitter` confidence intervals
- improvements in data validation for `CoxTimeVaryingFitter`
- Ability to create custom parametric regression models by specifying the cumulative hazard. This enables new and extensions of AFT models.
- `percentile(p)` method added to univariate models that solves the equation `p = S(t)` for `t`
- for parametric univariate models, the `conditional_time_to_event_` is now exact instead of an approximation.
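To make "solves the equation `p = S(t)` for `t`" concrete, here is a minimal sketch using bisection on a Weibull survival function written in the `t / \lambda` parameterization. Both the choice of bisection and the helper names are assumptions made for illustration; this is not a description of how lifelines solves the equation internally.

```python
import math


def weibull_survival(t, lambda_, rho):
    """S(t) = exp(-(t / lambda_) ** rho), a decreasing function of t."""
    return math.exp(-((t / lambda_) ** rho))


def percentile(p, survival, lo=0.0, hi=1e6, tol=1e-9):
    """Find t with survival(t) == p by bisection, assuming survival is decreasing."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if survival(mid) > p:
            lo = mid  # still above the target probability, so move right
        else:
            hi = mid
    return (lo + hi) / 2


# Median survival time (p = 0.5) for hypothetical parameters lambda_=10, rho=2.
t_median = percentile(0.5, lambda t: weibull_survival(t, 10.0, 2.0))

# The Weibull case has a closed form to compare against.
closed_form = 10.0 * (-math.log(0.5)) ** (1 / 2.0)
```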
- In Cox models, the attribute `hazards_` has been renamed to `params_`. This aligns better with the other regression models, and is more clear (what is a hazard anyways?)
- In Cox models, a new `hazard_ratios_` attribute is available which is the exponentiation of `params_`.
- In Cox models, the column names in `confidence_intervals_` have changed to include the alpha value.
- In Cox models, some column names in `.summary` and `.print_summary` have changed to include the alpha value.
- In Cox models, some column names in `.summary` and `.print_summary` include confidence intervals for the exponential of the value.
- Significant changes to internal AFT code.
- A change to how `fit_intercept` works in AFT models. Previously one could set `fit_intercept` to False and not have to set `ancillary_df` - now one must specify a DataFrame.
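The relationship between `params_` and `hazard_ratios_` described above is just element-wise exponentiation, and the same transformation gives the confidence intervals "for the exponential of the value". A small sketch with made-up coefficients (the covariate names, values and interval bounds are hypothetical, not fitted results):

```python
import math

# Hypothetical log-hazard-ratio coefficients, as would be found in params_.
params = {"age": 0.05, "treatment": -0.7}

# Hazard ratios are the exponentiated coefficients.
hazard_ratios = {name: math.exp(beta) for name, beta in params.items()}

# A hypothetical interval on the coefficient scale maps to the hazard-ratio
# scale by exponentiating both ends (exp is increasing, so the order is kept).
treatment_ci = (-1.2, -0.2)
treatment_hr_ci = tuple(math.exp(bound) for bound in treatment_ci)
```

A negative coefficient therefore corresponds to a hazard ratio below 1, and a positive one to a hazard ratio above 1.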
- for parametric univariate models, the `conditional_time_to_event_` is now exact instead of an approximation.
- fixed a name error bug in `CoxTimeVaryingFitter.plot`
I'm skipping 0.21.4 version because of deployment issues.
- `scoring_method` now a kwarg on `sklearn_adapter`
- fixed an implicit import of scikit-learn. scikit-learn is an optional package.
- fixed visual bug that misaligned x-axis ticks and at-risk counts. Thanks @christopherahern!
- included in lifelines is a scikit-learn adapter so lifelines' models can be used with scikit-learn's API. See documentation here.
- `CoxPHFitter.plot` now accepts a `hazard_ratios` (boolean) parameter that will plot the hazard ratios (and CIs) instead of the log-hazard ratios.
- `CoxPHFitter.check_assumptions` now accepts a `columns` parameter to specify only checking a subset of columns.
- `covariates_from_event_matrix` handles nulls better
- New regression model: `PiecewiseExponentialRegressionFitter` is available. See blog post here: https://dataorigami.net/blogs/napkin-folding/churn
- Regression models have a new method `log_likelihood_ratio_test` that computes, you guessed it, the log-likelihood ratio test. Previously this was an internal API that is being exposed.
- The default behavior of the `predict` method on non-parametric estimators (`KaplanMeierFitter`, etc.) has changed from (previous) linear interpolation to (new) return last value. Linear interpolation is still possible with the `interpolate` flag.
- removing `_compute_likelihood_ratio_test` on regression models. Use `log_likelihood_ratio_test` now.
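The difference between "return last value" and linear interpolation in the `predict` change above is easiest to see on a tiny step function. The sketch below is library-free; the timeline and survival values are made up, and the two helper functions are hypothetical illustrations rather than the estimator's own code.

```python
from bisect import bisect_right

# Hypothetical estimated survival curve: value of S at each observed time.
timeline = [0.0, 2.0, 5.0, 9.0]
survival = [1.0, 0.8, 0.5, 0.2]


def predict_last_value(t):
    """New default: carry the most recent estimate forward (step function)."""
    i = bisect_right(timeline, t) - 1
    return survival[max(i, 0)]


def predict_interpolated(t):
    """Previous default: interpolate linearly between neighbouring estimates."""
    i = bisect_right(timeline, t) - 1
    if i < 0:
        return survival[0]
    if i >= len(timeline) - 1:
        return survival[-1]
    t0, t1 = timeline[i], timeline[i + 1]
    s0, s1 = survival[i], survival[i + 1]
    return s0 + (s1 - s0) * (t - t0) / (t1 - t0)


step_value = predict_last_value(3.5)      # uses the estimate at t = 2.0
linear_value = predict_interpolated(3.5)  # halfway between t = 2.0 and t = 5.0
```

At an observed time the two agree; in between, the step version matches the usual definition of a non-parametric survival estimate, while the interpolated one reports values the estimator never produced.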
- users can provide their own start and stop column names in `add_covariate_to_timeline`
- PiecewiseExponentialFitter now allows numpy arrays as breakpoints
- output of `survival_table_from_events` when collapsing rows to intervals now removes the "aggregate" column multi-index.
- fixed bug in CoxTimeVaryingFitter when ax is provided, thanks @j-i-l!
- `weights` is now an optional kwarg for parametric univariate models.
- all univariate and multivariate parametric models now have ability to handle left, right and interval censored data (the former two being special cases of the latter). Users can use the `fit_right_censoring` (which is an alias for `fit`), `fit_left_censoring` and `fit_interval_censoring`.
- a new interval censored dataset is available under `lifelines.datasets.load_diabetes`
- `left_censorship` on all univariate fitters has been deprecated. Please use the new api `model.fit_left_censoring(...)`.
- `invert_y_axis` in `model.plot(...` has been removed.
- `entries` property in multivariate parametric models has a new Series name: `entry`
- lifelines was silently converting any NaNs in the event vector to True. An error is now thrown instead.
- Fixed an error that didn't let users use Numpy arrays in prediction for AFT models
- performance improvements for `print_summary`.
- `utils.survival_events_from_table` returns an integer weight vector as well as durations and censoring vector.
- in `AalenJohansenFitter`, the `variance` parameter is renamed to `variance_` to align with the usual lifelines convention.
- Fixed an error in the `CoxTimeVaryingFitter`'s likelihood ratio test when using strata.
- Fixed some plotting bugs with `AalenJohansenFitter`
- left-truncation support in AFT models, using the `entry_col` kwarg in `fit()`
- `generate_datasets.piecewise_exponential_survival_data` for generating piecewise exp. data
- Faster `print_summary` for AFT models.
- Pandas is now correctly pinned to >= 0.23.0. This was always the case, but not specified in setup.py correctly.
- Better handling for extremely large numbers in `print_summary`
- `PiecewiseExponentialFitter` is available with `from lifelines import *`.
- Now `cumulative_density_` & `survival_function_` are always present on a fitted `KaplanMeierFitter`.
- New attributes/methods on `KaplanMeierFitter`: `plot_cumulative_density()`, `confidence_interval_cumulative_density_`, `plot_survival_function` and `confidence_interval_survival_function_`.
- Left censoring is now supported in univariate parametric models: `.fit(..., left_censorship=True)`. Examples are in the docs.
- new dataset: `lifelines.datasets.load_nh4()`
- Univariate parametric models now include, by default, support for the cumulative density function: `.cumulative_density_`, `.confidence_interval_cumulative_density_`, `plot_cumulative_density()`, `cumulative_density_at_times(t)`.
- add a `lifelines.plotting.qq_plot` for univariate parametric models that handles censored data.
- `plot_lifetimes` no longer reverses the order when plotting. Thanks @vpolimenov!
- The `C` column in `load_lcd` dataset is renamed to `E`.
- fixed a naming error in `KaplanMeierFitter` when `left_censorship` was set to True, `plot_cumulative_density_()` is now `plot_cumulative_density()`.
- added some error handling when passing in timedeltas. Ideally, users don't pass in timedeltas, as the scale is ambiguous. However, the error message before was not obvious, so we do some conversion, warn the user, and pass it through.
- `qth_survival_times` for a truncated CDF would return `np.inf` if the q parameter was below the truncation limit. This should have been `-np.inf`
- Some performance improvements to `CoxPHFitter` (about 30%). I know it may seem silly, but we are now about the same or slightly faster than the Cox model in R's `survival` package (for some testing datasets and some configurations). This is a big deal, because 1) lifelines does more error checking prior, 2) R's cox model is written in C, and we are still pure Python/NumPy, 3) R's cox model has decades of development.
- suppressed unimportant warnings
- Previously, lifelines always added a 0 row to `cph.baseline_hazard_`, even if there were no event at this time. This is no longer the case. A 0 will still be added if there is a duration (observed or not) at 0, however.
- Starting with 0.20.0, only Python3 will be supported. Over 75% of recent installs were Py3.
- Updated minimum dependencies, specifically Matplotlib and Pandas.
- smarter initialization for AFT models which should improve convergence.
- `initial_beta` in Cox model's `.fit` is now `initial_point`.
- `initial_point` is now available in AFT models and `CoxTimeVaryingFitter`
- the DataFrame `confidence_intervals_` for univariate models is transposed now (previously parameters were columns, now parameters are rows).
- Fixed a bug with plotting and `check_assumptions`.
- `plot_covariate_group` can accept multiple covariates to plot. This is useful for columns that have implicit correlation like polynomial features or categorical variables.
- Convergence improvements for AFT models.
- remove some bad print statements in `CoxPHFitter`.
- new AFT models: `LogNormalAFTFitter` and `LogLogisticAFTFitter`.
- AFT models now accept a `weights_col` argument to `fit`.
- Robust errors (sandwich errors) are now available in AFT models using the `robust=True` kwarg in `fit`.
- Performance increase to `print_summary` in the `CoxPHFitter` and `CoxTimeVaryingFitter` model.
- `ParametricUnivariateFitters`, like `WeibullFitter`, have smoothed plots when plotting (vs stepped plots)
- The `ExponentialFitter` log likelihood value was incorrect - inference was correct however.
- Univariate fitters are more flexible and can allow 2-d and DataFrames as inputs.
- improved stability of `LogNormalFitter`
- Matplotlib for Python3 users are no longer forced to use 2.x.
- Important: we changed the parameterization of the `PiecewiseExponential` to the same as `ExponentialFitter` (from `\lambda * t` to `t / \lambda`).
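For readers migrating fitted values across the parameterization change (`\lambda * t` to `t / \lambda`), the two forms describe the same exponential model when the new parameter is the reciprocal of the old one. The sketch below checks this numerically; it assumes the usual exponential survival function S(t) = exp(-H(t)) with cumulative hazard H(t) equal to `\lambda * t` or `t / \lambda`, and the function names are hypothetical.

```python
import math


def survival_old(t, lam):
    """Old parameterization: cumulative hazard lam * t (lam acts as a rate)."""
    return math.exp(-lam * t)


def survival_new(t, lam):
    """New parameterization: cumulative hazard t / lam (lam acts as a scale)."""
    return math.exp(-t / lam)


old_lambda = 0.25              # hypothetical value fitted under the old form
new_lambda = 1.0 / old_lambda  # equivalent value under the new form

times = [0.0, 1.0, 4.0, 10.0]
pairs = [(survival_old(t, old_lambda), survival_new(t, new_lambda)) for t in times]
```

Under the new form the parameter has the same units as time, which is one reason it is the more common convention in the literature.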
- New regression model `WeibullAFTFitter` for fitting accelerated failure time models. Docs have been added to our documentation about how to use `WeibullAFTFitter` (spoiler: its API is similar to the other regression models) and how to interpret the output.
- `CoxPHFitter` performance improvements (about 10%)
- `CoxTimeVaryingFitter` performance improvements (about 10%)
- Important: we changed the `.hazards_` and `.standard_errors_` on Cox models to be pandas Series (instead of Dataframes). This felt like a more natural representation of them. You may need to update your code to reflect this. See notes here: CamDavidsonPilon#636
- Important: we changed the `.confidence_intervals_` on Cox models to be transposed. This felt like a more natural representation of them. You may need to update your code to reflect this. See notes here: CamDavidsonPilon#636
- Important: we changed the parameterization of the `WeibullFitter` and `ExponentialFitter` from `\lambda * t` to `t / \lambda`. This was for a few reasons: 1) it is a more common parameterization in literature, 2) it helps in convergence.
- Important: in models where we add an intercept (currently only `AalenAdditiveModel`), the name of the added column has been changed from `baseline` to `_intercept`
- Important: the meaning of `alpha` in all fitters has changed to be the standard interpretation of alpha in confidence intervals. That means that the default for alpha is set to 0.05 in the latest lifelines, instead of 0.95 in previous versions.
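The `alpha` change above amounts to switching from "coverage" (0.95) to "error rate" (0.05). A short sketch of the conversion, and of how the new value maps to the familiar normal critical value for a two-sided interval, using only the standard library (`to_new_alpha` is a hypothetical helper, and the estimate and standard error are made-up numbers):

```python
from statistics import NormalDist


def to_new_alpha(old_alpha: float) -> float:
    """Convert an old-style alpha (interval coverage) to the new-style alpha."""
    return 1.0 - old_alpha


new_alpha = to_new_alpha(0.95)  # approximately 0.05

# Two-sided critical value: about 1.96 for alpha = 0.05.
z = NormalDist().inv_cdf(1 - new_alpha / 2)

# Wald-style interval for a hypothetical estimate and standard error.
estimate, std_err = 0.40, 0.10
interval = (estimate - z * std_err, estimate + z * std_err)
```

Code written for older versions that passed `alpha=0.95` would need `alpha=0.05` after this change to get the same 95% intervals.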
- Fixed a bug in the `_log_likelihood_` property of `ParametericUnivariateFitter` models. It was showing the "average" log-likelihood (i.e. scaled by 1/n) instead of the total. It now displays the total.
- In model `print_summary`s, correct a label erroring. Instead of "Likelihood test", it should have read "Log-likelihood test".
- Fixed a bug that was too frequently rejecting the dtype of `event` columns.
- Fixed a calculation bug in the concordance index for stratified Cox models. Thanks @airanmehr!
- Fixed some Pandas <0.24 bugs.
- some improvements to the output of check_assumptions. show_plots is turned to False by default now. It only shows rank and km p-values now.
- some performance improvements to qth_survival_time.
- added new plotting methods to parametric univariate models: plot_survival_function, plot_hazard and plot_cumulative_hazard. The last one is an alias for plot.
- added new properties to parametric univariate models: confidence_interval_survival_function_, confidence_interval_hazard_, confidence_interval_cumulative_hazard_. The last one is an alias for confidence_interval_.
- Fixed some overflow issues with AalenJohansenFitter's variance calculations when using large datasets.
- Fixed an edge case in AalenJohansenFitter that was causing some datasets to be jittered too often.
- Add a new kwarg to AalenJohansenFitter, calculate_variance, that can be used to turn off variance calculations since this can take a long time for large datasets. Thanks @pzivich!
- fixed confidence intervals in cumulative hazards for parametric univariate models. They were previously severely depressed.
- adding left-truncation support to parametric univariate models with the entry kwarg in .fit
- Some performance improvements to parametric univariate models.
- Suppressing some irrelevant NumPy and autograd warnings, so lifelines warnings are more noticeable.
- Improved some warning and error messages.
- New univariate fitter PiecewiseExponentialFitter for creating a stepwise hazard model. See docs online.
- Ability to create novel parametric univariate models using the new ParametericUnivariateFitter super class. See docs online for how to do this.
- Unfortunately, parametric univariate fitters are not serializable with pickle. The library dill is still usable.
- Complete overhaul of all internals for parametric univariate fitters. Moved them all (most) to use autograd.
- LogNormalFitter no longer models log_sigma.
- bug fixes in LogNormalFitter variance estimates
- improve convergence of LogNormalFitter. We now model the log of sigma internally, but still expose sigma externally.
- use the autograd lib to help with gradients.
- New LogLogisticFitter univariate fitter available.
- LogNormalFitter is a new univariate fitter you can use.
- WeibullFitter now correctly returns the confidence intervals (previously returned only NaNs)
- WeibullFitter.print_summary() displays p-values associated with its parameters not equal to 1.0 - previously this was (implicitly) comparing against 0, which is trivially always true (the parameters must be greater than 0)
- ExponentialFitter.print_summary() displays p-values associated with its parameters not equal to 1.0 - previously this was (implicitly) comparing against 0, which is trivially always true (the parameters must be greater than 0)
- ExponentialFitter.plot now displays the cumulative hazard, instead of the survival function. This is to make it easier to compare to WeibullFitter and LogNormalFitter
- Univariate fitters' cumulative_hazard_at_times, hazard_at_times, survival_function_at_times return pandas Series now (used to be numpy arrays)
- remove alpha keyword from all statistical functions. This was never being used.
- Gone are asterisks and dots in print_summary functions that represent significance thresholds.
- In models' summary (including print_summary), the log(p) term has changed to -log2(p). This is known as the s-value. See https://lesslikely.com/statistics/s-values/
- introduce new statistical tests between univariate datasets: survival_difference_at_fixed_point_in_time_test,...
- new warning message when Cox models detect possible non-unique solutions to maximum likelihood.
- Generally: clean up lifelines exception handling. Ex: catch LinAlgError: Matrix is singular. and report back to the user advice.
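For readers unfamiliar with the s-value mentioned above, here is a minimal standalone sketch of the -log2(p) transformation. The function name s_value is hypothetical and is not part of the lifelines API.

```python
import math


def s_value(p):
    # The s-value expresses a p-value as bits of "surprise":
    # p = 0.5 is one bit, p = 0.25 is two bits, and so on.
    if not 0 < p <= 1:
        raise ValueError("p must be in the interval (0, 1]")
    return -math.log2(p)


one_bit = s_value(0.5)
conventional_threshold = s_value(0.05)  # a little over 4.3 bits
```

A p-value of 0.05 therefore carries about as much surprise as seeing four or five heads in a row from a fair coin.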
- more bugs in plot_covariate_groups fixed when using non-numeric strata.
- Fix bug in plot_covariate_groups that wasn't allowing for strata to be used.
- change name of multicenter_aids_cohort_study to load_multicenter_aids_cohort_study
- groups is now called values in CoxPHFitter.plot_covariate_groups
- Fix in compute_residuals when using schoenfeld and the minimum duration has only censored subjects.
- Another round of serious performance improvements for the Cox models. Up to 2x faster for CoxPHFitter and CoxTimeVaryingFitter. This was mostly the result of using NumPy's einsum to simplify a previous for loop. The downside is the code is more esoteric now. I've added comments as necessary though 🤞
- adding bottleneck as a dependency. This library is highly-recommended by Pandas, and in lifelines we see some nice performance improvements with it too. (~15% for CoxPHFitter)
- There was a small bug in CoxPHFitter when using batch_mode that was causing coefficients to deviate from their MLE value. This bug eluded tests, which means that its discrepancy was less than 0.0001 difference. It's fixed now, and even more accurate tests are added.
- Faster CoxPHFitter._compute_likelihood_ratio_test()
- Fixes a Pandas performance warning in CoxTimeVaryingFitter.
- Performance improvements to CoxTimeVaryingFitter.
- corrected behaviour in CoxPHFitter where score_ was not being refreshed on every new fit.
- Reimplementation of AalenAdditiveFitter. There were significant changes to it:
  - implementation is at least 10x faster, and possibly up to 100x faster for some datasets.
  - memory consumption is way down
  - removed the time-varying component from AalenAdditiveFitter. This will return in a future release.
  - new print_summary
  - weights_col is added
  - nn_cumulative_hazard is removed (may add back)
- some plotting improvements to plotting.plot_lifetimes
- More CoxPHFitter performance improvements. Up to a 40% reduction vs 0.16.2 for some datasets.
- Fixed CoxTimeVaryingFitter to allow more than one variable to be stratified
- Significant performance improvements for CoxPHFitter when the dataset has lots of duplicate times. See CamDavidsonPilon#591
- Fixed py2 division error in concordance method.
- Drop Python 3.4 support.
- introduction of residual calculations in CoxPHFitter.compute_residuals. Residuals include "schoenfeld", "score", "delta_beta", "deviance", "martingale", and "scaled_schoenfeld".
- removes estimation namespace for fitters. Should be using from lifelines import xFitter now. Thanks @usmanatron
- removes predict_log_hazard_relative_to_mean from Cox model. Thanks @usmanatron
- StatisticalResult has been generalized to allow for multiple results (ex: from pairwise comparisons). This means a slightly changed API that is mostly backwards compatible. See doc string for how to use it.
- statistics.pairwise_logrank_test now returns a StatisticalResult object instead of a nasty NxN DataFrame 💗
- Display log(p-values) as well as p-values in print_summary. Also, p-values below thresholds will be truncated. The original p-values are still recoverable using .summary.
- Floats in print_summary are now displayed to 2 decimal points. This can be changed using the decimal kwarg.
- removed standardized from Cox model plotting. It was confusing.
- visual improvements to Cox models .plot
- print_summary methods accept kwargs to also be displayed.
- CoxPHFitter has a new human-readable method, check_assumptions, to check the assumptions of your Cox proportional hazard model.
- A new helper util to "expand" static datasets into long-form: lifelines.utils.to_episodic_format.
- CoxTimeVaryingFitter now accepts strata.
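To give a feel for what "expanding" a static dataset into long form means, here is a small pure-Python sketch. It is not the lifelines implementation: the helper to_episodic_rows, its time_gap parameter, and the output keys are all hypothetical, and it assumes that each subject is split into consecutive (start, stop] intervals with the event flagged only on the last one.

```python
def to_episodic_rows(subject_id, duration, event, time_gap=1):
    # Split one subject's follow-up into consecutive intervals of
    # length time_gap; the final interval may be shorter.
    rows = []
    start = 0
    while start < duration:
        stop = min(start + time_gap, duration)
        is_last = stop == duration
        rows.append(
            {
                "id": subject_id,
                "start": start,
                "stop": stop,
                # Only the last interval can carry the observed event.
                "event": int(bool(event) and is_last),
            }
        )
        start = stop
    return rows


rows = to_episodic_rows("a", duration=3, event=1)
censored_rows = to_episodic_rows("b", duration=2, event=0)
```

Long-form rows like these are the kind of input a time-varying model consumes, which is why such a helper is useful alongside CoxTimeVaryingFitter.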
- bug fix for the Cox model likelihood ratio test when using non-trivial weights.
- Only allow matplotlib less than 3.0.
- API changes to plotting.plot_lifetimes
- cluster_col and strata can be used together in CoxPHFitter
- removed entry from ExponentialFitter and WeibullFitter as it was doing nothing.
- Bug fixes for v0.15.0
- Raise NotImplementedError if the robust flag is used in CoxTimeVaryingFitter - that's not ready yet.
- adding robust params to CoxPHFitter's fit. This enables at least i) using non-integer weights in the model (these could be sampling weights like IPTW), and ii) mis-specified models (ex: non-proportional hazards). Under the hood it's a sandwich estimator. This does not handle ties, so if there is a high number of ties, results may significantly differ from other software.
- standard_errors_ is now a property on fitted CoxPHFitter which describes the standard errors of the coefficients.
- variance_matrix_ is now a property on fitted CoxPHFitter which describes the variance matrix of the coefficients.
- new criteria for convergence of CoxPHFitter and CoxTimeVaryingFitter called the Newton-decrement. Tests show it is as accurate (w.r.t. previous coefficients) and typically shaves off a single step, resulting in generally faster convergence. See https://www.cs.cmu.edu/~pradeepr/convexopt/Lecture_Slides/Newton_methods.pdf. Details about the Newton-decrement are added to the show_progress statements.
- Minimum support for scipy is 1.0
- Convergence errors in models that use Newton-Raphson methods now throw a ConvergenceError, instead of a ValueError (the former is a subclass of the latter, however).
- AalenAdditiveModel raises ConvergenceWarning instead of printing a warning.
- KaplanMeierFitter now has a cumulative plot option. Example kmf.plot(invert_y_axis=True)
- a weights_col option has been added to CoxTimeVaryingFitter that allows for time-varying weights.
- WeibullFitter has a new show_progress param and additional information if the convergence fails.
- CoxPHFitter, ExponentialFitter, WeibullFitter and CoxTimeVaryFitter method print_summary is updated with new fields.
- WeibullFitter has renamed the incorrect _jacobian to _hessian_.
- variance_matrix_ is now a property on fitted WeibullFitter which describes the variance matrix of the parameters.
- The default WeibullFitter().timeline has changed from integers between the min and max duration to n floats between the max and min durations, where n is the number of observations.
- Performance improvements for CoxPHFitter (~20% faster)
- Performance improvements for CoxTimeVaryingFitter (~100% faster)
- In Python3, Univariate models are now serialisable with pickle. Thanks @dwilson1988 for the contribution. For Python2, dill is still the preferred method.
- baseline_cumulative_hazard_ (and derivatives of that) on CoxPHFitter now correctly incorporate the weights_col.
- Fixed a bug in KaplanMeierFitter when late entry times lined up with death events. Thanks @pzivich
- Adding cluster_col argument to CoxPHFitter so users can specify groups of subjects/rows that may be correlated.
- Shifting the "significance codes" for p-values down an order of magnitude. (Example, p-values between 0.1 and 0.05 are not noted at all and p-values between 0.05 and 0.01 are noted with ., etc.). This deviates from how they are presented in other software. There is an argument to be made to remove p-values from lifelines altogether (become the changes you want to see in the world lol), but I worry that people could compute the p-values by hand incorrectly, a worse outcome I think. So, this is my stance. P-values between 0.1 and 0.05 offer very little information, so they are removed. There is a growing movement in statistics to shift "significant" findings to p-values less than 0.01 anyways.
- New fitter for cumulative incidence of multiple risks AalenJohansenFitter. Thanks @pzivich! See "Methodologic Issues When Estimating Risks in Pharmacoepidemiology" for a nice overview of the model.
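The Newton-decrement stopping rule mentioned above can be sketched in one dimension. This follows the textbook definition (decrement squared = g * g / h, generalizing to g^T H^-1 g in several dimensions) and is not taken from the lifelines source; the function name, tolerance and example objective are assumptions made for illustration.

```python
import math


def newton_with_decrement(gradient, hessian, x0, tolerance=1e-10, max_steps=50):
    # Minimize a smooth convex function of one variable with Newton's
    # method, stopping when half the squared Newton decrement is small.
    x = x0
    for step in range(1, max_steps + 1):
        g = gradient(x)
        h = hessian(x)
        decrement_squared = g * g / h
        if decrement_squared / 2 < tolerance:
            return x, step
        x -= g / h  # full Newton step
    raise RuntimeError("Newton's method did not converge")


# Example objective (assumed): f(x) = exp(x) - 2x, minimized at x = log(2).
x_min, steps_taken = newton_with_decrement(
    gradient=lambda x: math.exp(x) - 2,
    hessian=lambda x: math.exp(x),
    x0=0.0,
)
```

Because the decrement estimates how much the objective can still improve, it can signal convergence without waiting for the coefficients themselves to stop moving.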
- fix for n > 2 groups in multivariate_logrank_test (again).
- fix bug for when event_observed column was not boolean.
- fix for n > 2 groups in multivariate_logrank_test
- fix weights in KaplanMeierFitter when using a pandas Series.
- Adds baseline_cumulative_hazard_ and baseline_survival_ to CoxTimeVaryingFitter. Because of this, new prediction methods are available.
- fixed a bug in add_covariate_to_timeline when using cumulative_sum with multiple columns.
- Added Likelihood ratio test to CoxPHFitter.print_summary and CoxTimeVaryingFitter.print_summary
- New checks in CoxTimeVaryingFitter that check for immediate deaths and redundant rows.
- New delay parameter in add_covariate_to_timeline
- removed two_sided_z_test from statistics
- fixes a bug when subtracting or dividing two UnivariateFitters with labels.
- fixes an import error with using CoxTimeVaryingFitter predict methods.
- adds a column argument to CoxTimeVaryingFitter and CoxPHFitter plot method to plot only a subset of columns.
- some quality of life improvements for working with CoxTimeVaryingFitter including new predict_ methods.
- fixed bug with using weights and strata in CoxPHFitter
- fixed bug in using non-integer weights in KaplanMeierFitter
- Performance optimizations in CoxPHFitter for up to 40% faster completion of fit.
  - even smarter step_size calculations for iterative optimizations.
  - simple code optimizations & cleanup in specific hot spots.
- Performance optimizations in AalenAdditiveFitter for up to 50% faster completion of fit for large dataframes, and up to 10% faster for small dataframes.
- adding plot_covariate_groups to CoxPHFitter to visualize what happens to survival as we vary a covariate, all else being equal.
- utils functions like qth_survival_times and median_survival_times now return the transpose of the DataFrame compared to previous version of lifelines. The reason for this is that we often treat survival curves as columns in DataFrames, and functions of the survival curve as index (ex: KaplanMeierFitter.survival_function_ returns a survival curve at time t).
- KaplanMeierFitter.fit and NelsonAalenFitter.fit accept a weights vector that can be used for pre-aggregated datasets. See this issue.
- Convergence errors now return a custom ConvergenceWarning instead of a RuntimeWarning
- New checks for complete separation in the dataset for regressions.
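The idea behind passing weights for pre-aggregated data can be shown with a small standalone Kaplan-Meier estimator. This is a sketch under simple assumptions (right censoring only, no late entry) and is not how lifelines implements it; the function name kaplan_meier is hypothetical.

```python
from collections import defaultdict


def kaplan_meier(durations, events, weights=None):
    # Each row counts as `weight` identical subjects.
    if weights is None:
        weights = [1] * len(durations)
    deaths = defaultdict(float)
    removed = defaultdict(float)
    for t, e, w in zip(durations, events, weights):
        removed[t] += w  # everyone leaves the risk set at their duration
        if e:
            deaths[t] += w
    at_risk = float(sum(weights))
    survival = 1.0
    curve = {}
    for t in sorted(removed):
        if deaths[t] > 0:
            survival *= 1 - deaths[t] / at_risk
        curve[t] = survival
        at_risk -= removed[t]
    return curve


# Six individual rows...
expanded = kaplan_meier([1, 1, 2, 3, 3, 3], [1, 1, 0, 1, 1, 1])
# ...and the same data pre-aggregated into three weighted rows.
aggregated = kaplan_meier([1, 2, 3], [1, 0, 1], weights=[2, 1, 3])
```

Both calls produce the same curve, which is the point of the weights argument: identical rows can be collapsed before fitting.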
- removes is_significant and test_result from StatisticalResult. Users can instead choose their significance level by comparing to p_value. The string representation of this class has changed as well.
- CoxPHFitter and AalenAdditiveFitter now have a score_ property that is the concordance-index of the dataset to the fitted model.
- CoxPHFitter and AalenAdditiveFitter no longer have the data property. It was an almost duplicate of the training data, but was causing the model to be very large when serialized.
- Implements a new fitter CoxTimeVaryingFitter available under the lifelines namespace. This model implements the Cox model for time-varying covariates.
- Utils for creating time varying datasets available in utils.
- less noisy check for complete separation.
- removed datasets namespace from the main lifelines namespace
- CoxPHFitter has a slightly more intelligent (barely...) way to pick a step size, so convergence should generally be faster.
- CoxPHFitter.fit now accepts a weight_col kwarg so one can pass in weights per observation. This is very useful if you have many subjects, and the space of covariates is not large. Thus you can group the same subjects together and give that observation a weight equal to the count. Altogether, this means a much faster regression.
- removes include_likelihood from CoxPHFitter.fit - it was not slowing things down much (empirically), and often I wanted it for debugging (I suppose others do too). It's also another exit condition, so we may exit from the NR iterations faster.
- added step_size param to CoxPHFitter.fit - the default is good, but for extremely large or small datasets this may want to be set manually.
- added a warning to CoxPHFitter to check for complete separation: https://stats.idre.ucla.edu/other/mult-pkg/faq/general/faqwhat-is-complete-or-quasi-complete-separation-in-logisticprobit-regression-and-how-do-we-deal-with-them/
- Additional functionality to utils.survival_table_from_events to bin the index to make the resulting table more readable.
- No longer support matplotlib 1.X
- Adding times argument to CoxPHFitter's predict_survival_function and predict_cumulative_hazard to specify the times at which to predict the estimates, instead of using the default times of observation or censorship.
- More accurate prediction methods for parametric univariate models.
- Changing license to vanilla MIT.
- Speed up NelsonAalenFitter.fit considerably.
- Python3 fix for CoxPHFitter.plot.
- fixes regression in KaplanMeierFitter.plot when using Seaborn and lifelines.
- introduce a new .plot function to a fitted CoxPHFitter instance. This plots the hazard coefficients and their confidence intervals.
- in all plot methods, the ix kwarg has been deprecated in favour of a new loc kwarg. This is to align with Pandas deprecating ix
- fix in internal normalization for CoxPHFitter predict methods.
- corrected bug that was returning the wrong baseline survival and hazard values in CoxPHFitter when normalize=True.
- removed normalize kwarg in CoxPHFitter. This was causing lots of confusion for users, and added code complexity. It's really nice to be able to remove it.
- correcting column name in CoxPHFitter.baseline_survival_
- CoxPHFitter.baseline_cumulative_hazard_ is always centered, to mimic R's basehaz API.
- new predict_log_partial_hazards to CoxPHFitter
- adding plot_loglogs to KaplanMeierFitter
- added a (correct) check to see if some columns in a dataset will cause convergence problems.
- removing flat argument in plot methods. It was causing confusion. To replicate it, one can set ci_force_lines=True and show_censors=True.
- adding strata keyword argument to CoxPHFitter on initialization (ex: CoxPHFitter(strata=['v1', 'v2'])). Why? Fitters initialized with strata can now be passed into k_fold_cross_validation, plus it makes unit testing strata fitters easier.
- If using strata in CoxPHFitter, access to strata specific baseline hazards and survival functions are available (previously it was a blended value). Prediction also uses the specific baseline hazards/survivals.
- performance improvements in CoxPHFitter - should see at least a 10% speed improvement in fit.
- deprecates Pandas versions before 0.18.
- throw an error if there are no admissible pairs in the c-index calculation. Previously a NaN was returned.
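A simplified concordance index makes the "no admissible pairs" case concrete. This sketch is not the lifelines implementation: it ignores tied durations, the function name is reused only for readability, and the choice of ZeroDivisionError as the exception type is an assumption.

```python
def concordance_index(durations, predicted_times, events):
    # A pair (i, j) is admissible when subject i has an observed event
    # strictly before subject j's duration, so their order is known.
    concordant = 0.0
    admissible = 0
    n = len(durations)
    for i in range(n):
        for j in range(n):
            if events[i] and durations[i] < durations[j]:
                admissible += 1
                if predicted_times[i] < predicted_times[j]:
                    concordant += 1.0
                elif predicted_times[i] == predicted_times[j]:
                    concordant += 0.5  # tied predictions get half credit
    if admissible == 0:
        # With nothing to compare, the index is undefined; fail loudly
        # instead of returning NaN.
        raise ZeroDivisionError("No admissible pairs in the dataset.")
    return concordant / admissible


perfect = concordance_index([1, 2, 3], [1, 2, 3], [1, 1, 1])
reversed_order = concordance_index([1, 2, 3], [3, 2, 1], [1, 1, 1])
```

If every subject is censored, no pair is admissible and the function raises rather than silently producing an unusable value.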
- add two summary functions to Weibull and Exponential fitter, solves #224
- new prediction function in CoxPHFitter, predict_log_hazard_relative_to_mean, that mimics what R's predict.coxph does.
- removing the predict method in CoxPHFitter and AalenAdditiveFitter. This is because the choice of predict_median as a default was causing too much confusion, and no other natural choice as a default was available. All other predict_ methods remain.
- Default predict method in k_fold_cross_validation is now predict_expectation
- supports matplotlib 1.5.
- introduction of a param nn_cumulative_hazards in AalenAdditiveModel's __init__ (default True). This parameter will truncate all negative cumulative hazards in prediction methods to 0.
- bug fixes including:
  - fixed issue where the while loop in _newton_rhaphson would break too early causing a variable not to be set properly.
  - scaling of smooth hazards in NelsonAalenFitter was off by a factor of 0.5.
- reorganized lifelines directories:
  - moved test files out of main directory.
  - moved utils.py into its own directory.
  - moved all estimators to the fitters directory.
- added an at_risk column to the output of group_survival_table_from_events and survival_table_from_events
- added sample size and power calculations for statistical tests. See lifelines.statistics.sample_size_necessary_under_cph and lifelines.statistics.power_under_cph.
- fixed a bug when using KaplanMeierFitter for left-censored data.
- addition of an l2 penalizer to CoxPHFitter.
- dropped the Fortran implementation in favour of an efficient Python version. Lifelines is pure python once again!
- addition of strata keyword argument to CoxPHFitter to allow for stratification of a single or set of categorical variables in your dataset.
- datetimes_to_durations now accepts a list as na_values, so multiple values can be checked.
- fixed a bug in datetimes_to_durations where fill_date was not properly being applied.
- Changed warning in datetimes_to_durations to be correct.
- refactor each fitter into its own submodule. For now, the tests are still in the same file. This will also not break the API.
- allow for multiple fitters to be passed into k_fold_cross_validation.
- statistical tests in lifelines.statistics now return a StatisticalResult object with properties like p_value, test_results, and summary.
- fixed a bug in how log-rank statistical tests are performed. The covariance matrix was not being correctly calculated. This resulted in slightly different p-values.
- WeibullFitter, ExponentialFitter, KaplanMeierFitter and BreslowFlemingHarringtonFitter all have a conditional_time_to_event_ property that measures the median duration remaining until the death event, given survival up until time t.
- addition of median_ property to WeibullFitter and ExponentialFitter.
- WeibullFitter and ExponentialFitter will use integer timelines instead of float provided by linspace. This is so if your work is to sum up the survival function (for expected values or something similar), it's more difficult to make a mistake.
- Inclusion of the univariate fitters WeibullFitter and ExponentialFitter.
- Removing BayesianFitter from lifelines.
- Added new penalization scheme to AalenAdditiveFitter. You can now add a smoothing penalizer that will try to keep subsequent values of a hazard curve close together. The penalizing coefficient is smoothing_penalizer.
- Changed penalizer keyword arg to coef_penalizer in AalenAdditiveFitter.
- new ridge_regression function in utils.py to perform linear regression with l2 penalizer terms.
- Matplotlib is no longer a mandatory dependency.
- .predict(time) method on univariate fitters can now accept a scalar (and returns a scalar) and an iterable (and returns a numpy array)
- In KaplanMeierFitter, epsilon has been renamed to precision.
- New API for CoxPHFitter and AalenAdditiveFitter: the default arguments for event_col and duration_col. duration_col is now mandatory, and event_col now accepts a column, or by default, None, which assumes all events are observed (non-censored).
- Fix statistical tests.
- Allow negative durations in Fitters.
- New API in survival_table_from_events: min_observations is replaced by birth_times (default None).
- New API in CoxPHFitter for summary: summary will return a dataframe with statistics, print_summary() will print the dataframe (plus some other statistics) in a pretty manner.
- Adding "At Risk" counts option to univariate fitter plot methods, .plot(at_risk_counts=True), and the function lifelines.plotting.add_at_risk_counts.
- Fix bug in Epanechnikov kernel.
- move testing to py.test
- refactor tests into smaller files
- make test_pairwise_logrank_test_with_identical_data_returns_inconclusive a better test
- add test for summary()
- Alternate metrics can be used for k_fold_cross_validation.
- Lots of improvements to numerical stability (but some things still need work)
- Additions to summary in CoxPHFitter.
- Make all prediction methods output a DataFrame
- Fixes bug in 1-d input not returning in CoxPHFitter
- Lots of new tests.
- refactoring of qth_survival_times: it can now accept an iterable (or a scalar still) of probabilities in the q argument, and will return a DataFrame with these as columns. If len(q)==1 and a single survival function is given, will return a scalar, not a DataFrame. Also some good speed improvements.
- KaplanMeierFitter and NelsonAalenFitter now have a _label property that is passed in during the fit.
- KaplanMeierFitter/NelsonAalenFitter's initial alpha value is overwritten if a new alpha value is passed in during the fit.
- New method for KaplanMeierFitter: conditional_time_to. This returns a DataFrame of the estimate: med(S(t | T>s)) - s, human readable: the estimated time left of living, given an individual is aged s.
- Adds option include_likelihood to CoxPHFitter fit method to save the final log-likelihood value.
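The estimate med(S(t | T>s)) - s described for conditional_time_to can be worked through on a small step-function survival curve. The sketch below is plain Python, not lifelines code; it assumes S(t | T>s) = S(t) / S(s), represents the curve as sorted (time, survival) pairs, and uses hypothetical function names.

```python
def survival_at(curve, t):
    # Right-continuous step function; survival is 1.0 before the first time.
    value = 1.0
    for time, surv in curve:
        if time <= t:
            value = surv
        else:
            break
    return value


def conditional_median_remaining(curve, s):
    # The conditional median is the first time after s at which
    # S(t) / S(s) <= 0.5, i.e. S(t) <= 0.5 * S(s).
    target = 0.5 * survival_at(curve, s)
    for time, surv in curve:
        if time > s and surv <= target:
            return time - s
    return float("inf")  # the curve never drops far enough


curve = [(1, 0.8), (2, 0.6), (3, 0.4), (4, 0.2), (5, 0.0)]
from_birth = conditional_median_remaining(curve, 0)   # 3
after_two = conditional_median_remaining(curve, 2)    # 2
```

At s = 0 this reduces to the ordinary median survival time, and it shrinks or grows with s depending on the shape of the curve.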
- Massive speed improvements to CoxPHFitter.
- Additional prediction method: predict_percentile is available on CoxPHFitter and AalenAdditiveFitter. Given a percentile, p, this function returns the value t such that S(t | x) = p. It is a generalization of predict_median.
- Additional kwargs in k_fold_cross_validation that will accept different prediction methods (default is predict_median).
- Bug fix in CoxPHFitter predict_expectation function.
- Correct spelling mistake in newton-rhapson algorithm.
- datasets now contains functions for generating the respective datasets, ex: generate_waltons_dataset.
- Bumping up the number of samples in statistical tests to prevent them from failing so often (this is a stop-gap)
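The percentile prediction described above, the value t with S(t | x) = p, can be sketched for a single predicted step-function survival curve. This is an illustration rather than the library's code: because a step function may jump over p, the sketch returns the first time at which the curve is at or below p, and the function name percentile_time is hypothetical.

```python
def percentile_time(curve, p):
    # curve: sorted (time, survival) pairs for one subject.
    # Return the first time the survival curve reaches or falls below p.
    for time, surv in curve:
        if surv <= p:
            return time
    return float("inf")  # survival never drops to p within the curve


curve = [(1, 0.8), (2, 0.6), (3, 0.4), (4, 0.2), (5, 0.0)]
median_time = percentile_time(curve, 0.5)      # p = 0.5 is the median
lower_quintile = percentile_time(curve, 0.2)
```

Setting p = 0.5 recovers the median, which is why the percentile method is described as a generalization of predict_median.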
- pep8 everything
- Ability to specify default printing in statistical tests with the suppress_print keyword argument (default False).
- For the multivariate log rank test, the inverse step has been replaced with the generalized inverse. This seems to be what other packages use.
- Adding more robust cross validation scheme based on issue #67.
- fixing regression_dataset in datasets.
- CoxFitter is now known as CoxPHFitter
- refactoring some tests that used redundant data from lifelines.datasets.
- Adding cross validation: in utils is a new k_fold_cross_validation for model selection in regression problems.
- Change CoxPHFitter's fit method's display_output to False.
- fixing bug in CoxPHFitter's _compute_baseline_hazard that errored when sending Series objects to survival_table_from_events.
- CoxPHFitter's fit now looks to columns with too low variance, and halts NR algorithm if a NaN is found.
- Adding a Changelog.
- more sanitizing for the statistical tests =)
- CoxFitter implements Cox Proportional Hazards model in lifelines.
- lifelines moves to wheels distributions.
- tests in the statistics module now print the summary (and still return the regular values)
- new BaseFitter class is inherited by all fitters.