Commit 4d951e2

based on some observed uses, add a section for lifelines.utils in quickstart
1 parent b5c79dc commit 4d951e2

2 files changed: 34 additions, 5 deletions

README.md

Lines changed: 2 additions & 1 deletion
@@ -12,7 +12,8 @@ But outside of medicine and actuarial science, there are many other interesting
 lesser-known technique, for example:
 - SaaS providers are interested in measuring customer lifetimes, or time to first behaviours
 - sociologists are interested in measuring political parties' lifetimes, or relationships, or marriages
-- businesses are interested in what variables affect lifetime value
+- analysing [Godwin's law](https://raw.githubusercontent.com/lukashalim/GODWIN/master/Kaplan-Meier-Godwin.png) in Reddit comments
+- A/B tests to determine how long it takes different groups to perform an action.
 
 *lifelines* is a pure Python implementation of the best parts of survival analysis. We'd love to hear if you are using *lifelines*, please leave an Issue and let us know your thoughts on the library.
 
docs/Quickstart.rst

Lines changed: 32 additions & 4 deletions
@@ -24,7 +24,17 @@ Let's start by importing some data. We need the durations that individuals are o
 .. code:: python
 
     from lifelines.datasets import load_waltons
-    df = load_waltons() # returns a pandas DataFrame
+    df = load_waltons() # returns a Pandas DataFrame
+
+    print df.head()
+    """
+        T  E    group
+    0   6  1  miR-137
+    1  13  1  miR-137
+    2  13  1  miR-137
+    3  13  1  miR-137
+    4  19  1  miR-137
+    """
 
     T = df['T']
     E = df['E']
@@ -56,12 +66,12 @@ Multiple groups
 .. code:: python
 
     groups = df['group']
-    ix = (groups == 'control')
+    ix = (groups == 'miR-137')
 
-    kmf.fit(T[ix], E[ix], label='control')
+    kmf.fit(T[~ix], E[~ix], label='control')
     ax = kmf.plot()
 
-    kmf.fit(T[~ix], E[~ix], label='miR-137')
+    kmf.fit(T[ix], E[ix], label='miR-137')
     kmf.plot(ax=ax)
 
 .. image:: images/quickstart_multi.png
@@ -79,6 +89,24 @@ but instead of a ``survival_function_`` being exposed, a ``cumulative_hazard_``
 
 .. note:: Similar to Scikit-Learn, all statistically estimated quanities append an underscore to the property name.
 
+Getting Data in The Right Format
+---------------------------------
+
+Often you'll have data that looks like:
+
+*start_time*, *end_time*
+
+Lifelines has some utility functions to transform this dataset into durations and censorships:
+
+.. code:: python
+
+    from lifelines.utils import datetimes_to_durations
+
+    # start_times is a vector of datetime objects
+    # end_times is a vector of (possibly missing) datetime objects.
+    T, C = datetimes_to_durations(start_times, end_times, freq='h')
+
+
 Survival Regression
 ---------------------------------
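
The conversion described in the new "Getting Data in The Right Format" section can be sketched in plain Python. This is a minimal illustration of the idea only, not the `lifelines` implementation: `durations_from_datetimes`, `fill_date`, and `seconds_per_unit` are hypothetical names. The sketch assumes a missing end time is represented by `None` and means the subject is censored at `fill_date`.

```python
from datetime import datetime


def durations_from_datetimes(start_times, end_times, fill_date, seconds_per_unit=3600.0):
    """Hypothetical helper: turn start/end datetimes into (durations, observed).

    Durations are expressed in units of `seconds_per_unit` (hours by default).
    """
    durations, observed = [], []
    for start, end in zip(start_times, end_times):
        was_observed = end is not None
        # A missing end time means the event has not happened yet, so the
        # duration is measured up to fill_date and flagged as censored.
        stop = end if was_observed else fill_date
        durations.append((stop - start).total_seconds() / seconds_per_unit)
        observed.append(was_observed)
    return durations, observed


start_times = [datetime(2014, 1, 1, 0), datetime(2014, 1, 1, 12)]
end_times = [datetime(2014, 1, 1, 6), None]  # second subject has no end time

T, C = durations_from_datetimes(start_times, end_times, fill_date=datetime(2014, 1, 2, 0))
print(T)  # durations in hours
print(C)  # True where the end time was observed
```

For real data, `datetimes_to_durations` from `lifelines.utils` is the function the quickstart points to. The sketch marks observed events as `True`; the library's documentation should be checked for the convention it uses before passing the result to `kmf.fit`.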
