Labels: bug (Something isn't working)
Description
Describe the bug
Hi Team!
I used get_cumlift(), and got the lift for S-Learner like this:

When I tried to reproduce the result by calculating it manually, I got a different result from the one returned by get_cumlift().
sorted_df = df_try.sort_values(col, ascending=False).reset_index(drop=True)
sorted_df.index = sorted_df.index + 1
sorted_df["cumsum_tr"] = sorted_df['w'].cumsum()
sorted_df["cumsum_ct"] = sorted_df.index.values - sorted_df["cumsum_tr"]
sorted_df["cumsum_y_tr"] = (sorted_df['y'] * sorted_df['w']).cumsum()
sorted_df["cumsum_y_ct"] = (sorted_df['y'] * (1 - sorted_df['w'])).cumsum()
And then I calculate the lift:
lift=[]
lift.append(sorted_df["cumsum_y_tr"] / sorted_df["cumsum_tr"] - sorted_df["cumsum_y_ct"] / sorted_df["cumsum_ct"])
lift = pd.concat(lift, join="inner", axis=1)
lift.loc[0] = np.zeros((lift.shape[1],))
lift = lift.sort_index().interpolate()
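For anyone who wants to reproduce this without my data, here is a minimal self-contained sketch of the manual calculation above, wrapped in a function. The function name `manual_cumlift`, the toy DataFrame, and the `score` column are made up for illustration. The `kind="mergesort"` argument is an assumption I added so that rows with tied scores keep their original order; it is not in the snippet above.

```python
import numpy as np
import pandas as pd


def manual_cumlift(df, col, outcome_col="y", treatment_col="w"):
    """Cumulative lift of `df` ranked by `col`, highest score first."""
    # mergesort is a stable sort: rows with equal scores keep their original
    # order (an assumption added here, not part of the snippet above).
    sorted_df = df.sort_values(col, ascending=False, kind="mergesort").reset_index(drop=True)
    sorted_df.index = sorted_df.index + 1  # 1-based rank = cumulative population size

    # Cumulative counts and outcome sums for treatment and control.
    cumsum_tr = sorted_df[treatment_col].cumsum()
    cumsum_ct = sorted_df.index.values - cumsum_tr
    cumsum_y_tr = (sorted_df[outcome_col] * sorted_df[treatment_col]).cumsum()
    cumsum_y_ct = (sorted_df[outcome_col] * (1 - sorted_df[treatment_col])).cumsum()

    # Difference of cumulative mean outcomes; NaN where a group is still empty.
    lift = (cumsum_y_tr / cumsum_tr - cumsum_y_ct / cumsum_ct).to_frame(col)
    lift.loc[0] = np.zeros((lift.shape[1],))  # start the curve at zero
    return lift.sort_index().interpolate()    # fill NaN rows linearly


# Hypothetical toy data, only to show the shape of the output.
toy = pd.DataFrame({
    "y": [1.0, 0.0, 1.0, 0.0],
    "w": [1, 0, 1, 0],
    "score": [0.9, 0.8, 0.7, 0.6],
})
toy_lift = manual_cumlift(toy, "score")
print(toy_lift)
```

In this toy example the first ranked row has no control observations yet, so its lift is NaN before interpolation and gets filled from its neighbours.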
This is what the final result looks like:

I plotted the difference between the result from get_cumlift() and the manual calculation.
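The comparison can also be summarized numerically. The sketch below is only an illustration: `compare_lift` is a hypothetical helper, and the two series hold made-up numbers rather than my actual results.

```python
import pandas as pd


def compare_lift(lift_a, lift_b):
    """Return the pointwise difference of two lift curves and a short summary."""
    diff = (lift_a - lift_b).dropna()  # pandas aligns the two series on their index
    summary = {
        "max_abs_diff": float(diff.abs().max()),
        "index_of_max": diff.abs().idxmax(),
    }
    return diff, summary


# Made-up values standing in for the library result and the manual result.
library_lift = pd.Series([0.0, 0.5, 1.0, 1.0], name="library")
manual_lift = pd.Series([0.0, 0.5, 0.75, 1.0], name="manual")

diff, summary = compare_lift(library_lift, manual_lift)
print(summary)
```

Knowing where the largest gap sits (early ranks versus the tail) may help narrow down which step of the calculation diverges.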

Does anyone know why they are different?
Environment (please complete the following information):
- OS: Windows
- Python Version: 3.8
- Versions of Major Dependencies (pandas, scikit-learn, cython): pandas==1.3.5, scikit-learn==1.0.2, cython==0.29.34
