Add an option to order by ascending/descending prediction in cumulative effect curves

### Describe the feature and the current state.
In the causal validation module's [curves file](https://github.com/nubank/fklearn/blob/master/src/fklearn/causal/validation/curves.py), it would be useful to add an `ascending` parameter to the cumulative effect and cumulative gain curves.

Currently, predictions are always [ordered descending](https://github.com/nubank/fklearn/blob/ac4a3ee958392a9de7c6f941cb75997559432c94/src/fklearn/causal/validation/curves.py#L98):
```python
ordered_df = df.sort_values(prediction, ascending=False).reset_index(drop=True)
```

If we add an `ascending: bool = False` argument to `cumulative_effect_curve`, `cumulative_gain_curve`, `relative_cumulative_gain_curve`, and `effect_curves`, users could choose whether these curves are computed in ascending or descending order of the prediction column.

### Will this change a current behavior? How?
No, unless the user explicitly passes `ascending=True`. In that case, the cumulative effect and cumulative gain curves are computed with the prediction column sorted in ascending order.

A model's prediction is not necessarily positively related to the effect being measured. An option to reverse the ordering lets effects and gains be computed correctly when predictions and outcomes are negatively related.

One current workaround is to negate the prediction column:
```python
df["prediction"] = -df["prediction"]
```

After that, the curves are computed as intended. This feels like a hack, though, and is probably something we want to solve more cleanly.
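To illustrate why the workaround and the proposed argument should be interchangeable, here is a minimal sketch on a toy DataFrame. The column names and values are made up for illustration, and the equivalence is only guaranteed when the prediction column has no ties, since tied rows may come out in a different relative order.

```python
import pandas as pd

# Toy data, invented for this example only.
df = pd.DataFrame({
    "prediction": [0.9, 0.1, 0.5, 0.3],
    "outcome": [1.0, 4.0, 2.0, 3.0],
})

# Proposed behaviour: sort ascending directly.
ascending_order = (
    df.sort_values("prediction", ascending=True).reset_index(drop=True)
)

# Current workaround: negate the prediction, then rely on the existing
# descending sort.
negated = df.assign(prediction=-df["prediction"])
workaround_order = (
    negated.sort_values("prediction", ascending=False).reset_index(drop=True)
)

# Both approaches visit the rows in the same order (no ties here).
same_row_order = ascending_order["outcome"].equals(workaround_order["outcome"])
```

The difference is that the workaround mutates (or copies) the prediction column, while an `ascending` argument leaves the data untouched.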

### Additional Information

The new definition of `cumulative_effect_curve` would look like this:

```python
@curry
def cumulative_effect_curve(df: pd.DataFrame,
                            treatment: str,
                            outcome: str,
                            prediction: str,
                            min_rows: int = 30,
                            steps: int = 100,
                            effect_fn: EffectFnType = linear_effect,
                            ascending: bool = False) -> np.ndarray:
    """
    Orders the dataset by prediction and computes the cumulative effect curve according to that ordering

    Parameters
    ----------
    df : Pandas' DataFrame
        A Pandas' DataFrame with target and prediction scores.

    treatment : Strings
        The name of the treatment column in `df`.

    outcome : Strings
        The name of the outcome column in `df`.

    prediction : Strings
        The name of the prediction column in `df`.

    min_rows : Integer
        Minimum number of observations needed to have a valid result.

    steps : Integer
        The number of cumulative steps to iterate when accumulating the effect

    effect_fn : function (df: pandas.DataFrame, treatment: str, outcome: str) -> int or Array of int
        A function that computes the treatment effect given a dataframe, the name of the treatment column and the name
        of the outcome column.

    ascending : bool
        Whether to sort the prediction column in ascending order. Default is False (descending).


    Returns
    ----------
    cumulative effect curve: Numpy's Array
        The cumulative treatment effect according to the predictions ordering.
    """

    size = df.shape[0]
    ordered_df = df.sort_values(prediction, ascending=ascending).reset_index(drop=True)
    n_rows = list(range(min_rows, size, size // steps)) + [size]
    return np.array([effect_fn(ordered_df.head(rows), treatment, outcome) for rows in n_rows])
```
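As a rough usage illustration, the sketch below reproduces the body of the proposed function without the `@curry` decorator so that it runs without fklearn installed. The names `cumulative_effect_curve_sketch` and `slope_effect` are hypothetical stand-ins, not fklearn functions, and the toy data is constructed so that rows with low predictions have the larger treatment effect.

```python
import numpy as np
import pandas as pd


def slope_effect(df, treatment, outcome):
    # Hypothetical effect function: slope of outcome regressed on treatment,
    # i.e. cov(treatment, outcome) / var(treatment).
    cov = np.cov(df[treatment].to_numpy(), df[outcome].to_numpy())
    return cov[0, 1] / cov[0, 0]


def cumulative_effect_curve_sketch(df, treatment, outcome, prediction,
                                   min_rows=30, steps=100,
                                   effect_fn=slope_effect, ascending=False):
    # Same body as the proposed definition, minus currying and type hints.
    size = df.shape[0]
    ordered_df = df.sort_values(prediction, ascending=ascending).reset_index(drop=True)
    n_rows = list(range(min_rows, size, size // steps)) + [size]
    return np.array([effect_fn(ordered_df.head(rows), treatment, outcome)
                     for rows in n_rows])


# Toy data: the effect is 3 for low predictions and 1 for high predictions.
toy = pd.DataFrame({
    "prediction": [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9],
    "treatment":  [0, 1, 0, 1, 0, 1, 0, 1],
    "outcome":    [0.0, 3.0, 0.0, 3.0, 0.0, 1.0, 0.0, 1.0],
})

# With min_rows=4 and steps=2 the curve is evaluated at 4 and 8 rows.
ascending_curve = cumulative_effect_curve_sketch(
    toy, "treatment", "outcome", "prediction",
    min_rows=4, steps=2, ascending=True)    # approximately [3., 2.]
descending_curve = cumulative_effect_curve_sketch(
    toy, "treatment", "outcome", "prediction",
    min_rows=4, steps=2, ascending=False)   # approximately [1., 2.]
```

With `ascending=True` the curve starts at the high-effect rows, which is the ordering one would want when the prediction is negatively related to the effect. Both curves end at the same value because the last point always uses the full dataset.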

