TreeSHAP, libxgboost, and implications for predict function #169

I am looking to 'modernize' my approach and switch from partial dependence plots to Shapley plots. Shapley values are computationally demanding, so I would like to take advantage of the TreeSHAP algorithm that is built into libxgboost. This feature is accessible via the predict function by using the keyword parameter 'pred_contribs'; see [libxgboost predict options](https://xgboost.readthedocs.io/en/stable/prediction.html).

Although XGBoost.predict accepts keyword parameters, only a limited set is passed on to libxgboost:
```
opts = Dict("type"=>(margin ? 1 : 0),
                "iteration_begin"=>ntree_lower_limit,
                "iteration_end"=>ntree_limit,
                "strict_shape"=>false,
                "training"=>training,
               ) |> JSON3.write
```

As a short-term solution, I can write a personalized version that allows additional keyword parameters. I also realize that the current approach reduces the risk of breaking older code.
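A minimal sketch of what such a personalized version would need to build, assuming my reading of the libxgboost C API documentation is right that the "type" field selects the kind of prediction (0 = value, 1 = margin, 2 = contributions, 3 = approximate contributions, 4 = interactions, 5 = approximate interactions, 6 = leaf). The names `PREDICTION_TYPES` and `prediction_options` are hypothetical and not part of XGBoost.jl; the returned Dict would still need to be serialized (e.g. with `JSON3.write`, as in the excerpt above) before being handed to libxgboost.

```julia
# Hypothetical mapping from a readable symbol to the integer "type" code.
# The codes are an assumption taken from the libxgboost C API documentation.
const PREDICTION_TYPES = Dict(
    :value => 0,
    :margin => 1,
    :contribs => 2,
    :approx_contribs => 3,
    :interactions => 4,
    :approx_interactions => 5,
    :leaf => 6,
)

# Hypothetical helper: build the prediction options as a Dict.
# The keys mirror those in the existing `opts` Dict; only "type" gains new values.
function prediction_options(kind::Symbol=:value;
                            ntree_lower_limit::Integer=0,
                            ntree_limit::Integer=0,
                            strict_shape::Bool=true,
                            training::Bool=false)
    haskey(PREDICTION_TYPES, kind) ||
        throw(ArgumentError("unknown prediction kind: $kind"))
    return Dict{String,Any}(
        "type" => PREDICTION_TYPES[kind],
        "iteration_begin" => ntree_lower_limit,
        "iteration_end" => ntree_limit,
        "strict_shape" => strict_shape,
        "training" => training,
    )
end

# Options for a TreeSHAP contribution request.
opts = prediction_options(:contribs)
```

Defaulting `strict_shape` to true here is a design choice for a new function only; the existing predict would keep `false` so that older code sees the same shapes as before.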

There are three parameters (pred_contribs, pred_interactions, and pred_leaf) that would be handy to have available. Adding them adds complexity because each changes the shape of the data returned. Perhaps there is a role for a separate function, e.g. 'predict_shapley', that specifically handles these additional parameters; this would be least likely to break any pre-written code. As a new function, it could use 'strict_shape=true' from the start with less hassle, and users could code with that in mind.

Currently multi:softmax and multi:softprob add an additional dimension and need separate coding; 'strict_shape' adds a dimension called 'group' so that all objectives return the same number of dimensions. The TreeSHAP algorithms return additional dimension(s) and, as we found with the multi: models, those arrays are row major (C standard) whereas Julia is column major, so reshaping 3- (or 4-) dimensional arrays gets complicated.

Thank you for your consideration.
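To illustrate the row-major issue, here is a minimal sketch in plain Julia (no XGBoost calls), assuming libxgboost hands back a flat buffer plus a shape given in C order, e.g. (rows, groups, features + 1) for contributions with strict_shape enabled. The function name `rowmajor_to_julia` and the sizes used are hypothetical.

```julia
# Hypothetical helper: turn a flat row-major (C order) buffer with C-order shape
# `cshape` into a Julia array indexed the same way as the C array.
function rowmajor_to_julia(buf::AbstractVector, cshape::NTuple{N,Int}) where {N}
    length(buf) == prod(cshape) ||
        throw(ArgumentError("buffer length does not match shape"))
    # Read as column-major, the same memory has its dimensions in reverse order.
    reversed = reshape(buf, reverse(cshape))
    # Reverse the axes so that A[i, j, k] corresponds to C's a[i-1][j-1][k-1].
    return permutedims(reversed, ntuple(d -> N - d + 1, N))
end

# Made-up sizes: 2 rows, 3 groups, 4 columns (which would be features + 1).
n_rows, n_groups, n_cols = 2, 3, 4
buf = Float32.(0:(n_rows * n_groups * n_cols - 1))  # stand-in for the C buffer
A = rowmajor_to_julia(buf, (n_rows, n_groups, n_cols))
```

The same helper covers the 4-dimensional case, since the permutation is derived from the number of dimensions. Note that `permutedims` copies the data, which may matter for large interaction arrays.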
