- Adds support for h2o
- Adds support for keras
- Fixes problem with FeatureImp that caused unused features to get non-zero importances
- Removes the `run` parameter from all interpretation methods.
- Adds class `FeatureEffects`, which wraps `FeatureEffect` and computes feature effects for all features of a model with one call.
- Adds column ".type" to the `$result` data.frame of `FeatureEffect` when `method="ale"` and the feature is categorical.
- Adds parameter `ylim` to `FeatureEffect$plot` to manually set the limits of the y-axis for feature effect plots with one feature.
- Adds `predict` method to `FeatureEffect`, which predicts the marginal effect for data instances.
- Fix vignette titles
- Some bigger changes in the feature importance class `FeatureImp`:
  - The `method` argument was removed; only shuffling is now possible. This means the cartesian product of all data points with all data points is no longer an option. It was never really practical to use, except for toy examples.
  - The importance plot shows the name of the loss function in the x-axis label.
  - The importance plot shows the quantiles of importance over the different repetitions.
  - Default number of repetitions increased to 5.
- The
- Fixes problems with missing centering of ALE plots when using multiclass
- Automatically extracts data and target from the model when possible (based on the `prediction::find_data` function). Data extraction doesn't work with mlr, but target extraction does.
- Feature importance (`FeatureImp`) previously always returned the ratio of permuted model error and original model error. With 0.7.2 the user can choose between the ratio (default) and the difference.
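
The two comparison modes described above can be illustrated with a small language-agnostic sketch in Python. This is not iml's R implementation; the helper name `compare_errors` and the error values are hypothetical and chosen only to show how the ratio and the difference relate.

```python
def compare_errors(original_error, permuted_error, compare="ratio"):
    """Compare the model error after permuting a feature with the original error.

    Hypothetical helper for illustration only; it is not part of iml.
    """
    if compare == "ratio":
        return permuted_error / original_error
    if compare == "difference":
        return permuted_error - original_error
    raise ValueError("compare must be 'ratio' or 'difference'")


# Made-up error values: permuting the feature raises the error from 2.0 to 5.0.
original, permuted = 2.0, 5.0
ratio = compare_errors(original, permuted, compare="ratio")
difference = compare_errors(original, permuted, compare="difference")
print(ratio, difference)
```

Under this sketch, a feature the model does not rely on would give a ratio near 1 or a difference near 0.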
- Fixes problems with wrong computation of feature importance, feature effects and so on for xgboost models.
- The `Partial` class is deprecated and will be removed in future versions. Use `FeatureEffect` instead. Its usage is similar to `Partial`, but the `aggregation` and `ice` arguments are now combined in the new `method` argument, where you can choose between 'ale', 'pdp', 'ice' and 'pdp+ice'.
- Introduced ALE plots into the `FeatureEffect` class (`method='ale'`). They are now the default instead of PDPs, because they are faster and unbiased.
- Plot for categorical features in PDP changed: bar plots are now shown instead of boxplots when `method='pdp'`.
- Removed losses: f1, logLoss, rmse, mdae, rae, rmsle, rse, rrse, because the implementation used didn't make sense anyway.
- Interaction: The results now return the H-statistic as interaction strength instead of the H-squared-statistic. This makes it more coherent with the gbm package and the interact.gbm function, and with what Friedman uses in the plots in the paper. For users of the package, this means that an interaction of strength x becomes an interaction of strength sqrt(x).
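
The conversion from the old to the new interaction strength mentioned above is a plain square root. A minimal Python sketch, assuming a hypothetical helper `h_from_h_squared` and a made-up old value (neither comes from the package):

```python
import math


def h_from_h_squared(h_squared):
    """Convert an H-squared interaction strength to the H-statistic.

    Hypothetical helper for illustration only; it is not part of iml.
    """
    if h_squared < 0:
        raise ValueError("h_squared must be non-negative")
    return math.sqrt(h_squared)


# Made-up example: an interaction previously reported as 0.25.
old_strength = 0.25
new_strength = h_from_h_squared(old_strength)
print(new_strength)
```

For strengths strictly between 0 and 1, the value reported after this change is larger than the one reported before it, even though the underlying interaction is the same.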
- `Interaction`, `FeatureImp` and `Partial` are now computed batch-wise in the background. This prevents these methods from overloading the memory. For that, the `Predictor` has a new init argument 'batch.size', which limits the number of rows sent to the model for prediction for the methods `Interaction`, `FeatureImp` and `Partial`.
- `Interaction` and `FeatureImp` additionally allow parallel computation on multiple cores. See `vignette("parallel", package = "iml")` for how to use it.
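
The idea behind batch-wise prediction can be sketched in a few lines of Python. This is a generic illustration of capping the number of rows sent to a model per call, not the package's internal code; `predict_in_batches` and `toy_model` are hypothetical names.

```python
def predict_in_batches(predict_fun, rows, batch_size):
    """Call predict_fun on at most batch_size rows at a time and join the results.

    Hypothetical helper for illustration only; it is not part of iml.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    predictions = []
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        predictions.extend(predict_fun(batch))
    return predictions


# Toy "model" that doubles its input and records how many rows each call received.
batch_sizes_seen = []


def toy_model(batch):
    batch_sizes_seen.append(len(batch))
    return [2 * x for x in batch]


result = predict_in_batches(toy_model, list(range(10)), batch_size=4)
print(batch_sizes_seen, result)
```

The predictions are the same as for a single call on all rows; only the peak number of rows held by the model at once is limited.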
- The `Predictor` can be initialized with a `type` (e.g. `type = "prob"`), which is more convenient than writing a custom `predict.fun`. For caret classification models, the default is now to return the response, so make sure to initialize the `Predictor` with `type = "prob"` for fine-grained results.
- It's easier to use classifiers that output class labels and no probabilities. No warning will be issued anymore. Internally, the class labels are treated as probabilities (one column per class), where the probability for the predicted class is 1 and 0 for the others.
- `FeatureImp` supports the `n.repetitions` parameter, which controls the number of repetitions of the feature shuffling.
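
The treatment of class labels as probabilities described above (one column per class, 1 for the predicted class and 0 for the others) can be sketched as follows. This is a Python illustration of the idea under stated assumptions, not the package's R code; `labels_to_probabilities` and the class names are hypothetical.

```python
def labels_to_probabilities(labels, classes):
    """Turn predicted class labels into one row of 0/1 'probabilities' per label.

    Hypothetical helper for illustration only; it is not part of iml.
    """
    return [[1.0 if label == cls else 0.0 for cls in classes] for label in labels]


# Made-up class names and predictions.
classes = ["setosa", "versicolor", "virginica"]
probs = labels_to_probabilities(["versicolor", "setosa"], classes)
print(probs)
```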
- Implemented Interaction measure
- Removed `feature.index` variable from `Partial` and renamed `.class.name` column in results to `.class`.
- `object$run()` does not return `self` any longer. This means using `object$set.feature()`, for example, does not automatically print the object summary any longer.
- Added an introductory vignette.
- Fixed an issue where the Predictor would not store X, when y is given as character.
- The column names of the data.frames with the results of the interpretation methods start with "." instead of "..". This is due to a recent change in the data.table package v1.10.5 news item 18.
- Removed the deprecated classes `PartialDependence` and `Ice`. Use `Partial` instead.
- FeatureImp$results column permutationError renamed to permutation.error
- Allow setting distance function in LocalModel
- Merge the classes Ice and PartialDependence into Partial
- The newly introduced Partial class can plot ice and pd curves, also in the same plot
- It is now possible to center partial dependence plots
- `obj$results` has a new column "type", which contains either "ice" or "pdp". The column "..individual" was renamed to "..id" and "y.hat" has been renamed to "..y.hat".
- Ice and PartialDependence will be deprecated starting from 0.4.x
- Adds argument and field types in the documentation
- The API has been reworked:
  - User directly interacts with R6 classes (`pdp()` is now `PartialDependence$new()`).
  - User has to wrap the machine learning model with `Predictor$new()`.
  - New data points in `Shapley` and `LocalModel` can be set with `$explain()`.
  - `Lime` has been renamed to `LocalModel`.
- Plots have been improved.
- Documentation has been improved.
Initial release