counterfactuals (version 0.1.2)

Counterfactuals: Counterfactuals Class

Description

A Counterfactuals object should be created by the $find_counterfactuals method of CounterfactualMethodRegr or CounterfactualMethodClassif. It contains the counterfactuals and has several methods for their evaluation and visualization.
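As an illustration of how such an object is typically obtained, the following sketch trains a random forest on iris and requests counterfactuals through WhatIfClassif. It assumes that the counterfactuals, iml, and randomForest packages are installed and that WhatIfClassif accepts an n_counterfactuals argument. The model, the data, and the chosen observation are arbitrary choices for this example.

```r
# Guarded so that the sketch is a no-op when the packages are not installed.
have_pkgs <- all(vapply(
  c("counterfactuals", "iml", "randomForest"),
  requireNamespace, logical(1), quietly = TRUE
))

if (have_pkgs) {
  set.seed(1)
  rf <- randomForest::randomForest(Species ~ ., data = iris)
  predictor <- iml::Predictor$new(rf, data = iris, y = "Species", type = "prob")

  # Assumption: WhatIfClassif is available as a classification method
  # and takes the number of counterfactuals via n_counterfactuals.
  wi <- counterfactuals::WhatIfClassif$new(predictor, n_counterfactuals = 5L)
  cfactuals <- wi$find_counterfactuals(
    x_interest = iris[150L, ],
    desired_class = "versicolor",
    desired_prob = c(0.5, 1)
  )

  print(cfactuals$data)        # the counterfactuals for x_interest
  print(cfactuals$evaluate())  # counterfactuals plus evaluation measures
}
```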

Active bindings

desired

(list(1) | list(2))
A list with the desired properties of the counterfactuals. For regression tasks it has one element desired_outcome (CounterfactualMethodRegr) and for classification tasks two elements desired_class and desired_prob (CounterfactualMethodClassif).

data

(data.table)
The counterfactuals for x_interest.

x_interest

(data.table(1))
A single row with the observation of interest.

distance_function

(function())
The distance function used by the distance-based evaluation measures (dist_x_interest and dist_train). The function must have three arguments, x, y, and data, and return a numeric matrix. If set to NULL (default), Gower distance (Gower 1971) is used.
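A custom distance function with the required three arguments could look like the sketch below. The name scaled_l1_distance is hypothetical, and the output layout (one row per row of x, one column per row of y) is an assumption, since the text above only requires a numeric matrix. The function ignores non-numeric features, which a real replacement for Gower distance would probably need to handle.

```r
# Hypothetical custom distance: mean range-scaled absolute difference
# over the numeric features. Not part of the package.
scaled_l1_distance <- function(x, y, data) {
  x <- as.data.frame(x)
  y <- as.data.frame(y)
  data <- as.data.frame(data)

  # Feature ranges are taken from the reference data.
  num_cols <- names(data)[vapply(data, is.numeric, logical(1))]
  ranges <- vapply(data[num_cols], function(col) diff(range(col)), numeric(1))
  ranges[ranges == 0] <- 1  # avoid division by zero for constant features

  x_mat <- as.matrix(x[, num_cols, drop = FALSE])
  y_mat <- as.matrix(y[, num_cols, drop = FALSE])

  # Assumed layout: nrow(x) rows and nrow(y) columns.
  out <- matrix(0, nrow = nrow(x_mat), ncol = nrow(y_mat))
  for (i in seq_len(nrow(x_mat))) {
    abs_diff <- abs(sweep(y_mat, 2, x_mat[i, ], "-"))
    out[i, ] <- rowMeans(sweep(abs_diff, 2, ranges, "/"))
  }
  out
}

features <- iris[, 1:4]
d <- scaled_l1_distance(features[1, ], features[1:3, ], data = features)
```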

method

(character)
Name of the method with which the counterfactuals were generated.

Methods


Method new()

Creates a new Counterfactuals object. This method should only be called by the $find_counterfactuals methods of CounterfactualMethodRegr and CounterfactualMethodClassif.

Usage

Counterfactuals$new(
  cfactuals,
  predictor,
  x_interest,
  param_set,
  desired,
  method = NULL
)

Arguments

cfactuals

(data.table)
The counterfactuals. Must have the same column names and types as predictor$data$X.

predictor

(Predictor)
The object (created with iml::Predictor$new()) holding the machine learning model and the data.

x_interest

(data.table(1) | data.frame(1))
A single row with the observation of interest.

param_set

(ParamSet)
A ParamSet based on the features of predictor$data$X.

desired

(list(1) | list(2))
A list with the desired properties of the counterfactuals. It should have one element desired_outcome for regression tasks (CounterfactualMethodRegr) and two elements desired_class and desired_prob for classification tasks (CounterfactualMethodClassif).

method

(character)
Name of the method with which counterfactuals were generated. Default is NULL which means that no name is provided.


Method evaluate()

Evaluates the counterfactuals. It returns the counterfactuals together with the evaluation measures.

Usage

Counterfactuals$evaluate(
  measures = c("dist_x_interest", "dist_target", "no_changed", "dist_train",
    "minimality"),
  show_diff = FALSE,
  k = 1L,
  weights = NULL
)

Arguments

measures

(character)
The name of one or more evaluation measures. The following measures are available:

  • dist_x_interest: The distance of a counterfactual to x_interest measured by Gower's dissimilarity measure (Gower 1971).

  • dist_target: The absolute distance of the prediction for a counterfactual to the interval desired_outcome (regression tasks) or desired_prob (classification tasks).

  • no_changed: The number of feature changes w.r.t. x_interest.

  • dist_train: The (weighted) distance to the k nearest training data points measured by Gower's dissimilarity measure (Gower 1971).

  • minimality: The number of changed features that could each be set back to their value in x_interest while keeping the desired prediction value.
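To make the first three measures concrete, here is a base R sketch on made-up data. It is not the package's implementation: the helper names (differs, gower_dist, dist_to_interval), the toy data, and the prediction value 0.3 are all assumptions for illustration.

```r
# Toy training data, observation of interest, and one counterfactual.
train <- data.frame(
  age = c(20, 30, 40, 60),
  job = factor(c("a", "b", "a", "c"))
)
x_interest <- data.frame(age = 30, job = factor("a", levels = levels(train$job)))
cf <- data.frame(age = 40, job = factor("b", levels = levels(train$job)))

# TRUE if two single feature values differ.
differs <- function(u, v) {
  if (is.numeric(u)) u != v else as.character(u) != as.character(v)
}

# Gower dissimilarity between two single rows: range-scaled absolute
# difference for numeric features, 0/1 mismatch otherwise, then averaged.
gower_dist <- function(a, b, data) {
  per_feature <- vapply(names(data), function(f) {
    if (is.numeric(data[[f]])) {
      rng <- diff(range(data[[f]]))
      if (rng == 0) 0 else abs(a[[f]] - b[[f]]) / rng
    } else {
      as.numeric(differs(a[[f]], b[[f]]))
    }
  }, numeric(1))
  mean(per_feature)
}

# Absolute distance of a prediction to a closed interval (0 if inside).
dist_to_interval <- function(pred, interval) {
  max(interval[1] - pred, 0, pred - interval[2])
}

dist_x_interest <- gower_dist(cf, x_interest, train)  # (10 / 40 + 1) / 2
no_changed <- sum(vapply(
  names(cf), function(f) differs(cf[[f]], x_interest[[f]]), logical(1)
))
dist_target <- dist_to_interval(0.3, c(0.5, 1))       # prediction below interval
```

In this toy case both features change, so no_changed is 2, and the prediction 0.3 lies 0.2 below the desired interval.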

show_diff

(logical(1))
Should the counterfactuals be displayed as their differences to x_interest? Default is FALSE. If set to TRUE, positive values for numeric features indicate an increase compared to the feature value in x_interest, negative values indicate a decrease. For factors, the feature value is displayed if it differs from x_interest; NA means "no difference" in both cases.
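The difference display described above can be mimicked in base R as follows. This is only a sketch of the described behaviour, with a hypothetical helper as_diff and made-up data. The package's own output may differ in column types and formatting.

```r
x_interest <- data.frame(age = 30, job = factor("a", levels = c("a", "b")))
cfs <- data.frame(
  age = c(30, 42, 25),
  job = factor(c("b", "a", "b"), levels = c("a", "b"))
)

# Hypothetical helper: numeric columns become signed differences,
# other columns keep their value only where it differs; NA = no difference.
as_diff <- function(cfs, x_interest) {
  out <- cfs
  for (f in names(cfs)) {
    if (is.numeric(cfs[[f]])) {
      d <- cfs[[f]] - x_interest[[f]]
      d[d == 0] <- NA
      out[[f]] <- d
    } else {
      v <- as.character(cfs[[f]])
      v[v == as.character(x_interest[[f]])] <- NA
      out[[f]] <- v
    }
  }
  out
}

diffs <- as_diff(cfs, x_interest)
print(diffs)
```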

k

(integerish(1))
How many nearest training points should be considered for computing the dist_train measure? Default is 1L.

weights

(numeric(k) | NULL)
How should the k nearest training points be weighted when computing the dist_train measure? If NULL (default) then all k points are weighted equally. If a numeric vector of length k is given, the i-th element specifies the weight of the i-th closest data point.
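The interplay of k and weights can be sketched as below. The helper dist_train_sketch is hypothetical, and aggregating the k nearest distances as a weighted average is an assumption, since the text above does not state the exact aggregation.

```r
# dists: distances from one counterfactual to all training points.
dist_train_sketch <- function(dists, k = 1L, weights = NULL) {
  nearest <- sort(dists)[seq_len(k)]       # k smallest, closest first
  if (is.null(weights)) weights <- rep(1, k)
  stopifnot(length(weights) == k)
  # Assumption: weighted average of the k nearest distances.
  sum(weights * nearest) / sum(weights)
}

dists <- c(0.4, 0.1, 0.3)
d_equal <- dist_train_sketch(dists, k = 2L)                       # mean of 0.1, 0.3
d_weighted <- dist_train_sketch(dists, k = 2L, weights = c(3, 1)) # closest counts 3x
```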


Method evaluate_set()

Evaluates a set of counterfactuals. It returns the evaluation measures.

Usage

Counterfactuals$evaluate_set(
  measures = c("diversity", "no_nondom", "frac_nondom", "hypervolume"),
  nadir = NULL
)

Arguments

measures

(character)
The name of one or more evaluation measures. The following measures are available:

  • diversity: Diversity of the returned counterfactuals in the feature space.

  • no_nondom: Number of counterfactuals that are not dominated by other counterfactuals.

  • frac_nondom: Fraction of counterfactuals that are not dominated by other counterfactuals.

  • hypervolume: Hypervolume of the induced Pareto front.
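The notion of dominance behind no_nondom and frac_nondom can be sketched in base R. The objective matrix is made up, and the helper is_dominated is hypothetical. It assumes that all four objectives are minimised and uses the usual Pareto definition: one counterfactual dominates another if it is no worse in every objective and strictly better in at least one.

```r
# One row per counterfactual, one column per objective (smaller is better).
obj <- rbind(
  c(0.0, 0.2, 1, 0.1),
  c(0.0, 0.3, 2, 0.2),
  c(0.1, 0.1, 1, 0.3)
)

# TRUE if row i is dominated by at least one other row.
is_dominated <- function(i, obj) {
  others <- obj[-i, , drop = FALSE]
  any(apply(others, 1, function(o) all(o <= obj[i, ]) && any(o < obj[i, ])))
}

nondom <- !vapply(seq_len(nrow(obj)), is_dominated, logical(1), obj = obj)
no_nondom <- sum(nondom)
frac_nondom <- mean(nondom)
```

Here the second row is dominated by the first, while the first and third rows each win on at least one objective, so two of the three counterfactuals are non-dominated.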

nadir

(numeric)
Maximum objective values used to calculate the dominated hypervolume. Only considered if hypervolume is one of the measures. May be a scalar, in which case it is used for all four objectives, or a vector of length 4. Default is NULL, meaning that the nadir point of Dandl et al. (2020) is used: (minimum distance between the prediction for x_interest and desired_prob/desired_outcome, 1, number of features, 1).


Method predict()

Returns the predictions for the counterfactuals.

Usage

Counterfactuals$predict()


Method subset_to_valid()

Subsets data to the counterfactuals that meet the desired prediction. The subsetting can be reverted with revert_subset_to_valid().

Usage

Counterfactuals$subset_to_valid()


Method revert_subset_to_valid()

Reverts the subsetting performed by subset_to_valid(), so that data again contains all counterfactuals.

Usage

Counterfactuals$revert_subset_to_valid()
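Conceptually, subset_to_valid() and revert_subset_to_valid() amount to filtering on the predictions and later restoring the full set. The base R sketch below uses made-up counterfactuals and predictions, and it treats the desired interval as closed, which is an assumption.

```r
all_cfs <- data.frame(age = c(35, 42, 25))
pred <- c(0.62, 0.48, 0.91)   # made-up predicted probabilities
desired_prob <- c(0.5, 1)

# Subset: keep only counterfactuals whose prediction lies in the interval.
is_valid <- pred >= desired_prob[1] & pred <= desired_prob[2]
valid_cfs <- all_cfs[is_valid, , drop = FALSE]

# Revert: go back to the full set that was kept alongside the subset.
restored_cfs <- all_cfs
```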


Method plot_parallel()

Plots a parallel plot that connects the (scaled) feature values of each counterfactual and highlights x_interest in blue.

Usage

Counterfactuals$plot_parallel(
  feature_names = NULL,
  row_ids = NULL,
  digits_min_max = 2L
)

Arguments

feature_names

(character | NULL)
The names of the (numeric) features to display. If NULL (default) all features are displayed.

row_ids

(integerish | NULL)
The row ids of the counterfactuals to display. If NULL (default) all counterfactuals are displayed.

digits_min_max

(integerish(1))
Maximum number of digits for the minimum and maximum feature values. Default is 2L.


Method plot_freq_of_feature_changes()

Plots a bar chart with the frequency of feature changes across all counterfactuals.

Usage

Counterfactuals$plot_freq_of_feature_changes(subset_zero = FALSE)

Arguments

subset_zero

(logical(1))
Should unchanged features be excluded from the plot? Default is FALSE.


Method get_freq_of_feature_changes()

Returns the frequency of feature changes across all counterfactuals.

Usage

Counterfactuals$get_freq_of_feature_changes(subset_zero = FALSE)

Arguments

subset_zero

(logical(1))
Should unchanged features be excluded? Default is FALSE.

Returns

A (named) numeric vector with the frequency of feature changes.
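A rough base R equivalent is sketched below. The helper freq_of_changes and the data are hypothetical, and two details are assumptions: that "frequency" means the share of counterfactuals in which a feature differs from x_interest, and that the result is sorted in decreasing order.

```r
x_interest <- data.frame(age = 30, income = 50, job = "a",
                         stringsAsFactors = FALSE)
cfs <- data.frame(
  age = c(30, 42, 25, 30),
  income = c(50, 50, 50, 50),
  job = c("b", "a", "b", "b"),
  stringsAsFactors = FALSE
)

freq_of_changes <- function(cfs, x_interest, subset_zero = FALSE) {
  # Share of counterfactuals in which each feature differs from x_interest.
  freq <- vapply(
    names(cfs), function(f) mean(cfs[[f]] != x_interest[[f]]), numeric(1)
  )
  freq <- sort(freq, decreasing = TRUE)
  if (subset_zero) freq <- freq[freq > 0]  # drop unchanged features
  freq
}

freq <- freq_of_changes(cfs, x_interest)
freq_changed_only <- freq_of_changes(cfs, x_interest, subset_zero = TRUE)
```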


Method plot_surface()

Creates a surface plot for two features. x_interest is represented as a white dot and all counterfactuals that differ from x_interest only in the two selected features are represented as black dots. The tick marks next to the axes show the marginal distribution of the observed data (predictor$data$X).
The exact plot type depends on the selected feature types and number of features:

  • 2 numeric features: surface plot

  • 2 non-numeric features: heatmap

  • 1 numeric or non-numeric feature: line graph

Usage

Counterfactuals$plot_surface(feature_names, grid_size = 250L)

Arguments

feature_names

(character(2))
The names of the features to plot.

grid_size

(integerish(1))
The grid size of the plot. It is ignored in case of two non-numeric features. Default is 250L.


Method print()

Prints the Counterfactuals object.

Usage

Counterfactuals$print()


Method clone()

The objects of this class are cloneable with this method.

Usage

Counterfactuals$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

References

Gower, J. C. (1971), "A general coefficient of similarity and some of its properties". Biometrics, 27(4), 857–871.

Dandl, S., Molnar, C., Binder, M., and Bischl, B. (2020), "Multi-Objective Counterfactual Explanations". In: Parallel Problem Solving from Nature, PPSN XVI. Springer.