alibi.explainers.ale module

class alibi.explainers.ale.ALE(predictor, feature_names=None, target_names=None, check_feature_resolution=True, low_resolution_threshold=10, extrapolate_constant=True, extrapolate_constant_perc=10.0, extrapolate_constant_min=0.1)

Bases: Explainer

__init__(predictor, feature_names=None, target_names=None, check_feature_resolution=True, low_resolution_threshold=10, extrapolate_constant=True, extrapolate_constant_perc=10.0, extrapolate_constant_min=0.1)

Accumulated Local Effects for tabular datasets. Current implementation supports first order feature effects of numerical features.

Parameters:
  • predictor (Callable[[ndarray], ndarray]) – A callable that takes in an N x F array as input and outputs an N x T array (N - number of data points, F - number of features, T - number of outputs/targets (e.g. 1 for single output regression, >=2 for classification)).

  • feature_names (Optional[List[str]]) – A list of feature names used for displaying results.

  • target_names (Optional[List[str]]) – A list of target/output names used for displaying results.

  • check_feature_resolution (bool) – If True, the number of unique values is calculated for each feature and if it is less than low_resolution_threshold then the feature values are used for grid-points instead of quantiles. This may increase the runtime of the algorithm for large datasets. Only used for features without custom grid-points specified in alibi.explainers.ale.ALE.explain().

  • low_resolution_threshold (int) – If a feature has at most this many unique values, these are used as the grid points instead of quantiles. This is to avoid situations when the quantile algorithm returns quantiles between discrete values which can result in jumps in the ALE plot obscuring the true effect. Only used if check_feature_resolution is True and for features without custom grid-points specified in alibi.explainers.ale.ALE.explain().

  • extrapolate_constant (bool) – If a feature is constant, only one quantile exists where all the data points lie. In this case the ALE value at that point is zero; however, this may be misleading if the feature does have an effect on the model. If this parameter is set to True, the ALE values are calculated on an interval surrounding the constant value. The interval length is controlled by the extrapolate_constant_perc and extrapolate_constant_min arguments.

  • extrapolate_constant_perc (float) – Percentage by which to extrapolate a constant feature value to create an interval for ALE calculation. If q is the constant feature value, creates an interval [q - q/extrapolate_constant_perc, q + q/extrapolate_constant_perc] for which ALE is calculated. Only relevant if extrapolate_constant is set to True.

  • extrapolate_constant_min (float) – Controls the minimum extrapolation length for constant features. An interval constructed for a constant feature is guaranteed to be at least 2 x extrapolate_constant_min wide and centered on the feature value. This captures model behaviour around constant features whose value is so small that extrapolate_constant_perc alone would yield a negligible interval. Only relevant if extrapolate_constant is set to True.
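As a rough illustration of how extrapolate_constant_perc and extrapolate_constant_min interact, the sketch below takes the interval formula stated above at face value and applies the minimum half-width on top of it. The helper name constant_feature_interval is hypothetical and is not part of alibi; the library's internal arithmetic may differ in detail.

```python
def constant_feature_interval(q, extrapolate_constant_perc=10.0,
                              extrapolate_constant_min=0.1):
    """Return (low, high) around a constant feature value q.

    Hypothetical helper, written from the parameter descriptions only.
    """
    # Half-width from the documented formula q / extrapolate_constant_perc ...
    half_width = q / extrapolate_constant_perc
    # ... but never narrower than extrapolate_constant_min on each side.
    half_width = max(half_width, extrapolate_constant_min)
    return q - half_width, q + half_width


# A large constant value: the percentage-based width dominates.
print(constant_feature_interval(50.0))
# A small constant value: the minimum width takes over.
print(constant_feature_interval(0.2))
```

For q = 50.0 the percentage term gives a half-width of 5, while for q = 0.2 it would give only 0.02, so the minimum of 0.1 is used instead.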

explain(X, features=None, min_bin_points=4, grid_points=None)

Calculate the ALE curves for each feature with respect to the dataset X.

Parameters:
  • X (ndarray) – An N x F tabular dataset used to calculate the ALE curves. This is typically the training dataset or a representative sample.

  • features (Optional[List[int]]) – Features for which to calculate ALE.

  • min_bin_points (int) – Minimum number of points each discretized interval should contain to ensure more precise ALE estimation. Only relevant for adaptive grid points (i.e., features without an entry in the grid_points dictionary).

  • grid_points (Optional[Dict[int, ndarray]]) – Custom grid points. Must be a dict where the keys are feature indices and the values are monotonically increasing numpy arrays defining the grid points for each feature. See the Notes section for the default behavior when edge cases arise with custom grid points. If no grid points are specified for a feature (i.e. the feature is missing from the grid_points dictionary), decile discretization is used instead.

Return type:

Explanation

Returns:

explanation – An Explanation object containing the data and the metadata of the calculated ALE curves. See usage at ALE examples for details.

Notes

Consider f to be a feature of interest. We denote possible feature values of f by X (i.e. the values from the dataset column corresponding to feature f), by O a user-specified grid-point value, and by (X|O) an overlap between a grid-point and a feature value. We can encounter the following edge-cases:

  • Grid points outside the feature range. Consider the following example: O O O X X O X O X O O, where 3 grid-points are smaller than the minimum value in f, and 2 grid-points are larger than the maximum value in f. The empty leading and ending bins are removed. The grid-points considered will be: O X X O X O X O.

  • Grid points that do not cover the entire feature range. Consider the following example: X X O X X O X O X X X X X. Two auxiliary grid-points are added, corresponding to the minimum and maximum values of feature f. The grid-points considered will be: (O|X) X O X X O X O X X X X (X|O).

  • Grid points that do not contain any values in between. Consider the following example: (O|X) X X O O O X O X O O (X|O). The intervals which do not contain any feature values are removed/merged. The grid-points considered will be: (O|X) X X O X O X O (X|O).
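The three edge-cases can be reproduced with a small standard-library sketch. The function tidy_grid is hypothetical and only mirrors the behaviour described in these notes; in particular, which of several adjacent empty-bin edges survives a merge is an assumption here and may differ from what the library does.

```python
from bisect import bisect_left, bisect_right


def tidy_grid(values, grid):
    """Clean user grid points against feature values (illustrative only)."""
    xs = sorted(values)
    gs = sorted(set(grid))
    # Grid does not cover the feature range: add auxiliary points at min/max.
    if gs[0] > xs[0]:
        gs.insert(0, xs[0])
    if gs[-1] < xs[-1]:
        gs.append(xs[-1])
    # Drop empty leading bins: start from the last grid point <= the minimum.
    start = bisect_right(gs, xs[0]) - 1
    kept = [gs[start]]
    for g in gs[start + 1:]:
        # Count feature values in (kept[-1], g]; the first bin is closed on
        # the left as well, so it also counts values equal to kept[0].
        if len(kept) == 1:
            left = bisect_left(xs, kept[-1])
        else:
            left = bisect_right(xs, kept[-1])
        right = bisect_right(xs, g)
        # Keep g only if it closes a non-empty bin; otherwise the empty
        # interval is merged into the next one (or dropped at the end).
        if right - left > 0:
            kept.append(g)
    return kept


# Case 1: O O O X X O X O X O O
print(tidy_grid([3, 4, 6, 8], [0, 1, 2, 5, 7, 9, 10]))
# Case 2: X X O X X O X O X X X X X
print(tidy_grid([1, 2, 4, 5, 7, 9, 10, 11, 12, 13], [3, 6, 8]))
# Case 3: (O|X) X X O O O X O X O O (X|O)
print(tidy_grid([0, 1, 2, 6, 8, 12], [0, 3, 4, 5, 7, 9, 10, 12]))
```

Each call returns a grid with the same O/X pattern as the corresponding bullet above.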

reset_predictor(predictor)

Resets the predictor function.

Parameters:

predictor (Callable) – New predictor function.

Return type:

None

alibi.explainers.ale.adaptive_grid(values, min_bin_points=1)

Find the optimal number of quantiles for the range of values so that each resulting bin contains at least min_bin_points. Uses bisection.

Parameters:
  • values (ndarray) – Array of feature values.

  • min_bin_points (int) – Minimum number of points each discretized interval should contain to ensure more precise ALE estimation.

Return type:

Tuple[ndarray, int]

Returns:

  • q – Unique quantiles.

  • num_quantiles – Number of non-unique quantiles the feature array was subdivided into.

Notes

This is a heuristic procedure since the bisection algorithm is applied to a function which is not monotonic. This will not necessarily find the maximum number of bins the interval can be subdivided into to satisfy the minimum number of points in each resulting bin.
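The following standard-library sketch shows one way the search could work: a bisection over the number of quantiles, using a bin-count check as the predicate. All names ending in _sketch are hypothetical, plain lists stand in for ndarray, and the lower search bound of 2 is an assumption made to keep the quantile helper well defined. It is not the library's implementation.

```python
from bisect import bisect_left


def quantiles_sketch(values, n):
    """n evenly spaced quantiles (0% to 100%), linear interpolation. Needs n >= 2."""
    xs = sorted(values)
    out = []
    for k in range(n):
        pos = k * (len(xs) - 1) / (n - 1)  # fractional index into xs
        i = int(pos)
        nxt = xs[min(i + 1, len(xs) - 1)]
        out.append(xs[i] + (pos - i) * (nxt - xs[i]))
    return out


def minimum_satisfied_sketch(values, min_bin_points, n):
    """1 if every bin induced by n quantiles holds >= min_bin_points values."""
    q = sorted(set(quantiles_sketch(values, n)))
    counts = [0] * (len(q) - 1)
    for v in values:
        # Bin j covers (q[j-1], q[j]]; the first bin also includes q[0].
        j = min(max(bisect_left(q, v), 1), len(q) - 1)
        counts[j - 1] += 1
    return int(all(c >= min_bin_points for c in counts))


def adaptive_grid_sketch(values, min_bin_points=1):
    """Bisection for the first failing quantile count; return the one before it."""
    lo, hi = 2, len(values)
    while lo < hi:
        mid = (lo + hi) // 2
        if minimum_satisfied_sketch(values, min_bin_points, mid):
            lo = mid + 1  # constraint still holds: try more quantiles
        else:
            hi = mid      # constraint violated: try fewer quantiles
    num_quantiles = max(lo - 1, 2)
    q = sorted(set(quantiles_sketch(values, num_quantiles)))
    return q, num_quantiles


q, num_quantiles = adaptive_grid_sketch(list(range(1, 21)), min_bin_points=4)
print(num_quantiles, q)
```

Because the predicate is not monotonic in the number of quantiles, the bisection can settle on a valid but non-maximal subdivision, which is the caveat raised in the note above.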

alibi.explainers.ale.ale_num(predictor, X, feature, feature_grid_points=None, min_bin_points=4, check_feature_resolution=True, low_resolution_threshold=10, extrapolate_constant=True, extrapolate_constant_perc=10.0, extrapolate_constant_min=0.1)

Calculate the first order ALE curve for a numerical feature.

Parameters:
  • predictor (Callable) – Model prediction function.

  • X (ndarray) – Dataset for which ALE curves are computed.

  • feature (int) – Index of the numerical feature for which to calculate ALE.

  • feature_grid_points (Optional[ndarray]) – Custom grid points. A numpy array defining the grid points for the given feature.

  • min_bin_points (int) – Minimum number of points each discretized interval should contain to ensure more precise ALE estimation. Only relevant for adaptive grid points (i.e., feature for which feature_grid_points=None).

  • check_feature_resolution (bool) – Refer to ALE documentation.

  • low_resolution_threshold (int) – Refer to ALE documentation.

  • extrapolate_constant (bool) – Refer to ALE documentation.

  • extrapolate_constant_perc (float) – Refer to ALE documentation.

  • extrapolate_constant_min (float) – Refer to ALE documentation.

Return type:

Tuple[ndarray, ...]

Returns:

  • fvals – Array of quantiles or custom grid-points of the input values.

  • ale – ALE values for each feature at each of the points in fvals.

  • ale0 – The constant offset used to center the ALE curves.
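To make the three return values concrete, here is a minimal first-order ALE computation on a fixed grid, written with plain lists and a single-output predictor. The function ale_1d_sketch is hypothetical, and the centering rule (a count-weighted mean of bin mid-point values) is an assumption of this sketch rather than a statement about the library's internals.

```python
from bisect import bisect_left


def ale_1d_sketch(predictor, X, feature, grid):
    """First-order ALE of one numerical feature on a sorted grid (illustrative)."""
    n_bins = len(grid) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for row in X:
        # Bin k covers (grid[k], grid[k+1]]; the first bin also includes grid[0].
        k = min(max(bisect_left(grid, row[feature]), 1), n_bins) - 1
        lo_row = list(row)
        hi_row = list(row)
        lo_row[feature] = grid[k]      # move the point to the bin's lower edge
        hi_row[feature] = grid[k + 1]  # and to the bin's upper edge
        f_lo, f_hi = predictor([lo_row, hi_row])
        sums[k] += f_hi - f_lo         # local effect of crossing the bin
        counts[k] += 1
    # Average the local effects per bin, then accumulate them along the grid.
    ale = [0.0]
    for s, c in zip(sums, counts):
        ale.append(ale[-1] + (s / c if c else 0.0))
    # Assumed centering: count-weighted mean of the bin mid-point ALE values.
    ale0 = sum(0.5 * (ale[k] + ale[k + 1]) * counts[k]
               for k in range(n_bins)) / sum(counts)
    return grid, [a - ale0 for a in ale], ale0


def linear_model(rows):
    # Toy predictor: f(x) = 2 * x0 + 3 * x1, one output per row.
    return [2 * r[0] + 3 * r[1] for r in rows]


X = [[0, 5], [1, 1], [2, 7], [3, 2], [4, 0]]
fvals, ale, ale0 = ale_1d_sketch(linear_model, X, feature=0, grid=[0, 2, 4])
print(fvals, ale, ale0)
```

For this linear toy model the centered curve rises by 8 over a grid span of 4, recovering the coefficient 2 of the first feature regardless of the values of the second.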

alibi.explainers.ale.bisect_fun(fun, target, lo, hi)

Bisection algorithm for function evaluation with integer support.

Assumes the function is non-decreasing on the interval [lo, hi]. Return an integer value v such that for all x<v, fun(x)<target and for all x>=v, fun(x)>=target. This is equivalent to the library function bisect.bisect_left but for functions defined on integers.

Parameters:
  • fun (Callable) – A function defined on integers in the range [lo, hi] and returning floats.

  • target (float) – Target value to be searched for.

  • lo (int) – Lower bound of the domain.

  • hi (int) – Upper bound of the domain.

Return type:

int

Returns:

Integer index.
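The contract described above can be sketched in a few lines. This is a hypothetical re-implementation written from the description (the name bisect_fun_sketch is not part of alibi), shown next to bisect.bisect_left on a precomputed list for comparison.

```python
from bisect import bisect_left


def bisect_fun_sketch(fun, target, lo, hi):
    """Smallest integer v in [lo, hi] with fun(v) >= target, for non-decreasing fun."""
    while lo < hi:
        mid = (lo + hi) // 2
        if fun(mid) < target:
            lo = mid + 1  # everything up to mid is below the target
        else:
            hi = mid      # mid may be the answer; keep it in range
    return lo


def square(x):
    return x * x


# Smallest integer whose square reaches 50.
v = bisect_fun_sketch(square, target=50, lo=0, hi=100)
# Same search on an explicit list of function values.
w = bisect_left([square(x) for x in range(101)], 50)
print(v, w)
```

Both searches return the same index; the function-based version simply avoids materialising every function value up front.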

alibi.explainers.ale.get_quantiles(values, num_quantiles=11, interpolation='linear')

Calculate quantiles of values in an array.

Parameters:
  • values (ndarray) – Array of values.

  • num_quantiles (int) – Number of quantiles to calculate.

Return type:

ndarray

Returns:

Array of quantiles of the input values.

alibi.explainers.ale.minimum_satisfied(values, min_bin_points, n)

Calculates whether the partition into bins induced by n quantiles has the minimum number of points in each resulting bin.

Parameters:
  • values (ndarray) – Array of feature values.

  • min_bin_points (int) – Minimum number of points each discretized interval needs to contain.

  • n (int) – Number of quantiles.

Return type:

int

Returns:

Integer-encoded boolean: 1 if each bin contains at least min_bin_points points, 0 otherwise.

alibi.explainers.ale.plot_ale(exp, features='all', targets='all', n_cols=3, sharey='all', constant=False, ax=None, line_kw=None, fig_kw=None)

Plot ALE curves on matplotlib axes.

Parameters:
  • exp – An Explanation object produced by a call to the alibi.explainers.ale.ALE.explain() method.

  • features – A list of features for which to plot the ALE curves or 'all' for all features. Can be a mix of integers denoting feature index or strings denoting entries in exp.feature_names. Defaults to 'all'.

  • targets – A list of targets for which to plot the ALE curves or 'all' for all targets. Can be a mix of integers denoting target index or strings denoting entries in exp.target_names. Defaults to 'all'.

  • n_cols – Number of columns to organize the resulting plot into.

  • sharey – A parameter specifying whether the y-axis of the ALE curves should be on the same scale for several features. Possible values are: 'all' | 'row' | None.

  • constant – A parameter specifying whether the constant zeroth order effects should be added to the ALE first order effects.

  • ax – A matplotlib axes object or a numpy array of matplotlib axes to plot on.

  • line_kw – Keyword arguments passed to the plt.plot function.

  • fig_kw – Keyword arguments passed to the fig.set function.

Returns:

An array of matplotlib axes with the resulting ALE plots.