alibi.explainers.ale module
- class alibi.explainers.ale.ALE(predictor, feature_names=None, target_names=None, check_feature_resolution=True, low_resolution_threshold=10, extrapolate_constant=True, extrapolate_constant_perc=10.0, extrapolate_constant_min=0.1)
Bases: Explainer
- __init__(predictor, feature_names=None, target_names=None, check_feature_resolution=True, low_resolution_threshold=10, extrapolate_constant=True, extrapolate_constant_perc=10.0, extrapolate_constant_min=0.1)
Accumulated Local Effects for tabular datasets. Current implementation supports first order feature effects of numerical features.
- Parameters:
predictor (Callable[[ndarray], ndarray]) – A callable that takes an N x F array as input and outputs an N x T array (N: number of data points, F: number of features, T: number of outputs/targets, e.g. 1 for single-output regression, >=2 for classification).
feature_names (Optional[List[str]]) – A list of feature names used for displaying results.
target_names (Optional[List[str]]) – A list of target/output names used for displaying results.
check_feature_resolution (bool) – If True, the number of unique values is calculated for each feature and, if it is less than low_resolution_threshold, the feature values are used as grid-points instead of quantiles. This may increase the runtime of the algorithm for large datasets. Only used for features without custom grid-points specified in alibi.explainers.ale.ALE.explain().
low_resolution_threshold (int) – If a feature has at most this many unique values, these are used as the grid points instead of quantiles. This avoids situations where the quantile algorithm returns quantiles between discrete values, which can result in jumps in the ALE plot that obscure the true effect. Only used if check_feature_resolution is True and for features without custom grid-points specified in alibi.explainers.ale.ALE.explain().
extrapolate_constant (bool) – If a feature is constant, only one quantile exists, where all the data points lie. In this case the ALE value at that point is zero; however, this may be misleading if the feature does have an effect on the model. If this parameter is set to True, the ALE values are calculated on an interval surrounding the constant value. The interval length is controlled by the extrapolate_constant_perc and extrapolate_constant_min arguments.
extrapolate_constant_perc (float) – Percentage by which to extrapolate a constant feature value to create an interval for ALE calculation. If q is the constant feature value, creates an interval [q - q/extrapolate_constant_perc, q + q/extrapolate_constant_perc] for which ALE is calculated. Only relevant if extrapolate_constant is set to True.
extrapolate_constant_min (float) – Controls the minimum extrapolation length for constant features. An interval constructed for a constant feature is guaranteed to be at least 2 x extrapolate_constant_min wide, centered on the feature value. This allows capturing model behaviour around constant features whose values are small, for which extrapolate_constant_perc alone would give a very narrow interval. Only relevant if extrapolate_constant is set to True.
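The interplay between extrapolate_constant_perc and extrapolate_constant_min can be illustrated with a small sketch. It reads the documented interval rule literally and is not taken from the library's source: the helper name constant_feature_interval and the use of abs() (so that negative values of q still give an increasing interval) are assumptions made for illustration. With the default value of 10.0, q/extrapolate_constant_perc equals 10 percent of q.

```python
def constant_feature_interval(q, perc=10.0, min_half_width=0.1):
    """Interval around a constant feature value q (hypothetical helper).

    The half-width is q / perc, as written in the parameter description,
    but never smaller than min_half_width. abs() is an added assumption.
    """
    half_width = max(abs(q) / perc, min_half_width)
    return q - half_width, q + half_width


# Large constant value: the percentage rule decides the width.
print(constant_feature_interval(50.0))
# Small constant value: the minimum half-width takes over.
print(constant_feature_interval(0.2))
```

In the second call the percentage rule alone would give a half-width of 0.02, so the minimum of 0.1 is used instead.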
- explain(X, features=None, min_bin_points=4, grid_points=None)
Calculate the ALE curves for each feature with respect to the dataset X.
- Parameters:
X (ndarray) – An N x F tabular dataset used to calculate the ALE curves. This is typically the training dataset or a representative sample.
features (Optional[List[int]]) – Features for which to calculate ALE.
min_bin_points (int) – Minimum number of points each discretized interval should contain to ensure more precise ALE estimation. Only relevant for adaptive grid points (i.e., features without an entry in the grid_points dictionary).
grid_points (Optional[Dict[int, ndarray]]) – Custom grid points. Must be a dict where the keys are feature indices and the values are monotonically increasing numpy arrays defining the grid points for each feature. See the Notes section for the default behavior in the edge cases that can arise with custom grid-points. If no grid points are specified for a feature (i.e. the feature is missing from the grid_points dictionary), decile discretization is used instead.
- Return type:
Explanation
- Returns:
explanation – An Explanation object containing the data and the metadata of the calculated ALE curves. See usage at ALE examples for details.
Notes
Consider f to be a feature of interest. We denote possible feature values of f by X (i.e. the values from the dataset column corresponding to feature f), by O a user-specified grid-point value, and by (X|O) an overlap between a grid-point and a feature value. We can encounter the following edge-cases:
Grid points outside the feature range. Consider the following example: O O O X X O X O X O O, where 3 grid-points are smaller than the minimum value in f, and 2 grid-points are larger than the maximum value in f. The empty leading and ending bins are removed. The grid-points considered will be: O X X O X O X O.
Grid points that do not cover the entire feature range. Consider the following example: X X O X X O X O X X X X X. Two auxiliary grid-points are added, corresponding to the minimum and maximum values of feature f. The grid-points considered will be: (O|X) X O X X O X O X X X X (X|O).
Grid points that do not contain any values in between. Consider the following example: (O|X) X X O O O X O X O O (X|O). The intervals which do not contain any feature values are removed/merged. The grid-points considered will be: (O|X) X X O X O X O (X|O).
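The three edge-cases above can be reproduced with a short pure-Python sketch. It is one possible reading of the described behavior, not the library's implementation: the helper name clean_grid, the bin convention (each bin is open on the left, except the first one, which includes its left edge) and the choice of which grid-point survives a merge are assumptions made for illustration.

```python
from bisect import bisect_left


def clean_grid(values, grid):
    """Return the grid-points that remain after handling the edge-cases."""
    grid = sorted(set(grid))
    lo, hi = min(values), max(values)
    # Grid does not cover the feature range: add auxiliary min/max points.
    if grid[0] > lo:
        grid.insert(0, lo)
    if grid[-1] < hi:
        grid.append(hi)
    # Count feature values per bin; bin i covers (grid[i], grid[i + 1]].
    counts = [0] * (len(grid) - 1)
    for v in values:
        counts[max(bisect_left(grid, v), 1) - 1] += 1
    occupied = [i for i, c in enumerate(counts) if c > 0]
    # Keep the left edge of the first occupied bin and the right edge of
    # every occupied bin; empty bins are thereby dropped or merged.
    return [grid[occupied[0]]] + [grid[i + 1] for i in occupied]


# Grid points outside the feature range: O O O X X O X O X O O
outside = clean_grid([1, 2, 3, 4], [-3, -2, -1, 2.5, 3.5, 5, 6])
# Grid points not covering the range: X X O X X O X O X X X X X
partial = clean_grid([1, 2, 4, 5, 7, 9, 10, 11, 12, 13], [3, 6, 8])
# Empty intervals in between: (O|X) X X O O O X O X O O (X|O)
gaps = clean_grid([0, 1, 2, 4, 6, 9], [0, 3, 3.2, 3.5, 5, 7, 8, 9])
print(outside, partial, gaps)
```

Each call leaves the same number of grid-points as the corresponding example in the notes.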
- alibi.explainers.ale.adaptive_grid(values, min_bin_points=1)
Find the optimal number of quantiles for the range of values so that each resulting bin contains at least min_bin_points. Uses bisection.
- Parameters:
values (ndarray) – Array of feature values.
min_bin_points (int) – Minimum number of points each discretized interval should contain to ensure more precise ALE estimation.
- Return type:
Tuple[ndarray, int]
- Returns:
q – Unique quantiles.
num_quantiles – Number of non-unique quantiles the feature array was subdivided into.
Notes
This is a heuristic procedure since the bisection algorithm is applied to a function which is not monotonic. This will not necessarily find the maximum number of bins the interval can be subdivided into to satisfy the minimum number of points in each resulting bin.
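The search described above can be sketched in pure Python. This is an illustration under stated assumptions, not the library's code: the names quantile_grid, bins_ok and adaptive_grid_sketch are hypothetical, quantiles are taken at evenly spaced levels including the minimum and maximum with linear interpolation, and "at least min_bin_points" is checked with >=.

```python
from bisect import bisect_left


def quantile_grid(values, num_quantiles):
    """Evenly spaced quantiles (min and max included), deduplicated."""
    xs = sorted(values)
    grid = []
    for k in range(num_quantiles):
        pos = k * (len(xs) - 1) / (num_quantiles - 1)  # fractional index
        i = int(pos)
        j = min(i + 1, len(xs) - 1)
        grid.append(xs[i] + (pos - i) * (xs[j] - xs[i]))  # linear interpolation
    return sorted(set(grid))


def bins_ok(values, min_bin_points, num_quantiles):
    """True if every bin induced by the quantiles holds enough points."""
    grid = quantile_grid(values, num_quantiles)
    if len(grid) < 2:
        return False
    counts = [0] * (len(grid) - 1)
    for v in values:
        counts[max(bisect_left(grid, v), 1) - 1] += 1
    return min(counts) >= min_bin_points


def adaptive_grid_sketch(values, min_bin_points=1):
    """Bisect on the number of quantiles; heuristic, as the notes explain."""
    lo, hi = 2, len(values)
    while lo < hi:
        mid = (lo + hi) // 2
        if bins_ok(values, min_bin_points, mid):
            lo = mid + 1  # constraint holds, try more quantiles
        else:
            hi = mid  # constraint violated, try fewer quantiles
    n = max(lo - 1, 2)
    return quantile_grid(values, n), n


grid, n = adaptive_grid_sketch(list(range(100)), min_bin_points=10)
print(n, grid)
```

Because bins_ok is not monotonic in the number of quantiles, the bisection may stop at a value smaller than the largest feasible one, which is the caveat stated in the notes.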
- alibi.explainers.ale.ale_num(predictor, X, feature, feature_grid_points=None, min_bin_points=4, check_feature_resolution=True, low_resolution_threshold=10, extrapolate_constant=True, extrapolate_constant_perc=10.0, extrapolate_constant_min=0.1)
Calculate the first order ALE curve for a numerical feature.
- Parameters:
predictor (Callable) – Model prediction function.
X (ndarray) – Dataset for which ALE curves are computed.
feature (int) – Index of the numerical feature for which to calculate ALE.
feature_grid_points (Optional[ndarray]) – Custom grid points. A numpy array defining the grid points for the given feature.
min_bin_points (int) – Minimum number of points each discretized interval should contain to ensure more precise ALE estimation. Only relevant for adaptive grid points (i.e., a feature for which feature_grid_points=None).
check_feature_resolution (bool) – Refer to ALE documentation.
low_resolution_threshold (int) – Refer to ALE documentation.
extrapolate_constant (bool) – Refer to ALE documentation.
extrapolate_constant_perc (float) – Refer to ALE documentation.
extrapolate_constant_min (float) – Refer to ALE documentation.
- Return type:
- Returns:
fvals – Array of quantiles or custom grid-points of the input values.
ale – ALE values for each feature at each of the points in fvals.
ale0 – The constant offset used to center the ALE curves.
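How the three returned quantities relate can be illustrated with a minimal pure-Python sketch of a first order ALE curve for a single-output predictor on a fixed grid. It is not the library's implementation: the helper name ale_1d, the bin convention (each bin is open on the left, except the first) and the count-weighted centering are assumptions made for illustration.

```python
from bisect import bisect_left


def ale_1d(predict, rows, feature, grid):
    """Centered ALE values at each grid point, plus the centering offset.

    predict maps a list of rows to a list of floats (one output per row).
    Assumes every feature value lies within [grid[0], grid[-1]].
    """
    n_bins = len(grid) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for row in rows:
        # Bin k covers (grid[k], grid[k + 1]]; grid[0] itself goes to bin 0.
        k = max(bisect_left(grid, row[feature]), 1) - 1
        lower, upper = list(row), list(row)
        lower[feature], upper[feature] = grid[k], grid[k + 1]
        p_lo, p_hi = predict([lower, upper])
        sums[k] += p_hi - p_lo  # local effect of moving across the bin
        counts[k] += 1
    # Average local effect per bin, then accumulate from the left.
    effects = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    ale = [0.0]
    for e in effects:
        ale.append(ale[-1] + e)
    # Offset: count-weighted mean of the bin midpoints of the curve.
    ale0 = sum(0.5 * (ale[k] + ale[k + 1]) * counts[k]
               for k in range(n_bins)) / sum(counts)
    return [a - ale0 for a in ale], ale0


def linear_model(rows):
    # Toy predictor: 2 * x0 + 3 * x1.
    return [2.0 * r[0] + 3.0 * r[1] for r in rows]


rows = [[0.0, 5.0], [1.0, 1.0], [2.5, 4.0], [3.0, 0.0], [4.5, 2.0], [6.0, 3.0]]
fvals = [0.0, 2.0, 4.0, 6.0]
ale, ale0 = ale_1d(linear_model, rows, feature=0, grid=fvals)
print(fvals, ale, ale0)
```

For this linear toy model the curve rises by 2 per unit of the first feature, independently of the values of the second feature.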
- alibi.explainers.ale.bisect_fun(fun, target, lo, hi)
Bisection algorithm for function evaluation with integer support.
Assumes the function is non-decreasing on the interval [lo, hi]. Return an integer value v such that for all x<v, fun(x)<target and for all x>=v, fun(x)>=target. This is equivalent to the library function bisect.bisect_left but for functions defined on integers.
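The stated contract can be checked against bisect.bisect_left with a small sketch. The function first_at_least below is a hypothetical re-implementation written from the description above, not the library's source.

```python
from bisect import bisect_left


def first_at_least(fun, target, lo, hi):
    """Smallest integer v in [lo, hi] with fun(v) >= target.

    Assumes fun is non-decreasing on [lo, hi]; returns hi if no value
    below hi reaches the target.
    """
    while lo < hi:
        mid = (lo + hi) // 2
        if fun(mid) < target:
            lo = mid + 1  # everything up to mid is too small
        else:
            hi = mid  # mid is a candidate, keep searching to the left
    return lo


def square(x):
    return x * x


v = first_at_least(square, 50, 0, 100)
# The same search on a materialized table of function values.
table = [square(x) for x in range(100)]
print(v, bisect_left(table, 50))
```

The function form avoids building the table, which matters when each evaluation of fun is expensive.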
- alibi.explainers.ale.get_quantiles(values, num_quantiles=11, interpolation='linear')
Calculate quantiles of values in an array.
- Parameters:
values (ndarray) – Array of values.
num_quantiles (int) – Number of quantiles to calculate.
- Return type:
ndarray
- Returns:
Array of quantiles of the input values.
- alibi.explainers.ale.minimum_satisfied(values, min_bin_points, n)
Calculates whether the partition into bins induced by n quantiles has the minimum number of points in each resulting bin.
- alibi.explainers.ale.plot_ale(exp, features='all', targets='all', n_cols=3, sharey='all', constant=False, ax=None, line_kw=None, fig_kw=None)
Plot ALE curves on matplotlib axes.
- Parameters:
exp – An Explanation object produced by a call to the alibi.explainers.ale.ALE.explain() method.
features – A list of features for which to plot the ALE curves or 'all' for all features. Can be a mix of integers denoting feature index or strings denoting entries in exp.feature_names. Defaults to 'all'.
targets – A list of targets for which to plot the ALE curves or 'all' for all targets. Can be a mix of integers denoting target index or strings denoting entries in exp.target_names. Defaults to 'all'.
n_cols – Number of columns to organize the resulting plot into.
sharey – A parameter specifying whether the y-axis of the ALE curves should be on the same scale for several features. Possible values are: 'all' | 'row' | None.
constant – A parameter specifying whether the constant zeroth order effects should be added to the ALE first order effects.
ax – A matplotlib axes object or a numpy array of matplotlib axes to plot on.
line_kw – Keyword arguments passed to the plt.plot function.
fig_kw – Keyword arguments passed to the fig.set function.
- Returns:
An array of matplotlib axes with the resulting ALE plots.