alibi.explainers.ale module
- class alibi.explainers.ale.ALE(predictor, feature_names=None, target_names=None, check_feature_resolution=True, low_resolution_threshold=10, extrapolate_constant=True, extrapolate_constant_perc=10.0, extrapolate_constant_min=0.1)
Bases: alibi.api.interfaces.Explainer
- __init__(predictor, feature_names=None, target_names=None, check_feature_resolution=True, low_resolution_threshold=10, extrapolate_constant=True, extrapolate_constant_perc=10.0, extrapolate_constant_min=0.1)
Accumulated Local Effects for tabular datasets. The current implementation supports first-order feature effects of numerical features.
- Parameters
predictor (Callable[[ndarray], ndarray]) – A callable that takes an NxF array as input and returns an NxT array, where N is the number of data points, F the number of features, and T the number of outputs/targets (e.g. 1 for single-output regression, >=2 for classification).
feature_names (Optional[List[str]]) – A list of feature names used for displaying results.
target_names (Optional[List[str]]) – A list of target/output names used for displaying results.
check_feature_resolution (bool) – If True, the number of unique values is calculated for each feature, and if it is less than low_resolution_threshold the feature values themselves are used as grid points instead of quantiles. This may increase the runtime of the algorithm for large datasets.
low_resolution_threshold (int) – If a feature has at most this many unique values, these are used as the grid points instead of quantiles. This avoids situations where the quantile algorithm returns quantiles between discrete values, which can produce jumps in the ALE plot that obscure the true effect. Only used if check_feature_resolution is True.
extrapolate_constant (bool) – If a feature is constant, only one quantile exists, where all the data points lie. In this case the ALE value at that point is zero, which may be misleading if the feature does have an effect on the model. If this parameter is set to True, the ALE values are calculated on an interval surrounding the constant value. The interval length is controlled by the extrapolate_constant_perc and extrapolate_constant_min arguments.
extrapolate_constant_perc (float) – Percentage by which to extrapolate a constant feature value to create an interval for ALE calculation. If q is the constant feature value, creates an interval [q - q/extrapolate_constant_perc, q + q/extrapolate_constant_perc] for which ALE is calculated. Only relevant if extrapolate_constant is set to True.
extrapolate_constant_min (float) – Controls the minimum extrapolation length for constant features. An interval constructed for a constant feature is guaranteed to be at least 2*extrapolate_constant_min wide, centered on the feature value. This captures model behaviour around constant features whose value is so small that extrapolate_constant_perc alone would give a negligible interval. Only relevant if extrapolate_constant is set to True.
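The interaction between extrapolate_constant_perc and extrapolate_constant_min can be sketched in a few lines. This is a minimal illustration that follows the parameter descriptions above literally; it is not taken from the library source. The helper name constant_feature_interval and the use of abs() for negative values are assumptions made here. Note that for the default value of 10.0, q/extrapolate_constant_perc coincides with 10% of q.

```python
def constant_feature_interval(q, perc=10.0, min_half_width=0.1):
    """Interval around a constant feature value q (hypothetical helper)."""
    # Half-width as written in the description: q / extrapolate_constant_perc.
    # abs() is an assumption so that a negative q still gives a valid interval.
    half_width = abs(q) / perc
    # Enforce the minimum half-width (the role of extrapolate_constant_min).
    half_width = max(half_width, min_half_width)
    return q - half_width, q + half_width


wide = constant_feature_interval(50.0)   # the percentage rule decides the width
narrow = constant_feature_interval(0.2)  # the minimum rule decides the width
```

For a small constant such as 0.2 the percentage rule alone would give a half-width of only 0.02, so the minimum takes over.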
- build_explanation(ale_values, ale0, constant_value, feature_values, feature_deciles, feature_names)
Helper method to build the Explanation object.
- Return type
Explanation
- explain(X, features=None, min_bin_points=4)
Calculate the ALE curves for each feature with respect to the dataset X.
- Parameters
X (ndarray) – An NxF tabular dataset used to calculate the ALE curves. This is typically the training dataset or a representative sample.
features (Optional[List[int]]) – Features for which to calculate ALE.
min_bin_points (int) – Minimum number of points each discretized interval should contain to ensure more precise ALE estimation.
- Return type
Explanation
- Returns
An Explanation object containing the data and the metadata of the calculated ALE curves.
- alibi.explainers.ale.adaptive_grid(values, min_bin_points=1)
Find the optimal number of quantiles for the range of values so that each resulting bin contains at least min_bin_points. Uses bisection.
- Parameters
values (ndarray) – Array of feature values.
min_bin_points (int) – Minimum number of points each discretized interval should contain to ensure more precise ALE estimation.
- Return type
Tuple[ndarray, int]
- Returns
q – Unique quantiles.
num_quantiles – Number of non-unique quantiles the feature array was subdivided into.
Notes
This is a heuristic procedure since the bisection algorithm is applied to a function which is not monotonic. This will not necessarily find the maximum number of bins the interval can be subdivided into to satisfy the minimum number of points in each resulting bin.
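The heuristic can be sketched with NumPy as follows. This is an illustrative reimplementation under stated assumptions, not the library's code: the names bin_counts and adaptive_grid_sketch are hypothetical, bins are taken as left-open intervals with the minimum value assigned to the first bin, and the search starts from two quantiles.

```python
import numpy as np


def bin_counts(values, n_quantiles):
    """Return the unique quantile grid and the number of points per bin."""
    grid = np.unique(np.quantile(values, np.linspace(0, 1, n_quantiles)))
    # Bin k (k >= 1) is (grid[k-1], grid[k]]; the minimum joins the first bin.
    idx = np.searchsorted(grid, values, side="left")
    idx[idx == 0] = 1
    return grid, np.bincount(idx, minlength=len(grid))[1:]


def adaptive_grid_sketch(values, min_bin_points=1):
    """Bisect on the number of quantiles until bins become too sparse."""

    def too_fine(n):
        _, counts = bin_counts(values, n)
        return not np.all(counts >= min_bin_points)

    # Look for a boundary n where the bins start to violate the minimum.
    # too_fine is not monotonic in n, so this is only a heuristic.
    lo, hi = 2, len(values)
    while lo < hi:
        mid = (lo + hi) // 2
        if too_fine(mid):
            hi = mid
        else:
            lo = mid + 1
    n = max(lo - 1, 2)  # last tested count that satisfied the minimum
    grid, _ = bin_counts(values, n)
    return grid, n


values = np.arange(100.0)
grid, n = adaptive_grid_sketch(values, min_bin_points=10)
_, counts = bin_counts(values, n)  # points per bin for the chosen grid
```

If even two quantiles (a single bin) cannot satisfy min_bin_points, this sketch still returns that single-bin grid rather than raising an error.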
- alibi.explainers.ale.ale_num(predictor, X, feature, min_bin_points=4, check_feature_resolution=True, low_resolution_threshold=10, extrapolate_constant=True, extrapolate_constant_perc=10.0, extrapolate_constant_min=0.1)
Calculate the first order ALE curve for a numerical feature.
- Parameters
predictor (Callable) – Model prediction function.
X (ndarray) – Dataset for which ALE curves are computed.
feature (int) – Index of the numerical feature for which to calculate ALE.
min_bin_points (int) – Minimum number of points each discretized interval should contain to ensure more precise ALE estimation.
check_feature_resolution (bool) – Refer to ALE documentation.
low_resolution_threshold (int) – Refer to ALE documentation.
extrapolate_constant_perc (float) – Refer to ALE documentation.
extrapolate_constant_min (float) – Refer to ALE documentation.
- Return type
Tuple[ndarray, …]
- Returns
q – Array of quantiles of the input values.
ale – ALE values for each feature at each of the points in q.
ale0 – The constant offset used to center the ALE curves.
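To make the three return values concrete, here is a minimal NumPy sketch of a first-order ALE computation on a fixed grid. It is an illustration under assumptions, not the library implementation: the function name ale_first_order_sketch is hypothetical, the grid is assumed to be sorted with at least two points spanning the feature range, predict is assumed to return an NxT array, and the centering offset is taken as the count-weighted average of the bin midpoint values.

```python
import numpy as np


def ale_first_order_sketch(predict, X, feature, grid):
    """Centred first-order ALE of one numerical feature on a given grid."""
    x = X[:, feature]
    # Assign each point to bin k, i.e. the interval (grid[k-1], grid[k]].
    k = np.clip(np.searchsorted(grid, x, side="left"), 1, len(grid) - 1)
    lower, upper = X.copy(), X.copy()
    lower[:, feature] = grid[k - 1]
    upper[:, feature] = grid[k]
    # Local effect of moving each point across its own bin, shape (N, T).
    diffs = predict(upper) - predict(lower)
    counts = np.bincount(k, minlength=len(grid))[1:]
    sums = np.zeros((len(grid) - 1, diffs.shape[1]))
    np.add.at(sums, k - 1, diffs)
    means = sums / np.maximum(counts, 1)[:, None]  # empty bins contribute 0
    # Accumulate the averaged local effects, starting from 0 at grid[0].
    acc = np.vstack([np.zeros((1, diffs.shape[1])), np.cumsum(means, axis=0)])
    # Centre using the count-weighted mean of the bin midpoint values.
    midpoints = (acc[1:] + acc[:-1]) / 2
    ale0 = (midpoints * counts[:, None]).sum(axis=0) / counts.sum()
    return acc - ale0, ale0


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))


def linear_model(Z):
    """Toy predictor with one output: 2 * x0 + 3 * x1."""
    return (2.0 * Z[:, 0] + 3.0 * Z[:, 1]).reshape(-1, 1)


grid = np.quantile(X[:, 0], np.linspace(0, 1, 6))
ale, ale0 = ale_first_order_sketch(linear_model, X, 0, grid)
```

For the linear toy model the ALE curve of feature 0 rises by twice the grid spacing between consecutive grid points, independently of feature 1.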
- alibi.explainers.ale.bisect_fun(fun, target, lo, hi)
Bisection algorithm for function evaluation with integer support.
Assumes the function is non-decreasing on the interval [lo, hi]. Return an integer value v such that for all x<v, fun(x)<target and for all x>=v fun(x)>=target. This is equivalent to the library function bisect.bisect_left but for functions defined on integers.
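The contract described above can be sketched as a standard integer bisection. The name bisect_fun_sketch is hypothetical; the comparison with bisect.bisect_left on a precomputed list is included only to illustrate the stated equivalence.

```python
import bisect


def bisect_fun_sketch(fun, target, lo, hi):
    """Smallest integer v in [lo, hi] with fun(v) >= target.

    Assumes fun is non-decreasing on the integers in [lo, hi].
    """
    while lo < hi:
        mid = (lo + hi) // 2
        if fun(mid) < target:
            lo = mid + 1  # everything up to mid is below the target
        else:
            hi = mid      # mid already reaches the target
    return lo


def square(x):
    return x * x


v = bisect_fun_sketch(square, 50, 0, 100)
# Same search on the tabulated function values.
v_list = bisect.bisect_left([square(x) for x in range(100)], 50)
```

Evaluating the function lazily avoids tabulating every value, which matters when each evaluation is expensive.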
- alibi.explainers.ale.get_quantiles(values, num_quantiles=11, interpolation='linear')
Calculate quantiles of values in an array.
- Parameters
values (ndarray) – Array of values.
num_quantiles (int) – Number of quantiles to calculate.
- Return type
ndarray
- Returns
Array of quantiles of the input values.
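A minimal sketch of evenly spaced quantiles with NumPy is shown below. It assumes linear interpolation and equally spaced probabilities from 0 to 1; the name get_quantiles_sketch is hypothetical and the library function may differ in detail. The second call illustrates why quantiles of a discrete feature can repeat and can fall between the observed values.

```python
import numpy as np


def get_quantiles_sketch(values, num_quantiles=11):
    """Quantiles at num_quantiles equally spaced probabilities in [0, 1]."""
    probs = np.linspace(0, 1, num_quantiles)
    return np.quantile(values, probs)  # linear interpolation by default


# Continuous-looking data: 11 quantiles give the deciles plus min and max.
deciles = get_quantiles_sketch(np.arange(101.0), 11)

# Discrete data: quantiles repeat, and one lands between the values 0 and 1.
discrete = get_quantiles_sketch(np.array([0.0, 0.0, 0.0, 1.0]), 5)
unique_grid = np.unique(discrete)
```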
- alibi.explainers.ale.minimum_satisfied(values, min_bin_points, n)
Calculates whether the partition into bins induced by n quantiles has the minimum number of points in each resulting bin.
- alibi.explainers.ale.plot_ale(exp, features='all', targets='all', n_cols=3, sharey='all', constant=False, ax=None, line_kw=None, fig_kw=None)
Plot ALE curves on matplotlib axes.
- Parameters
exp – An Explanation object produced by a call to the ALE.explain method.
features – A list of features for which to plot the ALE curves or all for all features. Can be a mix of integers denoting feature index or strings denoting entries in exp.feature_names. Defaults to ‘all’.
targets – A list of targets for which to plot the ALE curves or all for all targets. Can be a mix of integers denoting target index or strings denoting entries in exp.target_names. Defaults to ‘all’.
n_cols – Number of columns to organize the resulting plot into.
sharey – A parameter specifying whether the y-axis of the ALE curves should be on the same scale for several features. Possible values are all, row, None.
constant – A parameter specifying whether the constant zeroth order effects should be added to the ALE first order effects.
ax – A matplotlib axes object or a numpy array of matplotlib axes to plot on.
line_kw – Keyword arguments passed to the plt.plot function.
fig_kw – Keyword arguments passed to the fig.set function.
- Returns
An array of matplotlib axes with the resulting ALE plots.