This page provides a high-level overview of the algorithms and their features currently implemented in Alibi.
These algorithms provide instance-specific (sometimes also called “local”) explanations of ML model predictions. Given a single instance and a model prediction they aim to answer the question “Why did my model make this prediction?” The following table summarizes the capabilities of the current algorithms:
|Explainer||Classification||Regression||Categorical features||Tabular||Text||Images||Needs training set|
Anchor explanations: produce an “anchor”, a small subset of features and their ranges that will almost always result in the same model prediction. Documentation, tabular example, text classification, image classification.
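To make "almost always result in the same model prediction" concrete, the sketch below estimates the precision of a candidate anchor by sampling. It is a simplified illustration, not Alibi's implementation: it fixes the anchored features to the instance's exact values rather than to ranges, and it does not search for the anchor. The names `anchor_precision`, `toy_model`, and `X_background` are hypothetical.

```python
import numpy as np

def anchor_precision(predict, x, anchor_idx, X_background, n_samples=1000, seed=0):
    """Fraction of perturbed samples that keep the prediction made for x.

    Perturbed samples are random background rows whose anchored features
    (anchor_idx) are overwritten with the values from x.
    """
    rng = np.random.default_rng(seed)
    rows = rng.integers(0, len(X_background), size=n_samples)
    samples = X_background[rows].copy()
    samples[:, anchor_idx] = x[anchor_idx]
    original = predict(x[None, :])[0]
    return float(np.mean(predict(samples) == original))

def toy_model(X):
    # Hypothetical classifier: the prediction depends only on feature 0.
    return (X[:, 0] > 0.5).astype(int)

rng = np.random.default_rng(1)
X_background = rng.uniform(0, 1, size=(500, 3))
x = np.array([0.9, 0.2, 0.7])

prec_anchor = anchor_precision(toy_model, x, [0], X_background)  # anchor on feature 0
prec_other = anchor_precision(toy_model, x, [1], X_background)   # anchor on feature 1
```

In this toy setup, fixing feature 0 pins the prediction for every sample, while fixing feature 1 leaves the prediction free to change, so only the first candidate behaves like an anchor.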
Contrastive explanation method (CEM): produce a pertinent positive (PP) and a pertinent negative (PN) instance. The PP instance finds the features that should be minimally and sufficiently present to predict the same class as the original prediction (a PP acts as the “most compact” representation of the instance to keep the same prediction). The PN instance identifies the features that should be minimally and necessarily absent to maintain the original prediction (a PN acts as the closest instance that would result in a different prediction). Documentation, tabular example, image classification.
These algorithms provide instance-specific scores measuring the model's confidence in a particular prediction.
|Algorithm||Classification||Regression||Categorical features||Tabular||Text||Images||Needs training set|
Trust scores: produce a “trust score” of a classifier’s prediction. The trust score is the ratio between the distance to the nearest class different from the predicted class and the distance to the predicted class. Higher scores correspond to more trustworthy predictions. Documentation, tabular example, image classification.
|||Depending on model|
|||May require dimensionality reduction|
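The trust score ratio described above can be illustrated with a minimal NumPy sketch. It assumes Euclidean distance and measures the distance to a class as the distance to that class's single nearest training point. It omits any preprocessing of the training set, so it is not necessarily how Alibi computes the score. The function name `trust_score` and the `eps` parameter are hypothetical.

```python
import numpy as np

def trust_score(x, y_pred, X_train, y_train, eps=1e-12):
    """Distance to the nearest non-predicted class divided by the
    distance to the predicted class (nearest training point per class)."""
    classes = np.unique(y_train)
    # Distance from x to the closest training point of each class.
    dists = np.array([
        np.linalg.norm(X_train[y_train == c] - x, axis=1).min()
        for c in classes
    ])
    d_pred = dists[classes == y_pred][0]
    d_other = dists[classes != y_pred].min()
    return d_other / (d_pred + eps)  # eps guards against division by zero

X_train = np.array([[0.0, 0.0], [0.0, 1.0], [4.0, 0.0], [4.0, 1.0]])
y_train = np.array([0, 0, 1, 1])
x = np.array([1.0, 0.0])

score_0 = trust_score(x, 0, X_train, y_train)  # x lies close to class 0
score_1 = trust_score(x, 1, X_train, y_train)  # x lies far from class 1
```

Here `x` is at distance 1 from class 0 and distance 3 from class 1, so predicting class 0 gives a score of about 3, while predicting class 1 gives about 1/3.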