Frequently Asked Questions
I’m getting errors when running a method on my model and my data
There can be many reasons why a method does not work. If you’re seeing exceptions, it is a good idea to check the following:
Check that your model signature (its type and expected inputs/outputs) is in the right format. Typically this means taking as input a numpy array representing a batch of data and returning a numpy array representing class labels, probabilities, or regression values (see the sketch after this list). For further details refer to White-box and black-box models.
Check the expected input type for the explain method. Note that in many cases (e.g. all the Anchor methods) the explain method expects a single instance without a leading batch dimension, e.g. for AnchorImage a colour image of shape (height, width, colour_channels).
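As a minimal sketch of both points, assuming a scikit-learn-style classifier with a predict_proba method (clf, explainer, and X_test below are hypothetical names, not part of alibi):

import numpy as np

# A prediction function in the format alibi expects: a batch of
# instances in, a batch of outputs (here class probabilities) out.
def predict_fn(x: np.ndarray) -> np.ndarray:
    return clf.predict_proba(x)  # shape (batch_size, n_classes)

# Explaining a single instance: note the absence of a leading batch
# dimension, i.e. shape (n_features,) rather than (1, n_features).
explanation = explainer.explain(X_test[0])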
My model works on different input types, e.g. pandas dataframes instead of numpy arrays, so the explainers don’t work
At the time of writing we support models that operate on numpy arrays. You can write a simple wrapper function for your model so that it adheres to the format that alibi expects, see here and the sketch below. In the future we may support more diverse input types (see #516).
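For example, here is a minimal sketch of such a wrapper, assuming a model trained on a pandas dataframe (model and the column names below are hypothetical):

import numpy as np
import pandas as pd

feature_names = ['age', 'income', 'hours_per_week']  # hypothetical columns

def predict_fn(x: np.ndarray) -> np.ndarray:
    # Rebuild the dataframe the model expects from the numpy batch
    # that alibi passes in, then delegate to the original model.
    df = pd.DataFrame(x, columns=feature_names)
    return model.predict_proba(df)

You can then pass predict_fn to the explainer in place of the model itself.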
Explanations are taking a long time to complete
Explanation times vary as a function of the model, the data, and the explanation method itself. Refer to the Introduction and Methods sections for notes on algorithm complexity. You might need to experiment with the type of model, data points (specifically feature cardinality), and method parameters to ascertain whether a specific method scales well for your use case.
The explanation returned doesn’t make sense to me
Explanations reflect the decision-making process of the model and not that of a biased human observer, see Biases. Moreover, depending on the method, the data, and the model, the explanations returned are valid but may be harder to interpret (e.g. see Anchor explanations for some examples).
Is there a way I can get more information from the library during the explanation generation process?
Yes! We use Python logging to log info and debug messages from the library. You can configure logging for your script to see these messages during the execution of the code. Additionally, some methods also expose a verbose argument to print information to standard output. Configuring Python logging for your application will depend on your needs, but for simple scripts you can easily configure logging to print to standard error as follows:
import logging
logging.basicConfig(level=logging.DEBUG)
Note: this will display all logged messages with level DEBUG and higher from all libraries in use.
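If that is too noisy, one option is to raise the root level and lower it only for alibi; this is a sketch assuming alibi follows the usual convention of module-level logger names:

import logging

logging.basicConfig(level=logging.WARNING)          # keep other libraries quiet
logging.getLogger('alibi').setLevel(logging.DEBUG)  # debug messages from alibi only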
Why is my anchor explanation empty (tabular or text data) or black (image data)?
This is expected behaviour and signals that there is no salient subset of features that is necessary for the prediction to hold. In other words, with high probability (as measured by the precision), the predicted class of the data point does not change regardless of the perturbations applied to it.
Note: this behaviour can be typical for very imbalanced datasets, see comment from the author.
Why is my anchor explanation so long (tabular or text data) or covers much of the image (image data)?
This is expected behaviour and can happen in two ways:
The data point to be explained lies near the decision boundary of the classifier. Many more predicates are then needed to ensure that the data point keeps its predicted class, since small changes to the feature values may push the prediction to another class.
For tabular data, sampling perturbations is done using a training set. If the training set is imbalanced, explaining a minority-class data point will result in oversampling perturbed features typical of the majority classes, so the algorithm will struggle to find a short anchor exceeding the specified precision level. For a concrete example, see Anchor explanations for income prediction.
I’m using the methods Counterfactual, CounterfactualProto, or CEM on a tree-based model such as decision trees, random forests, or gradient boosted models (e.g. xgboost) but not finding any counterfactual examples
These methods only work on a subset of black-box models, namely ones whose decision function is differentiable with respect to the input data and hence amenable to gradient-based counterfactual search. Since the decision function of tree-based models is piecewise constant, these methods won’t work. It is recommended to use CFRL instead.
I’m getting an error using the methods Counterfactual, CounterfactualProto, or CEM, especially if trying to use one of these methods together with IntegratedGradients or CFRL
At the moment the three counterfactual methods are implemented using TensorFlow 1.x constructs. This means that before running these methods we need to disable behaviour specific to TensorFlow 2.x as follows:
import tensorflow as tf
tf.compat.v1.disable_v2_behavior()
Unfortunately, running these lines makes it impossible to run explainers based on TensorFlow 2.x such as IntegratedGradients or CFRL, so at the moment these explainers cannot be run together in the same Python interpreter session. Ultimately the fix is to rewrite the TensorFlow 1.x implementations in idiomatic TensorFlow 2.x; some work has been done towards this but it is currently not prioritised.
Why am I unable to restrict the features allowed to change in CounterfactualProto?
This is a known issue with the current implementation, see here and here. It is currently blocked until we migrate the code to use TensorFlow 2.x constructs. In the meantime, it is recommended to use CFRL for counterfactual explanations with flexible feature-range constraints.