alibi.explainers.counterfactual module

class alibi.explainers.counterfactual.CounterFactual(predict_fn, shape, distance_fn='l1', target_proba=1.0, target_class='other', max_iter=1000, early_stop=50, lam_init=0.1, max_lam_steps=10, tol=0.05, learning_rate_init=0.1, feature_range=(-10000000000.0, 10000000000.0), eps=0.01, init='identity', decay=True, write_dir=None, debug=False, sess=None)

Bases: object

__init__(predict_fn, shape, distance_fn='l1', target_proba=1.0, target_class='other', max_iter=1000, early_stop=50, lam_init=0.1, max_lam_steps=10, tol=0.05, learning_rate_init=0.1, feature_range=(-10000000000.0, 10000000000.0), eps=0.01, init='identity', decay=True, write_dir=None, debug=False, sess=None)

Initialize the counterfactual explanation method based on Wachter et al. (2017).

Parameters
  • predict_fn (Union[Callable, tensorflow.keras.Model, keras.Model]) – Keras or TensorFlow model, or any other model’s prediction function returning class probabilities

  • shape (Tuple[int, …]) – Shape of input data starting with batch size

  • distance_fn (str) – Distance function to use in the loss term

  • target_proba (float) – Target probability for the counterfactual to reach

  • target_class (Union[str, int]) – Target class for the counterfactual to reach, one of ‘other’, ‘same’ or an integer denoting desired class membership for the counterfactual instance

  • max_iter (int) – Maximum number of iterations to run the gradient descent for (inner loop)

  • early_stop (int) – Number of steps after which to terminate gradient descent if all or none of the instances found are solutions

  • lam_init (float) – Initial regularization constant for the prediction part of the Wachter loss

  • max_lam_steps (int) – Maximum number of times to adjust the regularization constant (outer loop) before terminating the search

  • tol (float) – Tolerance for the counterfactual target probability

  • learning_rate_init (float) – Initial learning rate for each outer loop of lambda

  • feature_range (Union[Tuple, str]) – Tuple with min and max ranges to allow for perturbed instances. Min and max ranges can be floats or numpy arrays with dimension (1 x nb of features) for feature-wise ranges

  • eps (Union[float, numpy.ndarray]) – Gradient step sizes used in calculating numerical gradients, defaults to a single value for all features, but can be passed an array for feature-wise step sizes

  • init (str) – Initialization method for the search of counterfactuals, currently must be ‘identity’

  • decay (bool) – Flag to decay learning rate to zero for each outer loop over lambda

  • write_dir (str) – Directory to write TensorBoard files to

  • debug (bool) – Flag to write TensorBoard summaries for debugging

  • sess (tensorflow.compat.v1.Session) – Optional TensorFlow session that will be used if passed instead of creating or inferring one internally

Return type

None
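
The parameters above describe a loss with a prediction term weighted by a regularization constant and a distance term. The following is a minimal sketch of such a Wachter-style loss in plain Python, not the library's implementation. The helper names `target_probability` and `wachter_loss`, the probability values, and the rule used for target_class='other' (take the most probable class other than the original one) are assumptions made for illustration.

```python
def target_probability(probas, orig_class, target_class):
    """Pick the probability that the prediction term compares to target_proba."""
    if target_class == "same":
        return probas[orig_class]
    if target_class == "other":
        # Assumption: use the most probable class other than the original one.
        return max(p for c, p in enumerate(probas) if c != orig_class)
    return probas[target_class]  # explicit integer class index


def wachter_loss(x_cf, x_orig, probas_cf, orig_class,
                 target_class="other", target_proba=1.0, lam=0.1):
    """lam * (p - target_proba)^2 + L1 distance between x_cf and x_orig."""
    p = target_probability(probas_cf, orig_class, target_class)
    l1 = sum(abs(a - b) for a, b in zip(x_cf, x_orig))
    return lam * (p - target_proba) ** 2 + l1


# Hypothetical three-class probabilities for a candidate counterfactual.
probas = [0.2, 0.5, 0.3]
print(wachter_loss([1.0, 2.5], [1.0, 2.0], probas, orig_class=0))
```

A larger `lam` puts more weight on reaching the target probability, while a smaller one favours staying close to the original instance.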

explain(X)

Explain an instance and return the counterfactual with metadata.

Parameters

X (numpy.ndarray) – Instance to be explained

Return type

dict

Returns

explanation – a dictionary containing the counterfactual with additional metadata.
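
How the constructor settings might interact during a search can be sketched with a toy, dependency-free loop. This is a simplified illustration under stated assumptions, not what `explain` does internally: the two-class logistic model, the names `predict_proba`, `loss` and `search`, the plain gradient descent update, the linear learning-rate decay, the tenfold increase of lambda after a failed outer step, and returning the first candidate within `tol` are all assumptions. The early_stop setting is not modelled.

```python
import math


def predict_proba(x):
    """Hypothetical two-class model: logistic function of the feature sum."""
    p1 = 1.0 / (1.0 + math.exp(-sum(x)))
    return [1.0 - p1, p1]


def loss(x_cf, x_orig, target_class, target_proba, lam):
    """Prediction term weighted by lam plus L1 distance to the original."""
    p = predict_proba(x_cf)[target_class]
    dist = sum(abs(a - b) for a, b in zip(x_cf, x_orig))
    return lam * (p - target_proba) ** 2 + dist


def search(x_orig, target_class, target_proba=1.0, tol=0.05, lam_init=0.1,
           max_lam_steps=10, max_iter=100, learning_rate_init=0.1,
           feature_range=(-5.0, 5.0), eps=0.01, decay=True):
    lo, hi = feature_range
    lam = lam_init
    for _ in range(max_lam_steps):            # outer loop over lambda
        x = list(x_orig)                      # 'identity' initialization
        for it in range(max_iter):            # inner gradient descent
            lr = learning_rate_init
            if decay:                         # assumed linear decay towards zero
                lr *= 1 - it / max_iter
            grad = []
            for i in range(len(x)):           # central differences with step eps
                up, down = list(x), list(x)
                up[i] += eps
                down[i] -= eps
                diff = (loss(up, x_orig, target_class, target_proba, lam)
                        - loss(down, x_orig, target_class, target_proba, lam))
                grad.append(diff / (2 * eps))
            # Gradient step, then clip each feature to feature_range.
            x = [min(max(xi - lr * g, lo), hi) for xi, g in zip(x, grad)]
            if abs(predict_proba(x)[target_class] - target_proba) <= tol:
                return {"X": x, "lam": lam}   # first candidate within tol
        lam *= 10.0                           # assumed schedule, not alibi's
    return None


cf = search([-1.0, -1.0], target_class=1, target_proba=0.9)
print(cf)
```

In this sketch, small values of lambda leave the candidate near the original instance because the distance term dominates, so the search only succeeds once lambda has grown enough for the prediction term to outweigh it.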

fit(X, y)

Fit method - currently unused as the counterfactual search is fully unsupervised.

Return type

None