alibi.explainers.cfrl_base module

class alibi.explainers.cfrl_base.Callback[source]

Bases: ABC

Training callback class.

abstract __call__(step, update, model, sample, losses)[source]

Training callback applied after every training step.

  • step (int) – Current experience step.

  • update (int) – Current update step. The ratio between the number of experience steps and the number of training updates is bound to 1.

  • model (CounterfactualRL) – CounterfactualRL explainer. All the parameters defined in alibi.explainers.cfrl_base.DEFAULT_BASE_PARAMS can be accessed through model.params.

  • sample (Dict[str, ndarray]) –

    Dictionary of samples used for an update which contains

    • 'X' : np.ndarray - input instances.

    • 'Y_m' : np.ndarray - predictor outputs for the input instances.

    • 'Y_t' : np.ndarray - target outputs.

    • 'Z' : np.ndarray - input embeddings.

    • 'Z_cf_tilde' : np.ndarray - noised counterfactual embeddings.

    • 'X_cf_tilde' : np.ndarray - noised counterfactual instances obtained after decoding the noised counterfactual embeddings Z_cf_tilde and applying the post-processing functions.

    • 'C' : Optional[np.ndarray] - conditional vector.

    • 'R_tilde' : np.ndarray - reward obtained for the noised counterfactual instances.

    • 'Z_cf' : np.ndarray - counterfactual embeddings.

    • 'X_cf' : np.ndarray - counterfactual instances obtained after decoding the counterfactual embeddings Z_cf and applying the post-processing functions.

  • losses (Dict[str, float]) –

    Dictionary of losses which contains

Return type:
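
As a minimal sketch of the interface described above, the class below records the mean noised reward every few steps. It is written as a plain class so that it runs without alibi installed; in practice it would presumably subclass alibi.explainers.cfrl_base.Callback. The name RewardLogger, its log_every argument and its history attribute are hypothetical and not part of alibi.

```python
from statistics import fmean
from typing import Any, Dict, List, Tuple


class RewardLogger:
    """Hypothetical callback: records the mean noised reward every `log_every` steps."""

    def __init__(self, log_every: int = 100) -> None:
        self.log_every = log_every
        self.history: List[Tuple[int, float]] = []

    def __call__(self, step: int, update: int, model: Any,
                 sample: Dict[str, Any], losses: Dict[str, float]) -> None:
        if step % self.log_every != 0:
            return
        # 'R_tilde' holds the rewards of the noised counterfactual instances.
        mean_reward = fmean(float(r) for r in sample["R_tilde"])
        self.history.append((step, mean_reward))


# Stand-alone check with made-up values; `model` is unused here, so None is passed.
logger = RewardLogger(log_every=2)
for step in range(4):
    logger(step, step, None, {"R_tilde": [0.0, 1.0, 1.0, 0.0]}, {})
```

An instance of such a class would be supplied through the 'callbacks' parameter, which expects a list of callbacks.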


class alibi.explainers.cfrl_base.CounterfactualRL(predictor, encoder, decoder, coeff_sparsity, coeff_consistency, latent_dim=None, backend='tensorflow', seed=0, **kwargs)[source]

Bases: Explainer, FitMixin

Counterfactual Reinforcement Learning.

__init__(predictor, encoder, decoder, coeff_sparsity, coeff_consistency, latent_dim=None, backend='tensorflow', seed=0, **kwargs)[source]


  • predictor (Callable[[ndarray], ndarray]) – A callable that takes a numpy array of N data points as input and returns N outputs. For a classification task, the second dimension of the output should match the number of classes. The output can be either a soft label distribution or a hard label distribution (i.e. one-hot encoding) without affecting the performance, since argmax is applied to the predictor’s output.

  • encoder (Union[Model, Module]) – Pretrained encoder network.

  • decoder (Union[Model, Module]) – Pretrained decoder network.

  • coeff_sparsity (float) – Sparsity loss coefficient.

  • coeff_consistency (float) – Consistency loss coefficient.

  • latent_dim (Optional[int]) – Auto-encoder latent dimension. Can be omitted if the actor network is user specified.

  • backend (str) – Deep learning backend: 'tensorflow' | 'pytorch'. Default 'tensorflow'.

  • seed (int) – Seed for reproducibility. The results are not reproducible for 'tensorflow' backend.

  • **kwargs – Used to replace any default parameter from alibi.explainers.cfrl_base.DEFAULT_BASE_PARAMS.
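
To illustrate the predictor contract for classification, the sketch below wraps a function that returns integer class labels so that it returns a hard (one-hot) label distribution with one column per class. The helper make_one_hot_predictor and the toy labelling function are hypothetical; only the input and output shapes follow the description above.

```python
import numpy as np


def make_one_hot_predictor(predict_labels, num_classes):
    """Wrap a function returning integer labels so that it returns one-hot rows."""

    def predictor(X: np.ndarray) -> np.ndarray:
        labels = np.asarray(predict_labels(X), dtype=int)
        one_hot = np.zeros((labels.shape[0], num_classes), dtype=np.float32)
        # Set a single 1 per row, in the column of the predicted class.
        one_hot[np.arange(labels.shape[0]), labels] = 1.0
        return one_hot

    return predictor


def toy_labels(X: np.ndarray) -> np.ndarray:
    """Toy stand-in for a classifier: class 1 if the first feature is positive."""
    return (X[:, 0] > 0).astype(int)


predictor = make_one_hot_predictor(toy_labels, num_classes=2)
Y = predictor(np.array([[-1.0, 2.0], [3.0, 0.5]]))
```

Because argmax is applied to the predictor output, a soft distribution with the same argmax would be treated identically.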

explain(X, Y_t, C=None, batch_size=100)[source]

Explains an input instance

  • X (ndarray) – Instances to be explained.

  • Y_t (ndarray) – Counterfactual targets.

  • C (Optional[ndarray]) – Conditional vectors. If None, it means that no conditioning was used during training (i.e. the conditional_func returns None).

  • batch_size (int) – Batch size to be used when generating counterfactuals.

Return type:



explanation – Explanation object containing the counterfactual with additional metadata as attributes. See usage at CFRL examples for details.


fit(X)

Fit the model-agnostic counterfactual generator.


X (ndarray) – Training data array.

Return type:



self – The explainer itself.

classmethod load(path, predictor)[source]

Load an explainer from disk.

  • path (Union[str, PathLike]) – Path to a directory containing the saved explainer.

  • predictor (Any) – Model or prediction function used to originally initialize the explainer.

Return type:



An explainer instance.


reset_predictor(predictor)

Resets the predictor.


predictor (Any) – New predictor.

Return type:



save(path)

Save an explainer to disk. Uses the dill module.


path (Union[str, PathLike]) – Path to a directory. A new directory will be created if one does not exist.

Return type:


alibi.explainers.cfrl_base.DEFAULT_BASE_PARAMS = {'act_high': 1.0, 'act_low': -1.0, 'act_noise': 0.1, 'actor': None, 'actor_hidden_dim': 256, 'backend': 'tensorflow', 'batch_size': 100, 'callbacks': [], 'conditional_func': <function generate_empty_condition>, 'critic': None, 'critic_hidden_dim': 256, 'decoder_inv_preprocessor': <function identity_function>, 'encoder_preprocessor': <function identity_function>, 'exploration_steps': 100, 'lr_actor': 0.001, 'lr_critic': 0.001, 'num_workers': 4, 'optimizer_actor': None, 'optimizer_critic': None, 'postprocessing_funcs': [], 'replay_buffer_size': 1000, 'reward_func': <function get_classification_reward>, 'shuffle': True, 'train_steps': 100000, 'update_after': 10, 'update_every': 1}

Default Counterfactual with Reinforcement Learning parameters.

  • 'act_noise' : float - standard deviation for the normal noise added to the actor for exploration.

  • 'act_low' : float - minimum action value. Each action component takes values between [act_low, act_high].

  • 'act_high' : float - maximum action value. Each action component takes values between [act_low, act_high].

  • 'replay_buffer_size' : int - dimension of the replay buffer in batch_size units. The total memory allocated is proportional to size x batch_size.

  • 'batch_size' : int - training batch size.

  • 'num_workers' : int - number of workers used by the data loader if 'pytorch' backend is selected.

  • 'shuffle' : bool - whether to shuffle the datasets every epoch.

  • 'exploration_steps' : int - number of exploration steps. For the first exploration_steps, the counterfactual embedding coordinates are sampled uniformly at random from the interval [act_low, act_high].

  • 'update_every' : int - number of steps that should elapse between gradient updates. Regardless of the waiting steps, the ratio of waiting steps to gradient steps is locked to 1.

  • 'update_after' : int - number of steps to wait before starting to update the actor and critic. This ensures that the replay buffer is full enough for useful updates.

  • 'backend' : str - backend to be used: 'tensorflow' | 'pytorch'. Default 'tensorflow'.

  • 'train_steps' : int - number of train steps.

  • 'encoder_preprocessor' : Callable - encoder/auto-encoder data preprocessors. Transforms the input data into the format expected by the auto-encoder. By default, the identity function.

  • 'decoder_inv_preprocessor' : Callable - decoder/auto-encoder data inverse preprocessor. Transforms data from the auto-encoder output format to the original input format. Before calling the prediction function, the data is inverse preprocessed to match the original input format. By default, the identity function.

  • 'reward_func' : Callable - element-wise reward function. By default, considers classification task and checks if the counterfactual prediction label matches the target label. Note that this is element-wise, so a tensor is expected to be returned.

  • 'postprocessing_funcs' : List[Postprocessing] - list of post-processing functions. The functions are applied in order, from low to high index. Non-differentiable post-processing can be applied. Each function expects as arguments X_cf - the counterfactual instance, X - the original input instance and C - the conditional vector, and returns the post-processed counterfactual instance X_cf_pp, which is passed as X_cf to the following function. By default, no post-processing is applied (empty list).

  • 'conditional_func' : Callable - generates a conditional vector given a pre-processed input instance. By default, the function returns None which is equivalent to no conditioning.

  • 'callbacks' : List[Callback] - list of callback functions applied at the end of each training step.

  • 'actor' : Optional[Union[tensorflow.keras.Model, torch.nn.Module]] - actor network.

  • 'critic' : Optional[Union[tensorflow.keras.Model, torch.nn.Module]] - critic network.

  • 'optimizer_actor' : Optional[Union[tensorflow.keras.optimizers.Optimizer, torch.optim.Optimizer]] - actor optimizer.

  • 'optimizer_critic' : Optional[Union[tensorflow.keras.optimizers.Optimizer, torch.optim.Optimizer]] - critic optimizer.

  • 'lr_actor' : float - actor learning rate.

  • 'lr_critic' : float - critic learning rate.

  • 'actor_hidden_dim' : int - actor hidden layer dimension.

  • 'critic_hidden_dim' : int - critic hidden layer dimension.
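
The interaction between 'update_every' and 'update_after' can be sketched by counting gradient updates over a run. This is an illustration of the description above and not alibi's training loop: it assumes that updates begin once the step index exceeds update_after, and that every update_every steps a burst of update_every updates is performed, which keeps the ratio of waiting steps to gradient steps at 1. The function count_updates is hypothetical.

```python
def count_updates(train_steps: int, update_every: int, update_after: int) -> int:
    """Count gradient updates under the assumed schedule.

    Assumption: updates start once `step` exceeds `update_after`, and every
    `update_every` steps a burst of `update_every` updates is performed.
    """
    updates = 0
    for step in range(train_steps):
        if step > update_after and step % update_every == 0:
            updates += update_every
    return updates


# One update per step after the warm-up period.
per_step = count_updates(train_steps=100, update_every=1, update_after=10)
# Bursts of five updates every five steps after the warm-up period.
bursts = count_updates(train_steps=100, update_every=5, update_after=10)
```

Under these assumptions, a larger update_every changes how updates are grouped but leaves their total number roughly the same.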

class alibi.explainers.cfrl_base.NormalActionNoise(mu, sigma)[source]

Bases: object

Normal noise generator.


__call__(shape)

Generates normal noise with the appropriate mean and standard deviation.


shape (Tuple[int, ...]) – Shape of the array to be generated.

Return type:



Normal noise with the appropriate mean, standard deviation and shape.

__init__(mu, sigma)[source]


  • mu (float) – Mean of the normal noise.

  • sigma (float) – Standard deviation of the noise.
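
A noise generator with the same documented interface (constructed from mu and sigma, called with a shape) might look like the sketch below. The class GaussianNoise is a hypothetical stand-in and is not claimed to match the internals of NormalActionNoise.

```python
import numpy as np


class GaussianNoise:
    """Hypothetical stand-in mirroring the documented interface."""

    def __init__(self, mu: float, sigma: float) -> None:
        self.mu = mu
        self.sigma = sigma

    def __call__(self, shape):
        # Draw i.i.d. samples from N(mu, sigma^2) with the requested shape.
        return self.mu + self.sigma * np.random.randn(*shape)


# sigma=0.1 matches the default 'act_noise' listed in DEFAULT_BASE_PARAMS.
noise = GaussianNoise(mu=0.0, sigma=0.1)
eps = noise((100, 8))
```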

class alibi.explainers.cfrl_base.Postprocessing[source]

Bases: ABC

abstract __call__(X_cf, X, C)[source]

Post-processing function

  • X_cf (Any) – Counterfactual instance. The datatype depends on the output of the decoder. For example, for an image dataset, the output is np.ndarray. For a tabular dataset, the output is List[np.ndarray] where each element of the list corresponds to a feature. This corresponds to the decoder’s output from the heterogeneous autoencoder (see alibi.models.tensorflow.autoencoder.HeAE and alibi.models.pytorch.autoencoder.HeAE).

  • X (ndarray) – Input instance.

  • C (Optional[ndarray]) – Conditional vector. If None, it means that no conditioning was used during training (i.e. the conditional_func returns None).

Return type:



X_cf – Post-processed X_cf.
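
As a sketch of the call signature above, the class below clips an array-valued counterfactual to a fixed range, and a small helper applies a list of post-processing functions in order, passing each output on as X_cf to the next function. ClipToRange and apply_postprocessing are hypothetical names. The example assumes an np.ndarray counterfactual (the image case) and is written as a plain class so that it runs without alibi; in practice it would presumably subclass Postprocessing.

```python
import numpy as np


class ClipToRange:
    """Hypothetical post-processor: clips counterfactual values to [low, high]."""

    def __init__(self, low: float = 0.0, high: float = 1.0) -> None:
        self.low = low
        self.high = high

    def __call__(self, X_cf, X, C):
        # X and C are part of the interface but are not needed for clipping.
        return np.clip(X_cf, self.low, self.high)


def apply_postprocessing(X_cf, X, C, funcs):
    """Apply the functions from low to high index, chaining their outputs."""
    for func in funcs:
        X_cf = func(X_cf, X, C)
    return X_cf


X = np.array([[0.1, 0.5, 0.9]])
X_cf_raw = np.array([[-0.2, 0.5, 1.3]])
X_cf_pp = apply_postprocessing(X_cf_raw, X, None, [ClipToRange(0.0, 1.0)])
```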

class alibi.explainers.cfrl_base.ReplayBuffer(size=1000)[source]

Bases: object

Circular experience replay buffer for CounterfactualRL (DDPG). When the buffer is full, the oldest experience is replaced by the new one (FIFO). The experience batch size is kept constant and is inferred when the first batch of data is stored. Allowing a flexible batch size can trigger TensorFlow warnings due to tf.function retracing, which can lead to a drop in performance.

R_tilde: ndarray

Noised counterfactual rewards buffer.

X: ndarray

Inputs buffer.

Y_m: ndarray

Model’s prediction buffer.

Y_t: ndarray

Counterfactual targets buffer.

Z: ndarray

Input embedding buffer.

Z_cf_tilde: ndarray

Noised counterfactual embedding buffer.




size (int) – Dimension of the buffer in batch size units. This means that the total memory allocated is proportional to size x batch_size, where batch_size is inferred from the first array to be stored.

append(X, Y_m, Y_t, Z, Z_cf_tilde, C, R_tilde, **kwargs)[source]

Adds experience to the replay buffer. When the buffer is filled, then the oldest experience is replaced by the new one (FIFO).

  • X (ndarray) – Input array.

  • Y_m (ndarray) – Model’s prediction class of X.

  • Y_t (ndarray) – Counterfactual target class.

  • Z (ndarray) – Input’s embedding.

  • Z_cf_tilde (ndarray) – Noised counterfactual embedding.

  • C (Optional[ndarray]) – Conditional array.

  • R_tilde (ndarray) – Noised counterfactual reward array.

  • **kwargs – Other arguments. Not used.

Return type:



sample()

Sample a batch of experience from the replay buffer.

Return type:

Dict[str, Optional[ndarray]]


A batch of experience. For a description of the keys and values returned, see the parameter descriptions of the alibi.explainers.cfrl_base.ReplayBuffer.append() method. The batch size returned is the same as the one passed to alibi.explainers.cfrl_base.ReplayBuffer.append().
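
The circular, batch-sized storage described for ReplayBuffer can be sketched with a single field. The class CircularBatchBuffer below is hypothetical and much simpler than the real buffer: it stores only X, infers the batch size from the first batch, overwrites the oldest slot once full, and samples batch_size rows uniformly from the filled region. Raising an error on a mismatched batch size is a choice made for this sketch, not documented behaviour.

```python
import numpy as np


class CircularBatchBuffer:
    """Hypothetical single-field buffer holding up to `size` batches."""

    def __init__(self, size: int = 1000) -> None:
        self.size = size
        self.idx = 0            # slot that the next batch is written to
        self.len = 0            # number of batches currently stored
        self.batch_size = None  # inferred from the first stored batch
        self.X = None

    def append(self, X: np.ndarray) -> None:
        if self.X is None:
            # Allocate size x batch_size rows once, on the first call.
            self.batch_size = X.shape[0]
            self.X = np.zeros((self.size * self.batch_size, *X.shape[1:]), dtype=X.dtype)
        if X.shape[0] != self.batch_size:
            raise ValueError("batch size must stay constant")
        start = self.batch_size * self.idx
        self.X[start:start + self.batch_size] = X
        self.len = min(self.len + 1, self.size)
        self.idx = (self.idx + 1) % self.size  # wrap around (FIFO overwrite)

    def sample(self) -> np.ndarray:
        # Draw batch_size row indices uniformly from the filled part of the buffer.
        rows = np.random.randint(0, self.len * self.batch_size, size=self.batch_size)
        return self.X[rows]


buffer = CircularBatchBuffer(size=2)
for i in range(3):
    buffer.append(np.full((2, 1), float(i)))  # the third batch overwrites the first
batch = buffer.sample()
```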