alibi.prototypes package

The ‘alibi.prototypes’ module includes prototype and criticism selection methods.

class alibi.prototypes.ProtoSelect(kernel_distance, eps, lambda_penalty=None, batch_size=10000000000, preprocess_fn=None, verbose=False)[source]

Bases: Summariser, FitMixin

__init__(kernel_distance, eps, lambda_penalty=None, batch_size=10000000000, preprocess_fn=None, verbose=False)[source]

Prototype selection for dataset distillation and interpretable classification proposed by Bien and Tibshirani (2012): https://arxiv.org/abs/1202.5933

Parameters:
  • kernel_distance (Callable[[ndarray, ndarray], ndarray]) – Kernel distance to be used. Expected to support computation in batches. Given an input x of size Nx x f1 x f2 x … and an input y of size Ny x f1 x f2 x …, the kernel distance should return a kernel matrix of size Nx x Ny.

  • eps (float) – Epsilon ball size.

  • lambda_penalty (Optional[float]) – Penalty for each prototype. Encourages a lower number of prototypes to be selected. Corresponds to \(\lambda\) in the paper notation. If not specified, the default value is set to 1 / N where N is the size of the dataset to choose the prototype instances from, passed to the alibi.prototypes.protoselect.ProtoSelect.fit() method.

  • batch_size (int) – Batch size to be used for kernel matrix computation.

  • preprocess_fn (Optional[Callable[[Union[list, ndarray]], ndarray]]) – Preprocessing function used for kernel matrix computation. The preprocessing function takes the input as a list or a numpy array and transforms it into a numpy array which is then fed to the kernel_distance function. The use of preprocess_fn allows the method to be applied to any data modality.

  • verbose (bool) – Whether to display a progress bar while computing prototype points.
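The shape contract for kernel_distance and the role of preprocess_fn can be illustrated with a minimal NumPy sketch. The names euclidean_distance and to_float_array are hypothetical helpers written for this illustration; they are not part of alibi, and any callable with the same input and output shapes should be usable in their place.

```python
import numpy as np


def euclidean_distance(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Pairwise Euclidean distance: (Nx, f1, f2, ...) and (Ny, f1, f2, ...) -> (Nx, Ny)."""
    # Flatten all trailing feature dimensions so each instance is one row.
    x = x.reshape(x.shape[0], -1)
    y = y.reshape(y.shape[0], -1)
    # ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y, clipped at 0 to absorb rounding error.
    sq = (x ** 2).sum(axis=1)[:, None] + (y ** 2).sum(axis=1)[None, :] - 2.0 * x @ y.T
    return np.sqrt(np.clip(sq, 0.0, None))


def to_float_array(data) -> np.ndarray:
    """Hypothetical preprocess_fn: accepts a list or an array and returns a float array."""
    return np.asarray(data, dtype=np.float64)


# Three 2 x 2 "images" given as a plain Python list, and two candidate instances.
X = [[[0.0, 0.0], [0.0, 0.0]], [[3.0, 0.0], [0.0, 4.0]], [[1.0, 1.0], [1.0, 1.0]]]
Z = [[[0.0, 0.0], [0.0, 0.0]], [[3.0, 0.0], [0.0, 4.0]]]

K = euclidean_distance(to_float_array(X), to_float_array(Z))
print(K.shape)  # one row per instance of X, one column per instance of Z
```

Under these assumptions, the two callables would be passed as kernel_distance=euclidean_distance and preprocess_fn=to_float_array when constructing ProtoSelect.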

fit(X, y=None, Z=None)[source]

Fit the summariser. This step forms the kernel matrix in memory. If the optional dataset Z is not provided, the kernel matrix has shape NX x NX, where NX is the number of instances in X. If Z is provided, the kernel matrix has shape NZ x NX, where NZ is the number of instances in Z.

Parameters:
  • X (Union[list, ndarray]) – Dataset to be summarised.

  • y (Optional[ndarray]) – Labels of the dataset X to be summarised. The labels are expected to be represented as integers [0, 1, …, L-1], where L is the number of classes in the dataset X.

  • Z (Union[list, ndarray, None]) – Optional dataset to choose the prototypes from. If Z=None, the prototypes will be selected from the dataset X. Otherwise, if Z is provided, the dataset to be summarised is still X, but it is summarised by prototypes belonging to the dataset Z.

Return type:

ProtoSelect

Returns:

self – Reference to itself.
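Because fit() holds the full kernel matrix in memory, it can help to estimate its size before fitting. The following is a rough sketch: the helper names are hypothetical, and the 4-byte float32 element size is an assumption of this example rather than a documented property of the library.

```python
import numpy as np


def kernel_matrix_shape(n_x, n_z=None):
    """Shape described for fit(): NX x NX without Z, NZ x NX with Z."""
    return (n_x, n_x) if n_z is None else (n_z, n_x)


def kernel_matrix_bytes(n_x, n_z=None, dtype=np.float32):
    """Approximate memory footprint; the dtype is an assumption of this sketch."""
    rows, cols = kernel_matrix_shape(n_x, n_z)
    return rows * cols * np.dtype(dtype).itemsize


# Summarising 60,000 instances with prototypes drawn from X itself.
print(kernel_matrix_shape(60_000), kernel_matrix_bytes(60_000) / 1e9, "GB")

# Same X, but prototypes drawn from a separate set Z of 1,000 candidates.
print(kernel_matrix_shape(60_000, n_z=1_000), kernel_matrix_bytes(60_000, n_z=1_000) / 1e9, "GB")
```

Under the float32 assumption, a small candidate set Z reduces the footprint from about 14.4 GB to about 0.24 GB in this example.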

summarise(num_prototypes=1)[source]

Searches for the requested number of prototypes. Note that the algorithm can return fewer prototypes than requested. To increase the number of prototypes, reduce the epsilon-ball radius (eps) and the penalty for adding a prototype (lambda_penalty).

Parameters:

num_prototypes (int) – Maximum number of prototypes to be selected.

Return type:

Explanation

Returns:

An Explanation object containing the prototypes, prototype indices and prototype labels with additional metadata as attributes.
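To see why fewer prototypes than requested may come back, and why eps and lambda_penalty affect the count, consider a simplified greedy selection in the spirit of Bien and Tibshirani (2012). This is an illustrative sketch only and not alibi's implementation: the function greedy_prototypes and its gain formula (newly covered same-class points, minus other-class points inside the ball, minus the penalty) are assumptions made for this example.

```python
import numpy as np


def greedy_prototypes(K, y, eps, lambda_penalty, num_prototypes):
    """Simplified greedy selection. K has shape (NZ, NX); y holds integer labels of X."""
    n_classes = int(y.max()) + 1
    in_ball = K <= eps                                    # (NZ, NX) ball membership
    same = y[None, :] == np.arange(n_classes)[:, None]    # (L, NX) class membership
    covered = np.zeros(y.shape[0], dtype=bool)
    chosen = []                                           # (candidate index, label) pairs
    while len(chosen) < num_prototypes:
        # Gain of candidate j with label l, computed for all pairs at once.
        newly = (in_ball[:, None, :] & same[None, :, :] & ~covered).sum(axis=2)
        wrong = (in_ball[:, None, :] & ~same[None, :, :]).sum(axis=2)
        gain = newly - wrong - lambda_penalty
        j, l = np.unravel_index(np.argmax(gain), gain.shape)
        if gain[j, l] <= 0:
            break  # no remaining candidate improves the objective
        chosen.append((int(j), int(l)))
        covered |= in_ball[j] & same[l]
    return chosen


# Toy 1-D dataset: two tight clusters and one isolated point.
X = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 9.0])
y = np.array([0, 0, 0, 1, 1, 0])
K = np.abs(X[:, None] - X[None, :])  # prototypes are chosen from X itself

small_eps = greedy_prototypes(K, y, eps=0.5, lambda_penalty=1 / 6, num_prototypes=10)
high_penalty = greedy_prototypes(K, y, eps=0.5, lambda_penalty=1.0, num_prototypes=10)
large_eps = greedy_prototypes(K, y, eps=10.0, lambda_penalty=1 / 6, num_prototypes=10)
print(len(small_eps), len(high_penalty), len(large_eps))
```

In this toy setting, all three calls request ten prototypes but stop earlier. The small radius with a small penalty yields the most prototypes, while raising the penalty or enlarging the radius yields fewer.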

alibi.prototypes.visualize_image_prototypes(summary, trainset, reducer, preprocess_fn=None, knn_kw=None, ax=None, fig_kw=None, image_size=(28, 28), zoom_lb=1.0, zoom_ub=3.0)[source]

Plot the images of the prototypes at the location given by the reducer representation. The size of each prototype is proportional to the logarithm of the number of assigned training instances correctly classified according to the 1-KNN classifier (Bien and Tibshirani (2012): https://arxiv.org/abs/1202.5933).

Parameters:
  • summary (Explanation) – An Explanation object produced by a call to the alibi.prototypes.protoselect.ProtoSelect.summarise() method.

  • trainset (Tuple[ndarray, ndarray]) – Tuple, (X_train, y_train), consisting of the training data instances with the corresponding labels.

  • reducer (Callable[[ndarray], ndarray]) – 2D reducer. Reduces the input feature representation to 2D. Note that the reducer operates directly on the input instances if preprocess_fn=None. If the preprocess_fn is specified, the reducer will be called on the feature representation obtained after passing the input instances through the preprocess_fn.

  • preprocess_fn (Optional[Callable[[ndarray], ndarray]]) – Optional preprocessor function. If preprocess_fn=None, no preprocessing is applied.

  • knn_kw (Optional[dict]) – Keyword arguments passed to sklearn.neighbors.KNeighborsClassifier. The n_neighbors will be set automatically to 1, but the metric has to be specified according to the kernel distance used. If the metric is not specified, it will be set by default to 'euclidean'. See parameters description: https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html

  • ax (Optional[Axes]) – A matplotlib axes object to plot on.

  • fig_kw (Optional[dict]) – Keyword arguments passed to the fig.set function.

  • image_size (Tuple[int, int]) – Shape to which the prototype images will be resized. A zoom of 1 will display the image having the shape image_size.

  • zoom_lb (float) – Zoom lower bound. The zoom is scaled linearly within the interval [zoom_lb, zoom_ub].

  • zoom_ub (float) – Zoom upper bound. The zoom is scaled linearly within the interval [zoom_lb, zoom_ub].

Return type:

Axes
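The relationship between the per-prototype counts and the zoom bounds can be sketched as follows. This is a hypothetical reconstruction, not the library's code: the function name prototype_zoom, the use of log1p (so that a zero count stays finite), and the min-max rescaling are all assumptions of this example.

```python
import numpy as np


def prototype_zoom(counts, zoom_lb=1.0, zoom_ub=3.0):
    """Map per-prototype counts to zoom factors in [zoom_lb, zoom_ub] on a log scale."""
    # log1p is a choice of this sketch; it keeps a count of zero finite.
    log_counts = np.log1p(np.asarray(counts, dtype=np.float64))
    span = log_counts.max() - log_counts.min()
    if span == 0:
        # All prototypes have the same count: fall back to the lower bound.
        return np.full(log_counts.shape, zoom_lb)
    scaled = (log_counts - log_counts.min()) / span  # values in [0, 1]
    return zoom_lb + scaled * (zoom_ub - zoom_lb)


# Number of correctly classified training instances assigned to each prototype.
zoom = prototype_zoom([3, 30, 300])
print(zoom)
```

With this scaling, the prototype with the fewest assigned instances is drawn at zoom_lb and the one with the most at zoom_ub, so a tenfold difference in counts adds a roughly constant amount of zoom rather than multiplying the image size.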

Submodules