alibi_detect.utils.pytorch package

alibi_detect.utils.pytorch.batch_compute_kernel_matrix(x, y, kernel, device=None, batch_size=10000000000, preprocess_fn=None)[source]

Compute the kernel matrix between x and y by filling in blocks of size batch_size x batch_size at a time.

Parameters
  • x (Union[list, ndarray, Tensor]) – Reference set.

  • y (Union[list, ndarray, Tensor]) – Test set.

  • kernel (Union[Module, Sequential]) – PyTorch module.

  • device (Optional[device]) – Device type used. The default None tries to use the GPU and falls back on CPU if needed. Can be specified by passing either torch.device(‘cuda’) or torch.device(‘cpu’).

  • batch_size (int) – Batch size used during prediction.

  • preprocess_fn (Optional[Callable[…, Tensor]]) – Optional preprocessing function for each batch.

Return type

Tensor

Returns

Kernel matrix in the form of a torch tensor.
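
A minimal usage sketch (the shapes, bandwidth and batch size below are illustrative):

    import torch
    from alibi_detect.utils.pytorch import GaussianRBF, batch_compute_kernel_matrix

    x = torch.randn(1000, 10)                      # reference set
    y = torch.randn(800, 10)                       # test set
    kernel = GaussianRBF(sigma=torch.tensor(1.))   # fixed bandwidth so every block uses the same sigma

    # fill the [1000, 800] kernel matrix in blocks of 256 x 256
    k_mat = batch_compute_kernel_matrix(x, y, kernel, batch_size=256)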

alibi_detect.utils.pytorch.mmd2(x, y, kernel)[source]

Compute MMD^2 between 2 samples.

Parameters
  • x (Tensor) – Batch of instances of shape [Nx, features].

  • y (Tensor) – Batch of instances of shape [Ny, features].

  • kernel (Callable) – Kernel function.

Return type

float

Returns

MMD^2 between the samples x and y.
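
A minimal usage sketch (the data and bandwidth are illustrative):

    import torch
    from alibi_detect.utils.pytorch import GaussianRBF, mmd2

    x = torch.randn(200, 5)
    y = torch.randn(150, 5) + 1.                   # shifted test sample
    kernel = GaussianRBF(sigma=torch.tensor(1.))

    mmd2_estimate = mmd2(x, y, kernel)             # scalar MMD^2 estimate between x and y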

alibi_detect.utils.pytorch.mmd2_from_kernel_matrix(kernel_mat, m, permute=False, zero_diag=True)[source]

Compute maximum mean discrepancy (MMD^2) between 2 samples x and y from the full kernel matrix between the samples.

Parameters
  • kernel_mat (Tensor) – Kernel matrix between samples x and y.

  • m (int) – Number of instances in y.

  • permute (bool) – Whether to permute the row indices. Used for permutation tests.

  • zero_diag (bool) – Whether to zero out the diagonal of the kernel matrix.

Return type

Tensor

Returns

MMD^2 between the samples from the kernel matrix.
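
A minimal usage sketch (the data and bandwidth are illustrative):

    import torch
    from alibi_detect.utils.pytorch import GaussianRBF, mmd2_from_kernel_matrix

    x = torch.randn(200, 5)
    y = torch.randn(150, 5)
    kernel = GaussianRBF(sigma=torch.tensor(1.))

    # full kernel matrix over the concatenated samples [x; y]
    xy = torch.cat([x, y], dim=0)
    kernel_mat = kernel(xy, xy)

    # m is the number of instances in y; set permute=True to shuffle rows for a permutation test
    mmd2_estimate = mmd2_from_kernel_matrix(kernel_mat, m=y.shape[0], permute=False, zero_diag=True)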

alibi_detect.utils.pytorch.squared_pairwise_distance(x, y, a_min=1e-30)[source]

PyTorch pairwise squared Euclidean distance between samples x and y.

Parameters
  • x (Tensor) – Batch of instances of shape [Nx, features].

  • y (Tensor) – Batch of instances of shape [Ny, features].

  • a_min (float) – Lower bound to clip distance values.

Return type

Tensor

Returns

Pairwise squared Euclidean distance [Nx, Ny].
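
A minimal usage sketch (shapes are illustrative):

    import torch
    from alibi_detect.utils.pytorch import squared_pairwise_distance

    x = torch.randn(100, 3)
    y = torch.randn(50, 3)
    dist = squared_pairwise_distance(x, y)   # [100, 50] matrix of squared Euclidean distances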

class alibi_detect.utils.pytorch.GaussianRBF(sigma=None, trainable=False)[source]

Bases: torch.nn.Module

__init__(sigma=None, trainable=False)[source]

Gaussian RBF kernel: k(x, y) = exp(-||x-y||^2 / (2 * sigma^2)). A forward pass takes a batch of instances x [Nx, features] and y [Ny, features] and returns the kernel matrix [Nx, Ny].

Parameters
  • sigma (Optional[Tensor]) – Bandwidth used for the kernel. Need not be specified if it is to be inferred or trained. Multiple values can be passed to evaluate the kernel with each and average the resulting kernel matrices.

  • trainable (bool) – Whether or not to track gradients w.r.t. sigma to allow it to be trained.

Return type

None

forward(x, y, infer_sigma=False)[source]
Return type

Tensor

property sigma
Return type

Tensor
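
A minimal usage sketch (shapes and bandwidth values are illustrative):

    import torch
    from alibi_detect.utils.pytorch import GaussianRBF

    x = torch.randn(100, 10)
    y = torch.randn(80, 10)

    # bandwidth inferred from the data on the first forward pass
    kernel = GaussianRBF()
    k_mat = kernel(x, y, infer_sigma=True)         # kernel matrix of shape [100, 80]

    # multiple fixed bandwidths: the kernel is evaluated with each and the results averaged
    kernel_multi = GaussianRBF(sigma=torch.tensor([.5, 1., 2.]))
    k_mat_multi = kernel_multi(x, y)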

class alibi_detect.utils.pytorch.DeepKernel(proj, kernel_a=GaussianRBF(trainable=True), kernel_b=GaussianRBF(trainable=True), eps='trainable')[source]

Bases: torch.nn.Module

Computes similarities as k(x,y) = (1-eps)*k_a(proj(x), proj(y)) + eps*k_b(x,y). A forward pass takes a batch of instances x [Nx, features] and y [Ny, features] and returns the kernel matrix [Nx, Ny].

Parameters
  • proj (Module) – The projection to be applied to the inputs before applying kernel_a.

  • kernel_a (Module) – The kernel to apply to the projected inputs. Defaults to a Gaussian RBF with trainable bandwidth.

  • kernel_b (Optional[Module]) – The kernel to apply to the raw inputs. Defaults to a Gaussian RBF with trainable bandwidth. Set to None in order to use only the deep component (i.e. eps=0).

  • eps (Union[float, str]) – The proportion (in [0,1]) of weight to assign to the kernel applied to raw inputs. This can be either specified or set to ‘trainable’. Only relevant if kernel_b is not None.

property eps
Return type

Tensor

forward(x, y)[source]
Return type

Tensor
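
A minimal usage sketch; the projection network below is a hypothetical stand-in for a learned feature extractor:

    import torch
    import torch.nn as nn
    from alibi_detect.utils.pytorch import DeepKernel

    # projection mapping 32-d inputs to a 2-d learned representation
    proj = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 2))

    # k(x,y) = 0.99 * k_a(proj(x), proj(y)) + 0.01 * k_b(x, y) with default Gaussian RBF components
    kernel = DeepKernel(proj, eps=0.01)

    x = torch.randn(64, 32)
    y = torch.randn(64, 32)
    k_mat = kernel(x, y)                           # kernel matrix of shape [64, 64]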

alibi_detect.utils.pytorch.permed_lsdds(k_all_c, x_perms, y_perms, H, H_lam_inv=None, lam_rd_max=0.2, return_unpermed=False)[source]

Compute least-squares density difference (LSDD) estimates from a kernel matrix across various reference and test window samples.

Parameters
  • k_all_c (Tensor) – Kernel matrix of similarities between all samples and the kernel centers.

  • x_perms (List[Tensor]) – List of B reference window index vectors.

  • y_perms (List[Tensor]) – List of B test window index vectors.

  • H (Tensor) – Special (scaled) kernel matrix of similarities between the kernel centers.

  • H_lam_inv (Optional[Tensor]) – Function of H corresponding to a particular regularization parameter lambda. See Eqn. 11 of Bu et al. (2017).

  • lam_rd_max (float) – The maximum relative difference between two estimates of LSDD that the regularization parameter lambda is allowed to cause. Defaults to 0.2. Only relevant if H_lam_inv is not supplied.

  • return_unpermed (bool) – Whether or not to return the LSDD estimate corresponding to the unpermuted order defined by k_all_c.

Return type

Union[Tuple[Tensor, Tensor], Tuple[Tensor, Tensor, Tensor]]

Returns

Vector of B LSDD estimates for each permutation, H_lam_inv which may have been inferred, and optionally the unpermuted LSDD estimate.
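
A heavily simplified sketch of how the inputs might be assembled; the kernel centers and the unscaled H below are illustrative (the LSDD-based drift detectors apply an additional scaling when constructing H):

    import torch
    from alibi_detect.utils.pytorch import GaussianRBF, permed_lsdds

    n_ref, n_test, d, n_centers, n_perms = 100, 100, 5, 20, 50
    x_all = torch.randn(n_ref + n_test, d)                       # combined reference + test sample
    centers = x_all[torch.randperm(n_ref + n_test)[:n_centers]]  # kernel centers

    kernel = GaussianRBF(sigma=torch.tensor(1.))
    k_all_c = kernel(x_all, centers)               # similarities between all samples and the centers
    H = kernel(centers, centers)                   # (unscaled) similarities between the centers

    # index vectors defining permuted reference and test windows
    perms = [torch.randperm(n_ref + n_test) for _ in range(n_perms)]
    x_perms = [p[:n_ref] for p in perms]
    y_perms = [p[n_ref:] for p in perms]

    lsdd_perms, H_lam_inv = permed_lsdds(k_all_c, x_perms, y_perms, H)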

alibi_detect.utils.pytorch.predict_batch(x, model, device=None, batch_size=10000000000, preprocess_fn=None, dtype=numpy.float32)[source]

Make batch predictions on a model.

Parameters
  • x (Union[list, ndarray, Tensor]) – Batch of instances.

  • model (Union[Callable, Module, Sequential]) – PyTorch model.

  • device (Optional[device]) – Device type used. The default None tries to use the GPU and falls back on CPU if needed. Can be specified by passing either torch.device(‘cuda’) or torch.device(‘cpu’).

  • batch_size (int) – Batch size used during prediction.

  • preprocess_fn (Optional[Callable]) – Optional preprocessing function for each batch.

  • dtype (Union[dtype, dtype]) – Model output type, e.g. np.float32 or torch.float32.

Return type

Union[ndarray, Tensor, tuple]

Returns

Numpy array, torch tensor or tuples of those with model outputs.
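
A minimal usage sketch (the model architecture and shapes are illustrative):

    import torch
    import torch.nn as nn
    from alibi_detect.utils.pytorch import predict_batch

    model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
    x = torch.randn(1000, 10)

    preds_np = predict_batch(x, model, batch_size=32)                       # numpy array by default
    preds_pt = predict_batch(x, model, batch_size=32, dtype=torch.float32)  # keep outputs as a torch tensor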

alibi_detect.utils.pytorch.predict_batch_transformer(x, model, tokenizer, max_len, device=None, batch_size=10000000000, dtype=numpy.float32)[source]

Make batch predictions using a transformers tokenizer and model.

Parameters
  • x (Union[list, ndarray]) – Batch of instances.

  • model (Union[Module, Sequential]) – PyTorch model.

  • tokenizer (Callable) – Tokenizer for model.

  • max_len (int) – Max sequence length for tokens.

  • device (Optional[device]) – Device type used. The default None tries to use the GPU and falls back on CPU if needed. Can be specified by passing either torch.device(‘cuda’) or torch.device(‘cpu’).

  • batch_size (int) – Batch size used during prediction.

  • dtype (Union[float32, dtype]) – Model output type, e.g. np.float32 or torch.float32.

Return type

Union[ndarray, Tensor]

Returns

Numpy array or torch tensor with model outputs.
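
A hedged sketch assuming a HuggingFace transformers tokenizer and the TransformerEmbedding wrapper from alibi_detect.models.pytorch; the model name, max_len and sentences are illustrative:

    from transformers import AutoTokenizer
    from alibi_detect.models.pytorch import TransformerEmbedding
    from alibi_detect.utils.pytorch import predict_batch_transformer

    model_name = 'bert-base-uncased'
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # embedding model returning the last hidden layer of the transformer
    model = TransformerEmbedding(model_name, embedding_type='hidden_state', layers=[-1])

    x = ['a first example sentence', 'and a second one']
    embeddings = predict_batch_transformer(x, model, tokenizer, max_len=100, batch_size=32)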

alibi_detect.utils.pytorch.quantile(sample, p, type=7, sorted=False)[source]

Estimate a desired quantile of a univariate distribution from a vector of samples.

Parameters
  • sample (Tensor) – A 1D vector of samples.

  • p (float) – The desired quantile, in (0, 1).

  • type (int) – The method used to compute the quantile estimate.

  • sorted (bool) – Whether the samples are already sorted in ascending order.

Return type

float

Returns

An estimate of the quantile.
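
A minimal usage sketch (the sample and quantile level are illustrative):

    import torch
    from alibi_detect.utils.pytorch import quantile

    sample = torch.randn(1000)           # 1D vector of samples
    q95 = quantile(sample, 0.95)         # float estimate of the 0.95 quantile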

alibi_detect.utils.pytorch.zero_diag(mat)[source]

Set the diagonal of a matrix to 0.

Parameters

mat (Tensor) – A 2D square matrix.

Return type

Tensor

Returns

A 2D square matrix with zeros along the diagonal.
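
A minimal usage sketch:

    import torch
    from alibi_detect.utils.pytorch import zero_diag

    mat = torch.ones(4, 4)
    mat_zd = zero_diag(mat)              # same matrix with its diagonal entries set to 0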