alibi_detect.utils.pytorch package
- class alibi_detect.utils.pytorch.DeepKernel(proj, kernel_a='rbf', kernel_b='rbf', eps='trainable')[source]
Bases: Module
Computes similarities as k(x,y) = (1-eps)*k_a(proj(x), proj(y)) + eps*k_b(x,y). A forward pass takes a batch of instances x [Nx, features] and y [Ny, features] and returns the kernel matrix [Nx, Ny].
- Parameters:
  - proj (Module) – The projection to be applied to the inputs before applying kernel_a.
  - kernel_a (Union[Module, str]) – The kernel to apply to the projected inputs. Defaults to a Gaussian RBF with trainable bandwidth.
  - kernel_b (Union[Module, str, None]) – The kernel to apply to the raw inputs. Defaults to a Gaussian RBF with trainable bandwidth. Set to None in order to use only the deep component (i.e. eps=0).
  - eps (Union[float, str]) – The proportion (in [0,1]) of weight to assign to the kernel applied to the raw inputs. This can either be specified directly or set to 'trainable'. Only relevant if kernel_b is not None.
- property eps: torch.Tensor
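A minimal usage sketch (the projection network, feature dimensions and eps value below are illustrative, not part of the API):

```python
import torch
import torch.nn as nn
from alibi_detect.utils.pytorch import DeepKernel

# hypothetical projection network mapping 32-d inputs to 2-d features
proj = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 2))
kernel = DeepKernel(proj, eps=0.01)  # fixed eps; use eps='trainable' to learn it

x, y = torch.randn(100, 32), torch.randn(80, 32)
k_xy = kernel(x, y)  # kernel matrix of shape [100, 80]
```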
- class alibi_detect.utils.pytorch.GaussianRBF(sigma=None, init_sigma_fn=None, trainable=False)[source]
Bases: Module
- __init__(sigma=None, init_sigma_fn=None, trainable=False)[source]
Gaussian RBF kernel: k(x,y) = exp(-||x-y||^2/(2*sigma^2)). A forward pass takes a batch of instances x [Nx, features] and y [Ny, features] and returns the kernel matrix [Nx, Ny].
- Parameters:
  - sigma (Optional[Tensor]) – Bandwidth used for the kernel. Needn't be specified if it is to be inferred or trained. Multiple values can be passed in order to evaluate the kernel with each and then average.
  - init_sigma_fn (Optional[Callable]) – Function used to compute the bandwidth sigma. Used when sigma is to be inferred. The function's signature should match sigma_median(), meaning that it should take in the tensors x, y and dist and return sigma. If None, it is set to sigma_median().
  - trainable (bool) – Whether or not to track gradients w.r.t. sigma to allow it to be trained.
- classmethod from_config(config)[source]
Instantiates a kernel from a config dictionary.
- Parameters:
config – A kernel config dictionary.
- get_config()[source]
Returns a serializable config dict (excluding the init_sigma_fn, which is serialized in alibi_detect.saving).
- Return type:
dict
- property sigma: torch.Tensor
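A short sketch of evaluating the kernel (shapes and bandwidth values are illustrative):

```python
import torch
from alibi_detect.utils.pytorch import GaussianRBF

x, y = torch.randn(100, 10), torch.randn(50, 10)

# bandwidth inferred via init_sigma_fn (median heuristic by default)
kernel = GaussianRBF()
k_xy = kernel(x, y, infer_sigma=True)  # kernel matrix of shape [100, 50]

# or with an explicitly specified bandwidth
kernel = GaussianRBF(sigma=torch.tensor(0.5))
k_xy = kernel(x, y)
```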
- alibi_detect.utils.pytorch.batch_compute_kernel_matrix(x, y, kernel, device=None, batch_size=10000000000, preprocess_fn=None)[source]
Compute the kernel matrix between x and y by filling in blocks of size batch_size x batch_size at a time.
- Parameters:
  - kernel (Union[Module, Sequential]) – PyTorch module.
  - device (Optional[device]) – Device type used. The default None tries to use the GPU and falls back on CPU if needed. Can be specified by passing either torch.device('cuda') or torch.device('cpu').
  - batch_size (int) – Batch size used during prediction.
  - preprocess_fn (Optional[Callable[..., Tensor]]) – Optional preprocessing function for each batch.
- Return type:
Tensor
- Returns:
Kernel matrix in the form of a torch tensor
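A sketch of computing a large kernel matrix in blocks; the sizes, bandwidth and CPU device below are illustrative choices:

```python
import torch
from alibi_detect.utils.pytorch import GaussianRBF, batch_compute_kernel_matrix

x, y = torch.randn(5000, 10), torch.randn(5000, 10)
kernel = GaussianRBF(sigma=torch.tensor(1.))

# fill in the [5000, 5000] kernel matrix in 1000 x 1000 blocks, keeping everything on CPU
k_mat = batch_compute_kernel_matrix(x, y, kernel, device=torch.device('cpu'), batch_size=1000)
```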
- alibi_detect.utils.pytorch.get_device(device=None)[source]
Instantiates a PyTorch device object.
- Parameters:
  - device (Union[Literal['cuda', 'gpu', 'cpu'], device, None]) – Either None, a str ('gpu', 'cuda' or 'cpu') indicating the device to choose, or an already instantiated device object. If None, the GPU is selected if it is detected, otherwise the CPU is used as a fallback.
- Return type:
device
- Returns:
The instantiated device object.
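For example (the result of the first call depends on the hardware available):

```python
from alibi_detect.utils.pytorch import get_device

device = get_device()       # torch.device('cuda') if a GPU is detected, otherwise torch.device('cpu')
device = get_device('cpu')  # force the CPU
```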
- alibi_detect.utils.pytorch.mmd2_from_kernel_matrix(kernel_mat, m, permute=False, zero_diag=True)[source]
Compute maximum mean discrepancy (MMD^2) between 2 samples x and y from the full kernel matrix between the samples.
- Parameters:
  - kernel_mat (Tensor) – Full kernel matrix between the samples.
  - m (int) – Number of instances in y.
  - permute (bool) – Whether to permute the row indices. Used for permutation tests.
  - zero_diag (bool) – Whether to zero out the diagonal of the kernel matrix.
- Return type:
Tensor
- Returns:
MMD^2 between the samples from the kernel matrix.
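A sketch of the typical usage pattern, with illustrative sample sizes and bandwidth:

```python
import torch
from alibi_detect.utils.pytorch import GaussianRBF, mmd2_from_kernel_matrix

x, y = torch.randn(100, 10), torch.randn(80, 10)
xy = torch.cat([x, y], dim=0)

kernel = GaussianRBF(sigma=torch.tensor(1.))
kernel_mat = kernel(xy, xy)  # full [180, 180] kernel matrix between the pooled samples

mmd2 = mmd2_from_kernel_matrix(kernel_mat, m=y.shape[0])  # MMD^2 between x and y
```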
- alibi_detect.utils.pytorch.permed_lsdds(k_all_c, x_perms, y_perms, H, H_lam_inv=None, lam_rd_max=0.2, return_unpermed=False)[source]
Compute least-squares density difference (LSDD) estimates from a kernel matrix across various reference and test window samples.
- Parameters:
  - k_all_c (Tensor) – Kernel matrix of similarities between all samples and the kernel centers.
  - x_perms (List[Tensor]) – List of B reference window index vectors.
  - y_perms (List[Tensor]) – List of B test window index vectors.
  - H (Tensor) – Special (scaled) kernel matrix of similarities between kernel centers.
  - H_lam_inv (Optional[Tensor]) – Function of H corresponding to a particular regularization parameter lambda. See Eqn 11 of Bu et al. (2017).
  - lam_rd_max (float) – The maximum relative difference between two estimates of LSDD that the regularization parameter lambda is allowed to cause. Defaults to 0.2. Only relevant if H_lam_inv is not supplied.
  - return_unpermed (bool) – Whether or not to return the value corresponding to the unpermuted order defined by k_all_c.
- Return type:
- Returns:
Vector of B LSDD estimates for each permutation, H_lam_inv which may have been inferred, and optionally the unpermuted LSDD estimate.
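A rough sketch with random data; the window construction, number of kernel centers and bandwidths are illustrative assumptions rather than prescribed choices:

```python
import torch
from alibi_detect.utils.pytorch import GaussianRBF, permed_lsdds

n_ref, n_test, n_centers, d = 50, 50, 20, 4
x = torch.randn(n_ref + n_test, d)
centers = x[:n_centers]

k_all_c = GaussianRBF(sigma=torch.tensor(1.))(x, centers)         # [n_ref+n_test, n_centers]
H = GaussianRBF(sigma=torch.tensor(2. ** 0.5))(centers, centers)  # (scaled) kernel matrix between centers

# B random splits of the pooled indices into reference and test windows
B = 10
perms = [torch.randperm(n_ref + n_test) for _ in range(B)]
x_perms = [p[:n_ref] for p in perms]
y_perms = [p[n_ref:] for p in perms]

lsdd_perms, H_lam_inv = permed_lsdds(k_all_c, x_perms, y_perms, H)  # B LSDD estimates + inferred H_lam_inv
```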
- alibi_detect.utils.pytorch.predict_batch(x, model, device=None, batch_size=10000000000, preprocess_fn=None, dtype=<class 'numpy.float32'>)[source]
Make batch predictions on a model.
- Parameters:
  - model (Union[Callable, Module, Sequential]) – PyTorch model.
  - device (Union[Literal['cuda', 'gpu', 'cpu'], device, None]) – Device type used. The default tries to use the GPU and falls back on CPU if needed. Can be specified by passing either 'cuda', 'gpu', 'cpu' or an instance of torch.device.
  - batch_size (int) – Batch size used during prediction.
  - preprocess_fn (Optional[Callable]) – Optional preprocessing function for each batch.
  - dtype (Union[Type[generic], dtype]) – Model output type, e.g. np.float32 or torch.float32.
- Return type:
- Returns:
Numpy array, torch tensor or tuples of those with model outputs.
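A brief sketch (the model architecture and shapes are hypothetical):

```python
import numpy as np
import torch
import torch.nn as nn
from alibi_detect.utils.pytorch import predict_batch

model = nn.Sequential(nn.Linear(10, 5), nn.ReLU(), nn.Linear(5, 2))
x = torch.randn(1000, 10)

# run predictions in batches of 32 on CPU and return a numpy array of shape (1000, 2)
preds = predict_batch(x, model, device='cpu', batch_size=32, dtype=np.float32)
```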
- alibi_detect.utils.pytorch.predict_batch_transformer(x, model, tokenizer, max_len, device=None, batch_size=10000000000, dtype=<class 'numpy.float32'>)[source]
Make batch predictions using a transformers tokenizer and model.
- Parameters:
  - model (Union[Module, Sequential]) – PyTorch model.
  - tokenizer (Callable) – Tokenizer for model.
  - max_len (int) – Max sequence length for tokens.
  - device (Union[Literal['cuda', 'gpu', 'cpu'], device, None]) – Device type used. The default tries to use the GPU and falls back on CPU if needed. Can be specified by passing either 'cuda', 'gpu', 'cpu' or an instance of torch.device.
  - batch_size (int) – Batch size used during prediction.
  - dtype (Union[Type[generic], dtype]) – Model output type, e.g. np.float32 or torch.float32.
- Return type:
- Returns:
Numpy array or torch tensor with model outputs.
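A rough sketch using a HuggingFace tokenizer together with alibi_detect's TransformerEmbedding model (documented elsewhere in the library); the model name, embedding type and layer selection are illustrative assumptions:

```python
import numpy as np
from transformers import AutoTokenizer
from alibi_detect.models.pytorch import TransformerEmbedding
from alibi_detect.utils.pytorch import predict_batch_transformer

model_name = 'bert-base-cased'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = TransformerEmbedding(model_name, embedding_type='hidden_state', layers=[-1])

x = np.array(['a first example sentence', 'and a second one'])
# tokenize and embed the text in batches on CPU, returning a numpy array of embeddings
emb = predict_batch_transformer(x, model, tokenizer, max_len=32, device='cpu', batch_size=2, dtype=np.float32)
```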
- alibi_detect.utils.pytorch.quantile(sample, p, type=7, sorted=False)[source]
Estimate a desired quantile of a univariate distribution from a vector of samples.
- Parameters:
  - sample (Tensor) – A 1D vector of values.
  - p (float) – The desired quantile in (0,1).
  - type (int) – The method for computing the quantile. See https://wikipedia.org/wiki/Quantile#Estimating_quantiles_from_a_sample
  - sorted (bool) – Whether or not the vector is already sorted into ascending order.
- Return type:
- Returns:
An estimate of the quantile
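For instance (the sample and quantile level are illustrative):

```python
import torch
from alibi_detect.utils.pytorch import quantile

sample = torch.randn(1000)
q95 = quantile(sample, 0.95)  # estimate of the 0.95 quantile
```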
- alibi_detect.utils.pytorch.squared_pairwise_distance(x, y, a_min=1e-30)[source]
PyTorch pairwise squared Euclidean distance between samples x and y.
- Parameters:
  - x (Tensor) – Batch of instances of shape [Nx, features].
  - y (Tensor) – Batch of instances of shape [Ny, features].
  - a_min (float) – Lower bound to clip distance values.
- Return type:
Tensor
- Returns:
Pairwise squared Euclidean distance [Nx, Ny].
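For example, with illustrative shapes:

```python
import torch
from alibi_detect.utils.pytorch import squared_pairwise_distance

x, y = torch.randn(100, 10), torch.randn(50, 10)
dist = squared_pairwise_distance(x, y)  # [100, 50] matrix of squared Euclidean distances
```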
- alibi_detect.utils.pytorch.zero_diag(mat)[source]
Set the diagonal of a matrix to 0.
- Parameters:
  - mat (Tensor) – A 2D square matrix.
- Return type:
Tensor
- Returns:
A 2D square matrix with zeros along the diagonal
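For example:

```python
import torch
from alibi_detect.utils.pytorch import zero_diag

mat = torch.ones(4, 4)
mat0 = zero_diag(mat)  # ones everywhere except zeros along the diagonal
```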