alibi_detect.cd package

class alibi_detect.cd.CVMDrift(x_ref, p_val=0.05, x_ref_preprocessed=False, preprocess_at_init=True, update_x_ref=None, preprocess_fn=None, correction='bonferroni', n_features=None, input_shape=None, data_type=None)[source]

Bases: BaseUnivariateDrift

__init__(x_ref, p_val=0.05, x_ref_preprocessed=False, preprocess_at_init=True, update_x_ref=None, preprocess_fn=None, correction='bonferroni', n_features=None, input_shape=None, data_type=None)[source]

Cramer-von Mises (CVM) data drift detector, which tests for any change in the distribution of continuous univariate data. For multivariate data, a separate CVM test is applied to each feature, and the obtained p-values are aggregated via the Bonferroni or False Discovery Rate (FDR) corrections.
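The aggregation of per-feature p-values can be illustrated with a minimal pure-Python sketch. The helper names `bonferroni_drift` and `fdr_drift` are hypothetical and not part of alibi_detect, and the FDR branch assumes the Benjamini-Hochberg procedure, which the description above does not spell out.

```python
from typing import List


def bonferroni_drift(p_vals: List[float], p_val: float = 0.05) -> bool:
    """Flag drift if the smallest feature-level p-value is at most p_val / d."""
    d = len(p_vals)
    return min(p_vals) <= p_val / d


def fdr_drift(p_vals: List[float], q_val: float = 0.05) -> bool:
    """Flag drift if any sorted p-value is at most its Benjamini-Hochberg threshold."""
    d = len(p_vals)
    ranked = sorted(p_vals)
    # The i-th smallest p-value (1-indexed) is compared against q_val * i / d.
    return any(p <= q_val * (i + 1) / d for i, p in enumerate(ranked))


p_vals = [0.04, 0.02, 0.03]
print(bonferroni_drift(p_vals))  # False: 0.02 is above 0.05 / 3
print(fdr_drift(p_vals))         # True: 0.03 is below 0.05 * 2 / 3
```

In this example the FDR correction flags drift where Bonferroni does not, which reflects the usual trade-off: Bonferroni is the more conservative of the two.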

Parameters:

x_ref (Union[ndarray, list]) – Data used as reference distribution.
p_val (float) – p-value used for significance of the CVM test. If the FDR correction method is used, this corresponds to the acceptable q-value.
x_ref_preprocessed (bool) – Whether the given reference data x_ref has been preprocessed yet. If x_ref_preprocessed=True, only the test data x will be preprocessed at prediction time. If x_ref_preprocessed=False, the reference data will also be preprocessed.
preprocess_at_init (bool) – Whether to preprocess the reference data when the detector is instantiated. Otherwise, the reference data will be preprocessed at prediction time. Only applies if x_ref_preprocessed=False.
update_x_ref (Optional[Dict[str, int]]) – Reference data can optionally be updated to the last n instances seen by the detector or via reservoir sampling with size n. For the former, the parameter equals {‘last’: n} while for reservoir sampling {‘reservoir_sampling’: n} is passed.
preprocess_fn (Optional[Callable]) – Function to preprocess the data before computing the data drift metrics.
correction (str) – Correction type for multivariate data. Either ‘bonferroni’ or ‘fdr’ (False Discovery Rate).
n_features (Optional[int]) – Number of features used in the CVM test. No need to pass it if no preprocessing takes place. In case of a preprocessing step, this can also be inferred automatically but could be more expensive to compute.
input_shape (Optional[tuple]) – Shape of input data.
data_type (Optional[str]) – Optionally specify the data type (tabular, image or time-series). Added to metadata.
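The two update_x_ref strategies can be sketched as follows. This is an illustration of the general techniques (keep the last n instances, or classic reservoir sampling), not the library's implementation; the function `update_reference` and its `n_seen` and `rng` arguments are assumptions introduced for the example.

```python
import random
from typing import Dict, List


def update_reference(x_ref: List, x: List, n_seen: int,
                     update_x_ref: Dict[str, int],
                     rng: random.Random) -> List:
    """Return the new reference set after observing the batch x.

    n_seen is the number of instances observed before this batch,
    including those already stored in x_ref.
    """
    if 'last' in update_x_ref:
        n = update_x_ref['last']
        # Keep only the n most recent instances.
        return (x_ref + x)[-n:]
    if 'reservoir_sampling' in update_x_ref:
        n = update_x_ref['reservoir_sampling']
        reservoir = list(x_ref)
        for item in x:
            n_seen += 1
            if len(reservoir) < n:
                reservoir.append(item)
            else:
                # Replace a random slot with probability n / n_seen.
                j = rng.randrange(n_seen)
                if j < n:
                    reservoir[j] = item
        return reservoir
    raise ValueError("expected key 'last' or 'reservoir_sampling'")


rng = random.Random(0)
print(update_reference([1, 2, 3], [4, 5], 3, {'last': 3}, rng))  # [3, 4, 5]
```

With 'last' the reference set tracks recent data closely, while reservoir sampling keeps a uniform sample over everything seen so far.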

feature_score(x_ref, x)[source]

Performs the two-sample Cramer-von Mises test(s), computing the p-value and test statistic per feature.

Parameters:

x_ref (ndarray) – Reference instances to compare distribution with.
x (ndarray) – Batch of instances.

Return type:

Tuple[ndarray, ndarray]

Returns:

Feature level p-values and CVM statistics.
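For a single feature, the two-sample CVM statistic can be sketched in pure Python as below. This assumes the standard definition (the squared difference between the two empirical CDFs, summed over the pooled sample and scaled by NM/(N+M)^2); the function name `cvm_statistic` is hypothetical, the p-value computation is omitted, and the library's own implementation may differ in detail.

```python
from bisect import bisect_right
from typing import Sequence


def cvm_statistic(x_ref: Sequence[float], x: Sequence[float]) -> float:
    """Two-sample Cramer-von Mises statistic for one feature."""
    n, m = len(x_ref), len(x)
    ref_sorted, x_sorted = sorted(x_ref), sorted(x)
    total = 0.0
    # Evaluate both empirical CDFs at every point of the pooled sample.
    for z in ref_sorted + x_sorted:
        f_ref = bisect_right(ref_sorted, z) / n
        f_x = bisect_right(x_sorted, z) / m
        total += (f_ref - f_x) ** 2
    return n * m / (n + m) ** 2 * total


print(cvm_statistic([1.0, 2.0], [3.0, 4.0]))  # 0.375
```

For multivariate data the same computation would be repeated column by column, giving one statistic per feature.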

class alibi_detect.cd.CVMDriftOnline(x_ref, ert, window_sizes, preprocess_fn=None, x_ref_preprocessed=False, n_bootstraps=10000, batch_size=64, n_features=None, verbose=True, input_shape=None, data_type=None)[source]

Bases: BaseUniDriftOnline, DriftConfigMixin

__init__(x_ref, ert, window_sizes, preprocess_fn=None, x_ref_preprocessed=False, n_bootstraps=10000, batch_size=64, n_features=None, verbose=True, input_shape=None, data_type=None)[source]

Online Cramer-von Mises (CVM) data drift detector using preconfigured thresholds, which tests for any change in the distribution of continuous univariate data. This detector is an adaptation of that proposed by Ross and Adams [RA12].

For multivariate data, the detector makes a correction similar to the Bonferroni correction used for the offline detector. Given \(d\) features, the detector configures thresholds by targeting the \(1-\beta\) quantile of test statistics over the simulated streams, where \(\beta = 1 - (1-(1/ERT))^{(1/d)}\). For the univariate case, this simplifies to \(\beta = 1/ERT\). At prediction time, drift is flagged if the test statistic of any feature stream exceeds the thresholds.

Note

In the multivariate case, for the ERT to be accurately targeted the feature streams must be independent.
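The per-feature quantile level follows directly from the formula above and can be checked numerically. The function name `per_feature_beta` is hypothetical; the sketch only evaluates \(\beta = 1 - (1-(1/ERT))^{(1/d)}\) and verifies that, for independent streams, the chance of at least one of the \(d\) features exceeding its threshold at a given step is \(1/ERT\).

```python
def per_feature_beta(ert: float, d: int = 1) -> float:
    """Per-feature exceedance probability targeted when configuring thresholds."""
    return 1.0 - (1.0 - 1.0 / ert) ** (1.0 / d)


ert, d = 100.0, 5
beta = per_feature_beta(ert, d)

# Probability that at least one of d independent streams exceeds its threshold.
any_exceeds = 1.0 - (1.0 - beta) ** d

print(beta)         # slightly above the Bonferroni-style value (1 / ert) / d
print(any_exceeds)  # recovers 1 / ert up to floating-point error
```

The independence assumption in the second step is exactly the condition stated in the note above.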

Parameters:

x_ref (Union[ndarray, list]) – Data used as reference distribution.
ert (float) – The expected run-time (ERT) in the absence of drift. For the univariate detectors, the ERT is defined as the expected run-time after the smallest window is full, i.e. the run-time from t=min(window_sizes).
window_sizes (List[int]) – Window sizes for the sliding test-windows used to compute the test-statistic. Smaller windows focus on responding quickly to severe drift; larger windows focus on the ability to detect slight drift.
preprocess_fn (Optional[Callable]) – Function to preprocess the data before computing the data drift metrics.
x_ref_preprocessed (bool) – Whether the given reference data x_ref has been preprocessed yet. If x_ref_preprocessed=True, only the test data x will be preprocessed at prediction time. If x_ref_preprocessed=False, the reference data will also be preprocessed.
n_bootstraps (int) – The number of bootstrap simulations used to configure the thresholds. The larger this is the more accurately the desired ERT will be targeted. Should ideally be at least an order of magnitude larger than the ERT.
batch_size (int) – The maximum number of bootstrap simulations to compute in each batch when configuring thresholds. A smaller batch size reduces memory requirements, but can result in a longer configuration run time.
n_features (Optional[int]) – Number of features used in the statistical test. No need to pass it if no preprocessing takes place. In case of a preprocessing step, this can also be inferred automatically but could be more expensive to compute.
verbose (bool) – Whether or not to print progress during configuration.
input_shape (Optional[tuple]) – Shape of input data.
data_type (Optional[str]) – Optionally specify the data type (tabular, image or time-series). Added to metadata.

online_state_keys: Tuple[str, ...] = ('t', 'test_stats', 'drift_preds', 'xs', 'ids_ref_wins', 'ids_wins_ref', 'ids_wins_wins')

score(x_t)[source]

Compute the test-statistic (CVM) between the reference window(s) and test window. If a given test-window is not yet full then a test-statistic of np.nan is returned for that window.

Parameters:

x_t (Union[ndarray, Any]) – A single instance.

Return type:

ndarray

Returns:

Estimated CVM test statistics between reference window and test window(s).
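The windowing behaviour described for score (one statistic per test window, with nan until a window is full) can be sketched as follows. The class `SlidingWindows` and the placeholder statistic `mean_gap` are hypothetical and only illustrate the bookkeeping; the real detector computes CVM statistics and compares them against preconfigured thresholds.

```python
from collections import deque
from typing import Callable, List, Sequence


class SlidingWindows:
    """Keeps one fixed-size test window per entry in window_sizes."""

    def __init__(self, x_ref: Sequence[float], window_sizes: List[int],
                 stat_fn: Callable[[List[float], List[float]], float]) -> None:
        self.x_ref = list(x_ref)
        self.windows = [deque(maxlen=w) for w in window_sizes]
        self.stat_fn = stat_fn

    def score(self, x_t: float) -> List[float]:
        stats = []
        for win in self.windows:
            win.append(x_t)  # the oldest item is dropped once the window is full
            if len(win) < win.maxlen:
                stats.append(float('nan'))  # window not yet full
            else:
                stats.append(self.stat_fn(self.x_ref, list(win)))
        return stats


def mean_gap(ref: List[float], win: List[float]) -> float:
    """Placeholder statistic (not CVM): absolute difference of the means."""
    return abs(sum(ref) / len(ref) - sum(win) / len(win))


det = SlidingWindows([0.0, 1.0], window_sizes=[2, 3], stat_fn=mean_gap)
for x_t in (1.0, 1.0, 4.0):
    print(det.score(x_t))
```

The smaller window starts reporting after two instances and the larger one after three, which mirrors the trade-off between fast response and sensitivity described for window_sizes.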

class alibi_detect.cd.ChiSquareDrift(x_ref, p_val=0.05, categories_per_feature=None, x_ref_preprocessed=False, preprocess_at_init=True, update_x_ref=None, preprocess_fn=None, correction='bonferroni', n_features=None, input_shape=None, data_type=None)[source]

Bases: BaseUnivariateDrift

__init__(x_ref, p_val=0.05, categories_per_feature=None, x_ref_preprocessed=False, preprocess_at_init=True, update_x_ref=None, preprocess_fn=None, correction='bonferroni', n_features=None, input_shape=None, data_type=None)[source]

Chi-Squared data drift detector with Bonferroni or False Discovery Rate (FDR) correction for multivariate data.

Parameters:

x_ref (Union[ndarray, list]) – Data used as reference distribution.
p_val (float) – p-value used for significance of the Chi-Squared test for each feature. If the FDR correction method is used, this corresponds to the acceptable q-value.
categories_per_feature (Optional[Dict[int, int]]) – Optional dictionary mapping each feature column index to either the number of possible categorical values for that feature or a list of the possible values. If you know how many categories a feature has, pass the counts in the Dict[int, int] format, e.g. {0: 3, 3: 2}; passing N categories assumes the possible values for the feature are [0, …, N-1]. Alternatively, pass the possible categories explicitly in the Dict[int, List[int]] format, e.g. {0: [0, 1, 2], 3: [0, 55]}. Note that the categories can be arbitrary int values. If it is not specified, categories_per_feature is inferred from x_ref.
x_ref_preprocessed (bool) – Whether the given reference data x_ref has been preprocessed yet. If x_ref_preprocessed=True, only the test data x will be preprocessed at prediction time. If x_ref_preprocessed=False, the reference data will also be preprocessed.
preprocess_at_init (bool) – Whether to preprocess the reference data when the detector is instantiated. Otherwise, the reference data will be preprocessed at prediction time. Only applies if x_ref_preprocessed=False.
update_x_ref (Optional[Dict[str, int]]) – Reference data can optionally be updated to the last n instances seen by the detector or via reservoir sampling with size n. For the former, the parameter equals {‘last’: n} while for reservoir sampling {‘reservoir_sampling’: n} is passed.
preprocess_fn (Optional[Callable]) – Function to preprocess the data before computing the data drift metrics. Typically a dimensionality reduction technique.
correction (str) – Correction type for multivariate data. Either ‘bonferroni’ or ‘fdr’ (False Discovery Rate).
n_features (Optional[int]) – Number of features used in the Chi-Squared test. No need to pass it if no preprocessing takes place. In case of a preprocessing step, this can also be inferred automatically but could be more expensive to compute.
input_shape (Optional[tuple]) – Shape of input data.
data_type (Optional[str]) – Optionally specify the data type (tabular, image or time-series). Added to metadata.
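The accepted categories_per_feature formats can be illustrated with a small normalisation sketch. The helpers `expand_categories` and `infer_categories` are hypothetical names, not library functions; they only show how a count N maps to the values [0, …, N-1] and what inferring categories from reference data could look like.

```python
from typing import Dict, List, Sequence, Union


def expand_categories(
    categories_per_feature: Dict[int, Union[int, List[int]]]
) -> Dict[int, List[int]]:
    """Turn {feature: N} entries into {feature: [0, ..., N-1]}; keep explicit lists."""
    return {
        feature: list(range(cats)) if isinstance(cats, int) else list(cats)
        for feature, cats in categories_per_feature.items()
    }


def infer_categories(x_ref: Sequence[Sequence[int]]) -> Dict[int, List[int]]:
    """Collect the distinct values observed in each column of x_ref."""
    n_features = len(x_ref[0])
    return {f: sorted({row[f] for row in x_ref}) for f in range(n_features)}


print(expand_categories({0: 3, 3: [0, 55]}))       # {0: [0, 1, 2], 3: [0, 55]}
print(infer_categories([[0, 1], [2, 1], [0, 0]]))  # {0: [0, 2], 1: [0, 1]}
```

Note that inference only sees values present in the reference data, so a category that first appears at test time would be missed unless it is passed explicitly.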

feature_score(x_ref, x)[source]

Compute Chi-Squared test statistic and p-values per feature.

Parameters:

x_ref (ndarray) – Reference instances to compare distribution with.
x (ndarray) – Batch of instances.

Return type:

Tuple[ndarray, ndarray]

Returns:

Feature level p-values and Chi-Squared statistics.
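For one categorical feature, the statistic can be sketched as the plain Pearson chi-squared statistic on the 2 x K table of reference and test counts. The function `chi2_statistic` is a hypothetical helper; it applies no continuity correction, omits the p-value, and assumes every observed value appears in `categories`, so its output may differ from what the library reports.

```python
from collections import Counter
from typing import Sequence


def chi2_statistic(x_ref: Sequence[int], x: Sequence[int],
                   categories: Sequence[int]) -> float:
    """Pearson chi-squared statistic for the 2 x K table of category counts."""
    ref_counts, x_counts = Counter(x_ref), Counter(x)
    n_ref, n_x = len(x_ref), len(x)
    total = n_ref + n_x
    stat = 0.0
    for cat in categories:
        col = ref_counts[cat] + x_counts[cat]
        if col == 0:
            continue  # a category unseen in both samples contributes nothing
        for observed, row in ((ref_counts[cat], n_ref), (x_counts[cat], n_x)):
            expected = row * col / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Reference is balanced, test batch contains only category 0.
print(chi2_statistic([0, 0, 1, 1], [0, 0, 0, 0], categories=[0, 1]))
```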

class alibi_detect.cd.ClassifierDrift(x_ref, model, backend='tensorflow', p_val=0.05, x_ref_preprocessed=False, preprocess_at_init=True, update_x_ref=None, preprocess_fn=None, preds_type='probs', binarize_preds=False, reg_loss_fn=<function ClassifierDrift.<lambda>>, train_size=0.75, n_folds=None, retrain_from_scratch=True, seed=0, optimizer=None, learning_rate=0.001, batch_size=32, preprocess_batch_fn=None, epochs=3, verbose=0, train_kwargs=None, device=None, dataset=None, dataloader=None, input_shape=None, use_calibration=False, calibration_kwargs=None, use_oob=False, data_type=None)[source]

Bases: DriftConfigMixin

__init__(x_ref, model, backend='tensorflow', p_val=0.05, x_ref_preprocessed=False, preprocess_at_init=True, update_x_ref=None, preprocess_fn=None, preds_type='probs', binarize_preds=False, reg_loss_fn=<function ClassifierDrift.<lambda>>, train_size=0.75, n_folds=None, retrain_from_scratch=True, seed=0, optimizer=None, learning_rate=0.001, batch_size=32, preprocess_batch_fn=None, epochs=3, verbose=0, train_kwargs=None, device=None, dataset=None, dataloader=None, input_shape=None, use_calibration=False, calibration_kwargs=None, use_oob=False, data_type=None)[source]

Classifier-based drift detector. The classifier is trained on a fraction of the combined reference and test data and drift is detected on the remaining data. To use all the data to detect drift, a stratified cross-validation scheme can be chosen.

Parameters:

x_ref (Union[ndarray, list]) – Data used as reference distribution.
model (Union[ClassifierMixin, Callable]) – PyTorch, TensorFlow or Sklearn classification model used for drift detection.
backend (str) – Backend used for the training loop implementation. Supported: ‘tensorflow’ | ‘pytorch’ | ‘sklearn’.
p_val (float) – p-value used for the significance of the test.
x_ref_preprocessed (bool) – Whether the given reference data x_ref has been preprocessed yet. If x_ref_preprocessed=True, only the test data x will be preprocessed at prediction time. If x_ref_preprocessed=False, the reference data will also be preprocessed.
preprocess_at_init (bool) – Whether to preprocess the reference data when the detector is instantiated. Otherwise, the reference data will be preprocessed at prediction time. Only applies if x_ref_preprocessed=False.
update_x_ref (Optional[Dict[str, int]]) – Reference data can optionally be updated to the last n instances seen by the detector or via reservoir sampling with size n. For the former, the parameter equals {‘last’: n} while for reservoir sampling {‘reservoir_sampling’: n} is passed.
preprocess_fn (Optional[Callable]) – Function to preprocess the data before computing the data drift metrics.
preds_type (str) – Whether the model outputs ‘probs’ (probabilities - for ‘tensorflow’, ‘pytorch’, ‘sklearn’ models), ‘logits’ (for ‘pytorch’, ‘tensorflow’ models), ‘scores’ (for ‘sklearn’ models if decision_function is supported).
binarize_preds (bool) – Whether to test for discrepancy on soft (e.g. probs/logits/scores) model predictions directly with a K-S test or binarise to 0-1 prediction errors and apply a binomial test.
reg_loss_fn (Callable) – The regularisation term reg_loss_fn(model) is added to the loss function being optimized. Only relevant for ‘tensorflow’ and ‘pytorch’ backends.
train_size (Optional[float]) – Optional fraction (float between 0 and 1) of the dataset used to train the classifier. The drift is detected on 1 - train_size. Cannot be used in combination with n_folds.
n_folds (Optional[int]) – Optional number of stratified folds used for training. The model preds are then calculated on all the out-of-fold instances. This makes it possible to use all the reference and test data for drift detection at the expense of longer computation. If both train_size and n_folds are specified, n_folds is prioritized.
retrain_from_scratch (bool) – Whether the classifier should be retrained from scratch for each set of test data or whether it should instead continue training from where it left off on the previous set.
seed (int) – Optional random seed for fold selection.
optimizer (Optional[Callable]) – Optimizer used during training of the classifier. Only relevant for ‘tensorflow’ and ‘pytorch’ backends.
learning_rate (float) – Learning rate used by optimizer. Only relevant for ‘tensorflow’ and ‘pytorch’ backends.
batch_size (int) – Batch size used during training of the classifier. Only relevant for ‘tensorflow’ and ‘pytorch’ backends.
preprocess_batch_fn (Optional[Callable]) – Optional batch preprocessing function. For example to convert a list of objects to a batch which can be processed by the model. Only relevant for ‘tensorflow’ and ‘pytorch’ backends.
epochs (int) – Number of training epochs for the classifier for each (optional) fold. Only relevant for ‘tensorflow’ and ‘pytorch’ backends.
verbose (int) – Verbosity level during the training of the classifier. 0 is silent, 1 a progress bar. Only relevant for ‘tensorflow’ and ‘pytorch’ backends.
train_kwargs (Optional[dict]) – Optional additional kwargs when fitting the classifier. Only relevant for ‘tensorflow’ and ‘pytorch’ backends.
device (Union[Literal[‘cuda’, ‘gpu’, ‘cpu’], torch.device, None]) – Device type used. The default tries to use the GPU and falls back on CPU if needed. Can be specified by passing either 'cuda', 'gpu', 'cpu' or an instance of torch.device. Only relevant for ‘pytorch’ backend.
dataset (Optional[Callable]) – Dataset object used during training. Only relevant for ‘tensorflow’ and ‘pytorch’ backends.
dataloader (Optional[Callable]) – Dataloader object used during training. Only relevant for ‘pytorch’ backend.
input_shape (Optional[tuple]) – Shape of input data.
use_calibration (bool) – Whether to use calibration. Calibration can be used on top of any model. Only relevant for ‘sklearn’ backend.
calibration_kwargs (Optional[dict]) – Optional additional kwargs for calibration. Only relevant for ‘sklearn’ backend. See https://scikit-learn.org/stable/modules/generated/sklearn.calibration.CalibratedClassifierCV.html for more details.
use_oob (bool) – Whether to use out-of-bag (OOB) predictions. Supported only for RandomForestClassifier.
data_type (Optional[str]) – Optionally specify the data type (tabular, image or time-series). Added to metadata.
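The binomial test used when binarize_preds=True can be sketched as a one-sided test on the number of held-out instances the classifier assigns to the correct set. The function `binomial_p_value` is hypothetical, and the null accuracy `p0=0.5` is an assumption that holds only when the held-out reference and test instances are balanced; the library may choose its baseline differently.

```python
from math import comb


def binomial_p_value(n_correct: int, n: int, p0: float = 0.5) -> float:
    """P(X >= n_correct) for X ~ Binomial(n, p0).

    A small value means the classifier separates reference from test data
    better than chance, which is evidence of drift.
    """
    return sum(
        comb(n, k) * p0 ** k * (1.0 - p0) ** (n - k)
        for k in range(n_correct, n + 1)
    )


print(binomial_p_value(5, 10))   # about half right: no evidence of drift
print(binomial_p_value(10, 10))  # all right: strong evidence of drift
```

Drift would then be flagged when this p-value falls below the configured p_val.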

predict(x, return_p_val=True, return_distance=True, return_probs=True, return_model=True)[source]

Predict whether a batch of data has drifted from the reference data.

Parameters:

x (Union[ndarray, list]) – Batch of instances.
return_p_val (bool) – Whether to return the p-value of the test.
return_distance (bool) – Whether to return a notion of strength of the drift. K-S test stat if binarize_preds=False, otherwise relative error reduction.
return_probs (bool) – Whether to return the instance level classifier probabilities for the reference and test data (0=reference data, 1=test data).
return_model (bool) – Whether to return the updated model trained to discriminate reference and test instances.

Return type:

Dict[str, Dict[str, Union[str, int, float, Callable]]]

Returns:

Dictionary containing 'meta' and 'data' dictionaries:

'meta' - has the model’s metadata.
'data' - contains the drift prediction and optionally the p-value, performance of the classifier relative to its expectation under the no-change null, the out-of-fold classifier model prediction probabilities on the reference and test data, and the trained model.

class alibi_detect.cd.ClassifierUncertaintyDrift(x_ref, model, p_val=0.05, x_ref_preprocessed=False, backend=None, update_x_ref=None, preds_type='probs', uncertainty_type='entropy', margin_width=0.1, batch_size=32, preprocess_batch_fn=None, device=None, tokenizer=None, max_len=None, input_shape=None, data_type=None)[source]

Bases: DriftConfigMixin

__init__(x_ref, model, p_val=0.05, x_ref_preprocessed=False, backend=None, update_x_ref=None, preds_type='probs', uncertainty_type='entropy', margin_width=0.1, batch_size=32, preprocess_batch_fn=None, device=None, tokenizer=None, max_len=None, input_shape=None, data_type=None)[source]

Test for a change in the number of instances falling into regions on which the model is uncertain. Performs either a K-S test on prediction entropies or Chi-squared test on 0-1 indicators of predictions falling into a margin of uncertainty (e.g. probs falling into [0.45, 0.55] in binary case).

Parameters:

x_ref (Union[ndarray, list]) – Data used as reference distribution. Should be disjoint from the data the model was trained on for accurate p-values.
model (Callable) – Classification model outputting class probabilities (or logits).
backend (Optional[str]) – Backend to use if model requires batch prediction. Options are ‘tensorflow’ or ‘pytorch’.
p_val (float) – p-value used for the significance of the test.
x_ref_preprocessed (bool) – Whether the given reference data x_ref has been preprocessed yet. If x_ref_preprocessed=True, only the test data x will be preprocessed at prediction time. If x_ref_preprocessed=False, the reference data will also be preprocessed.
update_x_ref (Optional[Dict[str, int]]) – Reference data can optionally be updated to the last n instances seen by the detector or via reservoir sampling with size n. For the former, the parameter equals {‘last’: n} while for reservoir sampling {‘reservoir_sampling’: n} is passed.
preds_type (str) – Type of prediction output by the model. Options are ‘probs’ (in [0,1]) or ‘logits’ (in [-inf,inf]).
uncertainty_type (str) – Method for determining the model’s uncertainty for a given instance. Options are ‘entropy’ or ‘margin’.
margin_width (float) – Width of the margin if uncertainty_type = ‘margin’. The model is considered uncertain on an instance if the highest two class probabilities it assigns to the instance differ by less than margin_width.
batch_size (int) – Batch size used to evaluate model. Only relevant when backend has been specified for batch prediction.
preprocess_batch_fn (Optional[Callable]) – Optional batch preprocessing function. For example to convert a list of objects to a batch which can be processed by the model.
device (Union[Literal[‘cuda’, ‘gpu’, ‘cpu’], torch.device, None]) – Device type used. The default tries to use the GPU and falls back on CPU if needed. Can be specified by passing either 'cuda', 'gpu', 'cpu' or an instance of torch.device. Only relevant for ‘pytorch’ backend.
tokenizer (Optional[Callable]) – Optional tokenizer for NLP models.
max_len (Optional[int]) – Optional max token length for NLP models.
input_shape (Optional[tuple]) – Shape of input data.
data_type (Optional[str]) – Optionally specify the data type (tabular, image or time-series). Added to metadata.
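The two uncertainty_type options can be sketched for a single instance as below. The helpers `entropy` and `in_margin` are hypothetical names; they follow the definitions given above (prediction entropy, and a 0-1 indicator of the top two class probabilities differing by less than margin_width) and assume the input is already a vector of class probabilities.

```python
from math import log
from typing import Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a vector of class probabilities."""
    return -sum(p * log(p) for p in probs if p > 0.0)


def in_margin(probs: Sequence[float], margin_width: float = 0.1) -> bool:
    """True if the two highest class probabilities differ by less than margin_width."""
    top_two = sorted(probs, reverse=True)[:2]
    return (top_two[0] - top_two[1]) < margin_width


confident, uncertain = [0.9, 0.1], [0.52, 0.48]
print(entropy(confident), entropy(uncertain))      # entropy is larger when uncertain
print(in_margin(confident), in_margin(uncertain))  # False True
```

The detector would then compare the distribution of these per-instance values between reference and test data, using a K-S test for entropies or a Chi-squared test for the margin indicators.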

predict(x, return_p_val=True, return_distance=True)[source]

Predict whether a batch of data has drifted from the reference data.

Parameters:

x (Union[ndarray, list]) – Batch of instances.
return_p_val (bool) – Whether to return the p-value of the test.
return_distance (bool) – Whether to return the corresponding test statistic (K-S for ‘entropy’, Chi2 for ‘margin’).

Return type:

Dict[Dict[str, str], Dict[str, Union[ndarray, int, float]]]

Returns:

Dictionary containing 'meta' and 'data' dictionaries:

'meta' has the model’s metadata.
'data' contains the drift prediction and optionally the p-value, threshold and test statistic.

class alibi_detect.cd.ContextMMDDrift(x_ref, c_ref, backend='tensorflow', p_val=0.05, x_ref_preprocessed=False, preprocess_at_init=True, update_ref=None, preprocess_fn=None, x_kernel=None, c_kernel=None, n_permutations=1000, prop_c_held=0.25, n_folds=5, batch_size=256, device=None, input_shape=None, data_type=None, verbose=False)[source]

Bases: DriftConfigMixin

__init__(x_ref, c_ref, backend='tensorflow', p_val=0.05, x_ref_preprocessed=False, preprocess_at_init=True, update_ref=None, preprocess_fn=None, x_kernel=None, c_kernel=None, n_permutations=1000, prop_c_held=0.25, n_folds=5, batch_size=256, device=None, input_shape=None, data_type=None, verbose=False)[source]

A context-aware drift detector based on a conditional analogue of the maximum mean discrepancy (MMD). Only detects differences between samples that cannot be attributed to differences between associated sets of contexts. p-values are computed using a conditional permutation test.

Parameters:

x_ref (Union[ndarray, list]) – Data used as reference distribution.
c_ref (ndarray) – Context for the reference distribution.
backend (str) – Backend used for the MMD implementation.
p_val (float) – p-value used for the significance of the permutation test.
x_ref_preprocessed (bool) – Whether the given reference data x_ref has been preprocessed yet. If x_ref_preprocessed=True, only the test data x will be preprocessed at prediction time. If x_ref_preprocessed=False, the reference data will also be preprocessed.
preprocess_at_init (bool) – Whether to preprocess the reference data when the detector is instantiated. Otherwise, the reference data will be preprocessed at prediction time. Only applies if x_ref_preprocessed=False.
update_ref (Optional[Dict[str, int]]) – Reference data can optionally be updated to the last N instances seen by the detector. The parameter should be passed as a dictionary {‘last’: N}.
preprocess_fn (Optional[Callable]) – Function to preprocess the data before computing the data drift metrics.
x_kernel (Callable) – Kernel defined on the input data, defaults to Gaussian RBF kernel.
c_kernel (Callable) – Kernel defined on the context data, defaults to Gaussian RBF kernel.
n_permutations (int) – Number of permutations used in the permutation test.
prop_c_held (float) – Proportion of contexts held out to condition on.
n_folds (int) – Number of cross-validation folds used when tuning the regularisation parameters.
batch_size (Optional[int]) – If not None, then compute batches of MMDs at a time (rather than all at once).
device (Union[Literal[‘cuda’, ‘gpu’, ‘cpu’], torch.device, None]) – Device type used. The default tries to use the GPU and falls back on CPU if needed. Can be specified by passing either 'cuda', 'gpu', 'cpu' or an instance of torch.device. Only relevant for ‘pytorch’ backend.
input_shape (Optional[tuple]) – Shape of input data.
data_type (Optional[str]) – Optionally specify the data type (tabular, image or time-series). Added to metadata.
verbose (bool) – Whether to print progress messages.

predict(x, c, return_p_val=True, return_distance=True, return_coupling=False)[source]

Predict whether a batch of data has drifted from the reference data, given the provided context.

Parameters:

x (Union[ndarray, list]) – Batch of instances.
c (ndarray) – Context associated with batch of instances.
return_p_val (bool) – Whether to return the p-value of the permutation test.
return_distance (bool) – Whether to return the conditional MMD test statistic between the new batch and reference data.
return_coupling (bool) – Whether to return the coupling matrices.

Return type:

Dict[Dict[str, str], Dict[str, Union[int, float]]]

Returns:

Dictionary containing 'meta' and 'data' dictionaries:

'meta' has the model’s metadata.
'data' contains the drift prediction and optionally the p-value, threshold, conditional MMD test statistic and coupling matrices.

score(x, c)[source]

Compute the MMD based conditional test statistic, and perform a conditional permutation test to obtain a p-value representing the test statistic’s extremity under the null hypothesis.

Parameters:

x (Union[ndarray, list]) – Batch of instances.
c (ndarray) – Context associated with batch of instances.

Return type:

Tuple[float, float, float, Tuple]

Returns:

p-value obtained from the conditional permutation test, the conditional MMD test statistic, the test statistic threshold above which drift is flagged, and a tuple containing the coupling matrices \((W_{ref,ref}, W_{test,test}, W_{ref,test})\).

class alibi_detect.cd.FETDrift(x_ref, p_val=0.05, x_ref_preprocessed=False, preprocess_at_init=True, update_x_ref=None, preprocess_fn=None, correction='bonferroni', alternative='greater', n_features=None, input_shape=None, data_type=None)[source]

Bases: BaseUnivariateDrift

__init__(x_ref, p_val=0.05, x_ref_preprocessed=False, preprocess_at_init=True, update_x_ref=None, preprocess_fn=None, correction='bonferroni', alternative='greater', n_features=None, input_shape=None, data_type=None)[source]

Fisher exact test (FET) data drift detector, which tests for a change in the mean of binary univariate data. For multivariate data, a separate FET test is applied to each feature, and the obtained p-values are aggregated via the Bonferroni or False Discovery Rate (FDR) corrections.

Parameters:

x_ref (Union[ndarray, list]) – Data used as reference distribution. Data must consist of either [True, False]’s, or [0, 1]’s.
p_val (float) – p-value used for significance of the FET test. If the FDR correction method is used, this corresponds to the acceptable q-value.
x_ref_preprocessed (bool) – Whether the given reference data x_ref has been preprocessed yet. If x_ref_preprocessed=True, only the test data x will be preprocessed at prediction time. If x_ref_preprocessed=False, the reference data will also be preprocessed.
preprocess_at_init (bool) – Whether to preprocess the reference data when the detector is instantiated. Otherwise, the reference data will be preprocessed at prediction time. Only applies if x_ref_preprocessed=False.
update_x_ref (Optional[Dict[str, int]]) – Reference data can optionally be updated to the last n instances seen by the detector or via reservoir sampling with size n. For the former, the parameter equals {‘last’: n} while for reservoir sampling {‘reservoir_sampling’: n} is passed.
preprocess_fn (Optional[Callable]) – Function to preprocess the data before computing the data drift metrics.
correction (str) – Correction type for multivariate data. Either ‘bonferroni’ or ‘fdr’ (False Discovery Rate).
alternative (str) – Defines the alternative hypothesis. Options are ‘greater’, ‘less’ or ‘two-sided’. These correspond to an increase, decrease, or any change in the mean of the Bernoulli data.
n_features (Optional[int]) – Number of features used in the FET test. No need to pass it if no preprocessing takes place. In case of a preprocessing step, this can also be inferred automatically but could be more expensive to compute.
input_shape (Optional[tuple]) – Shape of input data.
data_type (Optional[str]) – Optionally specify the data type (tabular, image or time-series). Added to metadata.

feature_score(x_ref, x)[source]

Performs Fisher exact test(s), computing the p-value per feature.

Parameters:

x_ref (ndarray) – Reference instances to compare distribution with. Data must consist of either [True, False]’s, or [0, 1]’s.
x (ndarray) – Batch of instances. Data must consist of either [True, False]’s, or [0, 1]’s.

Return type:

Tuple[ndarray, ndarray]

Returns:

Feature level p-values and odds ratios.
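For one binary feature, a one-sided Fisher exact test can be sketched with the hypergeometric distribution from the standard library. The function `fet_p_value` is a hypothetical helper that covers only the ‘greater’ and ‘less’ alternatives and does not compute the odds ratio; it treats ‘greater’ as "more ones in the test batch than expected under the null", matching the description of an increase in the mean.

```python
from math import comb
from typing import Sequence


def fet_p_value(x_ref: Sequence[int], x: Sequence[int],
                alternative: str = 'greater') -> float:
    """One-sided Fisher exact test p-value for 0/1 (or False/True) data."""
    n_ref, n = len(x_ref), len(x)
    k = sum(x)                    # ones in the test batch
    ones = sum(x_ref) + k         # ones in the pooled sample
    total = n_ref + n
    zeros = total - ones
    lo, hi = max(0, n - zeros), min(n, ones)  # support of the hypergeometric

    def pmf(i: int) -> float:
        # Probability of i ones in the test batch given the pooled margins.
        return comb(ones, i) * comb(zeros, n - i) / comb(total, n)

    if alternative == 'greater':
        support = range(k, hi + 1)
    elif alternative == 'less':
        support = range(lo, k + 1)
    else:
        raise ValueError("this sketch supports only 'greater' and 'less'")
    return sum(pmf(i) for i in support)


# All zeros in the reference data, all ones in the test batch.
print(fet_p_value([0, 0, 0, 0], [1, 1, 1, 1], alternative='greater'))  # 1/70
```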

class alibi_detect.cd.FETDriftOnline(x_ref, ert, window_sizes, preprocess_fn=None, x_ref_preprocessed=False, n_bootstraps=10000, t_max=None, alternative='greater', lam=0.99, n_features=None, verbose=True, input_shape=None, data_type=None)[source]

Bases: BaseUniDriftOnline, DriftConfigMixin

__init__(x_ref, ert, window_sizes, preprocess_fn=None, x_ref_preprocessed=False, n_bootstraps=10000, t_max=None, alternative='greater', lam=0.99, n_features=None, verbose=True, input_shape=None, data_type=None)[source]

Online Fisher exact test (FET) data drift detector using preconfigured thresholds, which tests for a change in the mean of binary univariate data. This detector is an adaptation of that proposed by Ross et al. [RTA12].

For multivariate data, the detector makes a correction similar to the Bonferroni correction used for the offline detector. Given \(d\) features, the detector configures thresholds by targeting the \(1-\beta\) quantile of test statistics over the simulated streams, where \(\beta = 1 - (1-(1/ERT))^{(1/d)}\). For the univariate case, this simplifies to \(\beta = 1/ERT\). At prediction time, drift is flagged if the test statistic of any feature stream exceeds the thresholds.

Note

In the multivariate case, for the ERT to be accurately targeted the feature streams must be independent.

Parameters:

x_ref (Union[ndarray, list]) – Data used as reference distribution.
ert (float) – The expected run-time (ERT) in the absence of drift. For the univariate detectors, the ERT is defined as the expected run-time after the smallest window is full, i.e. the run-time from t=min(window_sizes).
window_sizes (List[int]) – Window sizes for the sliding test-windows used to compute the test-statistic. Smaller windows focus on responding quickly to severe drift; larger windows focus on the ability to detect slight drift.
preprocess_fn (Optional[Callable]) – Function to preprocess the data before computing the data drift metrics.
x_ref_preprocessed (bool) – Whether the given reference data x_ref has been preprocessed yet. If x_ref_preprocessed=True, only the test data x will be preprocessed at prediction time. If x_ref_preprocessed=False, the reference data will also be preprocessed.
n_bootstraps (int) – The number of bootstrap simulations used to configure the thresholds. The larger this is the more accurately the desired ERT will be targeted. Should ideally be at least an order of magnitude larger than the ERT.
t_max (Optional[int]) – Length of the streams to simulate when configuring thresholds. If None, this is set to 2 * max(window_sizes) - 1.
alternative (str) – Defines the alternative hypothesis. Options are ‘greater’ or ‘less’, which correspond to an increase or decrease in the mean of the Bernoulli stream.
lam (float) – Smoothing coefficient used for exponential moving average.
n_features (Optional[int]) – Number of features used in the statistical test. No need to pass it if no preprocessing takes place. In case of a preprocessing step, this can also be inferred automatically but could be more expensive to compute.
verbose (bool) – Whether or not to print progress during configuration.
input_shape (Optional[tuple]) – Shape of input data.
data_type (Optional[str]) – Optionally specify the data type (tabular, image or time-series). Added to metadata.

online_state_keys: Tuple[str, ...] = ('t', 'test_stats', 'drift_preds', 'xs')

score(x_t)[source]

Compute the test-statistic (FET) between the reference window(s) and test window. If a given test-window is not yet full then a test-statistic of np.nan is returned for that window.

Parameters: x_t (Union[ndarray, Any]) – A single instance.
Return type: ndarray
Returns: Estimated FET test statistics (1-p_val) between reference window and test windows.
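The behaviour of several sliding test-windows, each reporting NaN until it has filled, can be sketched as follows. This is a minimal illustration using only the standard library: the helper name windowed_means is hypothetical, and the window mean is a stand-in for the actual test statistic, not the library's computation.

```python
import math
from collections import deque


def windowed_means(stream, window_sizes):
    """Yield, per time step, one statistic per window size (NaN until that window is full)."""
    buffer = deque(maxlen=max(window_sizes))  # keeps only the most recent instances
    for x_t in stream:
        buffer.append(x_t)
        recent = list(buffer)
        stats = []
        for w in window_sizes:
            if len(recent) < w:
                stats.append(math.nan)  # this test-window is not yet full
            else:
                stats.append(sum(recent[-w:]) / w)  # placeholder statistic: window mean
        yield stats


# A short binary stream observed with a small and a large test-window.
results = list(windowed_means([0, 1, 1, 0, 1], window_sizes=[2, 4]))
for t, stats in enumerate(results, start=1):
    print(t, stats)
```

The small window produces a value from the second instance onwards, while the large window stays at NaN until four instances have arrived.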

class alibi_detect.cd.KSDrift(x_ref, p_val=0.05, x_ref_preprocessed=False, preprocess_at_init=True, update_x_ref=None, preprocess_fn=None, correction='bonferroni', alternative='two-sided', n_features=None, input_shape=None, data_type=None)[source]

Bases: BaseUnivariateDrift

__init__(x_ref, p_val=0.05, x_ref_preprocessed=False, preprocess_at_init=True, update_x_ref=None, preprocess_fn=None, correction='bonferroni', alternative='two-sided', n_features=None, input_shape=None, data_type=None)[source]

Kolmogorov-Smirnov (K-S) data drift detector with Bonferroni or False Discovery Rate (FDR) correction for multivariate data.

Parameters:

x_ref (Union[ndarray, list]) – Data used as reference distribution.
p_val (float) – p-value used for significance of the K-S test for each feature. If the FDR correction method is used, this corresponds to the acceptable q-value.
x_ref_preprocessed (bool) – Whether the given reference data x_ref has been preprocessed yet. If x_ref_preprocessed=True, only the test data x will be preprocessed at prediction time. If x_ref_preprocessed=False, the reference data will also be preprocessed.
preprocess_at_init (bool) – Whether to preprocess the reference data when the detector is instantiated. Otherwise, the reference data will be preprocessed at prediction time. Only applies if x_ref_preprocessed=False.
update_x_ref (Optional[Dict[str, int]]) – Reference data can optionally be updated to the last n instances seen by the detector or via reservoir sampling with size n. For the former, the parameter equals {‘last’: n} while for reservoir sampling {‘reservoir_sampling’: n} is passed.
preprocess_fn (Optional[Callable]) – Function to preprocess the data before computing the data drift metrics. Typically a dimensionality reduction technique.
correction (str) – Correction type for multivariate data. Either ‘bonferroni’ or ‘fdr’ (False Discovery Rate).
alternative (str) – Defines the alternative hypothesis. Options are ‘two-sided’, ‘less’ or ‘greater’.
n_features (Optional[int]) – Number of features used in the K-S test. No need to pass it if no preprocessing takes place. In case of a preprocessing step, this can also be inferred automatically but could be more expensive to compute.
input_shape (Optional[tuple]) – Shape of input data.
data_type (Optional[str]) – Optionally specify the data type (tabular, image or time-series). Added to metadata.
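The two correction options can be sketched in plain Python as follows. This is an illustrative outline of Bonferroni and Benjamini-Hochberg (FDR) thresholding applied to per-feature p-values, not the library's implementation; the function names are hypothetical.

```python
def bonferroni_drift(p_vals, p_val=0.05):
    """Flag drift if any feature's p-value falls below p_val / n_features."""
    threshold = p_val / len(p_vals)
    return min(p_vals) < threshold, threshold


def fdr_drift(p_vals, q_val=0.05):
    """Benjamini-Hochberg: find the largest rank i with p_(i) <= q_val * i / n."""
    n = len(p_vals)
    threshold = None
    for i, p in enumerate(sorted(p_vals), start=1):
        if p <= q_val * i / n:
            threshold = q_val * i / n  # keep the largest rank that passes
    return threshold is not None, threshold


# Four features, none of them individually extreme.
p_vals = [0.02, 0.02, 0.03, 0.6]
print(bonferroni_drift(p_vals))
print(fdr_drift(p_vals))
```

With these p-values the Bonferroni rule does not flag drift, because no single feature falls below 0.05 / 4, whereas the FDR rule does, because several moderately small p-values occur together.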

feature_score(x_ref, x)[source]

Compute K-S scores and statistics per feature.

Parameters:

x_ref (ndarray) – Reference instances to compare distribution with.
x (ndarray) – Batch of instances.

Return type:

Tuple[ndarray, ndarray]

Returns:

Feature level p-values and K-S statistics.
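As a rough illustration of a per-feature K-S statistic, the sketch below computes the largest gap between two empirical CDFs, column by column. It covers the statistic only (p-values are omitted), uses hypothetical helper names, and is not the library's code.

```python
def ks_statistic(ref, test):
    """Two-sample K-S statistic: the largest gap between the two empirical CDFs."""
    def ecdf(sample, v):
        return sum(1 for s in sample if s <= v) / len(sample)

    return max(abs(ecdf(ref, v) - ecdf(test, v)) for v in set(ref) | set(test))


def feature_ks(x_ref, x):
    """Apply the univariate statistic column by column (rows are instances)."""
    n_features = len(x_ref[0])
    return [
        ks_statistic([row[j] for row in x_ref], [row[j] for row in x])
        for j in range(n_features)
    ]


# Feature 0 is only slightly shifted; feature 1 does not overlap at all.
x_ref = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]
x = [[0.5, 11.0], [1.5, 12.0], [2.5, 13.0], [3.5, 14.0]]
stats = feature_ks(x_ref, x)
print(stats)
```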

class alibi_detect.cd.LSDDDrift(x_ref, backend='tensorflow', p_val=0.05, x_ref_preprocessed=False, preprocess_at_init=True, update_x_ref=None, preprocess_fn=None, sigma=None, n_permutations=100, n_kernel_centers=None, lambda_rd_max=0.2, device=None, input_shape=None, data_type=None)[source]

Bases: DriftConfigMixin

__init__(x_ref, backend='tensorflow', p_val=0.05, x_ref_preprocessed=False, preprocess_at_init=True, update_x_ref=None, preprocess_fn=None, sigma=None, n_permutations=100, n_kernel_centers=None, lambda_rd_max=0.2, device=None, input_shape=None, data_type=None)[source]

Least-squares density difference (LSDD) data drift detector using a permutation test.

Parameters:

x_ref (Union[ndarray, list]) – Data used as reference distribution.
backend (str) – Backend used for the LSDD implementation.
p_val (float) – p-value used for the significance of the permutation test.
x_ref_preprocessed (bool) – Whether the given reference data x_ref has been preprocessed yet. If x_ref_preprocessed=True, only the test data x will be preprocessed at prediction time. If x_ref_preprocessed=False, the reference data will also be preprocessed.
preprocess_at_init (bool) – Whether to preprocess the reference data when the detector is instantiated. Otherwise, the reference data will be preprocessed at prediction time. Only applies if x_ref_preprocessed=False.
update_x_ref (Optional[Dict[str, int]]) – Reference data can optionally be updated to the last n instances seen by the detector or via reservoir sampling with size n. For the former, the parameter equals {‘last’: n} while for reservoir sampling {‘reservoir_sampling’: n} is passed.
preprocess_fn (Optional[Callable]) – Function to preprocess the data before computing the data drift metrics.
sigma (Optional[ndarray]) – Optionally set the bandwidth of the Gaussian kernel used in estimating the LSDD. Can also pass multiple bandwidth values as an array. The kernel evaluation is then averaged over those bandwidths. If sigma is not specified, the ‘median heuristic’ is adopted whereby sigma is set as the median pairwise distance between reference samples.
n_permutations (int) – Number of permutations used in the permutation test.
n_kernel_centers (Optional[int]) – The number of reference samples to use as centers in the Gaussian kernel model used to estimate LSDD. Defaults to 1/20th of the reference data.
lambda_rd_max (float) – The maximum relative difference between two estimates of LSDD that the regularization parameter lambda is allowed to cause. Defaults to 0.2 as in the paper.
device (Union[Literal[‘cuda’, ‘gpu’, ‘cpu’], torch.device, None]) – Device type used. The default tries to use the GPU and falls back on CPU if needed. Can be specified by passing either 'cuda', 'gpu', 'cpu' or an instance of torch.device. Only relevant for ‘pytorch’ backend.
input_shape (Optional[tuple]) – Shape of input data.
data_type (Optional[str]) – Optionally specify the data type (tabular, image or time-series). Added to metadata.
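The median heuristic mentioned for sigma can be sketched as below. This assumes Euclidean distance over all pairs of reference samples; details such as subsampling or the exact distance used by the library are not stated here, and the function name is hypothetical.

```python
import math
from itertools import combinations
from statistics import median


def median_heuristic(x_ref):
    """Bandwidth = median Euclidean distance over all pairs of reference samples."""
    distances = [math.dist(a, b) for a, b in combinations(x_ref, 2)]
    return median(distances)


# Three collinear points: pairwise distances are 5, 10 and 5.
x_ref = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
sigma = median_heuristic(x_ref)
print(sigma)
```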

get_config()[source]

Get the detector’s configuration dictionary.

Return type: dict
Returns: The detector’s configuration dictionary.

predict(x, return_p_val=True, return_distance=True)[source]

Predict whether a batch of data has drifted from the reference data.

Parameters:

x (Union[ndarray, list]) – Batch of instances.
return_p_val (bool) – Whether to return the p-value of the permutation test.
return_distance (bool) – Whether to return the LSDD metric between the new batch and reference data.

Return type:

Dict[Dict[str, str], Dict[str, Union[int, float]]]

Returns:

Dictionary containing 'meta' and 'data' dictionaries:

'meta' has the model’s metadata.
'data' contains the drift prediction and optionally the p-value, threshold and LSDD metric.

score(x)[source]

Compute the p-value resulting from a permutation test using the least-squares density difference as a distance measure between the reference data and the data to be tested.

Parameters: x (Union[ndarray, list]) – Batch of instances.
Return type: Tuple[float, float, float]
Returns: p-value obtained from the permutation test, the LSDD between the reference and test set, and the LSDD threshold above which drift is flagged.
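A permutation test of this kind can be sketched generically as follows. The distance argument is any two-sample distance; here a simple difference in means (the hypothetical mean_gap) stands in for the LSDD, so this is an outline of the testing procedure under that assumption rather than the detector's implementation.

```python
import random


def permutation_p_value(x_ref, x, distance, n_permutations=100, seed=0):
    """Fraction of random re-splits whose distance is at least the observed one."""
    rng = random.Random(seed)
    observed = distance(x_ref, x)
    pooled = list(x_ref) + list(x)
    n_ref = len(x_ref)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign instances to "reference" and "test" at random
        if distance(pooled[:n_ref], pooled[n_ref:]) >= observed:
            exceed += 1
    return exceed / n_permutations, observed


def mean_gap(a, b):
    """Stand-in distance: absolute difference between sample means."""
    return abs(sum(a) / len(a) - sum(b) / len(b))


same = permutation_p_value([1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6], mean_gap)
shifted = permutation_p_value(list(range(20)), list(range(100, 120)), mean_gap)
print(same, shifted)
```

Identical samples give a p-value of 1, while the clearly shifted sample gives a p-value below any usual significance level.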

class alibi_detect.cd.LSDDDriftOnline(x_ref, ert, window_size, backend='tensorflow', preprocess_fn=None, x_ref_preprocessed=False, sigma=None, n_bootstraps=1000, n_kernel_centers=None, lambda_rd_max=0.2, device=None, verbose=True, input_shape=None, data_type=None)[source]

Bases: DriftConfigMixin

__init__(x_ref, ert, window_size, backend='tensorflow', preprocess_fn=None, x_ref_preprocessed=False, sigma=None, n_bootstraps=1000, n_kernel_centers=None, lambda_rd_max=0.2, device=None, verbose=True, input_shape=None, data_type=None)[source]

Online least squares density difference (LSDD) data drift detector using preconfigured thresholds. Motivated by Bu et al. (2017) (https://ieeexplore.ieee.org/abstract/document/7890493), with modifications so that a desired ERT can be accurately targeted.

Parameters:

x_ref (Union[ndarray, list]) – Data used as reference distribution.
ert (float) – The expected run-time (ERT) in the absence of drift. For the multivariate detectors, the ERT is defined as the expected run-time from t=0.
window_size (int) – The size of the sliding test-window used to compute the test-statistic. Smaller windows focus on responding quickly to severe drift, larger windows focus on ability to detect slight drift.
backend (str) – Backend used for the LSDD implementation and configuration.
preprocess_fn (Optional[Callable]) – Function to preprocess the data before computing the data drift metrics.
x_ref_preprocessed (bool) – Whether the given reference data x_ref has been preprocessed yet. If x_ref_preprocessed=True, only the test data x will be preprocessed at prediction time. If x_ref_preprocessed=False, the reference data will also be preprocessed.
sigma (Optional[ndarray]) – Optionally set the bandwidth of the Gaussian kernel used in estimating the LSDD. Can also pass multiple bandwidth values as an array. The kernel evaluation is then averaged over those bandwidths. If sigma is not specified, the ‘median heuristic’ is adopted whereby sigma is set as the median pairwise distance between reference samples.
n_bootstraps (int) – The number of bootstrap simulations used to configure the thresholds. The larger this is the more accurately the desired ERT will be targeted. Should ideally be at least an order of magnitude larger than the ERT.
n_kernel_centers (Optional[int]) – The number of reference samples to use as centers in the Gaussian kernel model used to estimate LSDD. Defaults to 2*window_size.
lambda_rd_max (float) – The maximum relative difference between two estimates of LSDD that the regularization parameter lambda is allowed to cause. Defaults to 0.2 as in the paper.
device (Union[Literal[‘cuda’, ‘gpu’, ‘cpu’], torch.device, None]) – Device type used. The default tries to use the GPU and falls back on CPU if needed. Can be specified by passing either 'cuda', 'gpu', 'cpu' or an instance of torch.device. Only relevant for ‘pytorch’ backend.
verbose (bool) – Whether or not to print progress during configuration.
input_shape (Optional[tuple]) – Shape of input data.
data_type (Optional[str]) – Optionally specify the data type (tabular, image or time-series). Added to metadata.
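The relationship between ert and n_bootstraps can be illustrated with a simplified sketch. It assumes that targeting an ERT amounts to allowing a false detection at roughly one step in every ert steps, so the threshold sits near the (1 - 1/ert) quantile of statistics simulated without drift. The actual configuration procedure is more involved; the function name and the Gaussian stand-in statistics are hypothetical.

```python
import random


def configure_threshold(simulated_stats, ert):
    """Pick the empirical (1 - 1/ert) quantile of statistics simulated without drift."""
    ordered = sorted(simulated_stats)
    n_above = int(len(ordered) / ert)  # simulated values allowed to exceed the threshold
    if n_above < 1 or n_above >= len(ordered):
        raise ValueError("need 1 < ert <= n_bootstraps to resolve this quantile")
    return ordered[-n_above - 1], n_above


rng = random.Random(0)
n_bootstraps, ert = 1000, 100
stats = [rng.gauss(0.0, 1.0) for _ in range(n_bootstraps)]  # stand-in statistics
threshold, n_above = configure_threshold(stats, ert)
print(threshold, n_above)
```

With 1000 simulations and an ERT of 100, only 10 simulated values lie above the threshold, which suggests why the estimate becomes unreliable unless n_bootstraps is considerably larger than the ERT.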

get_config()[source]

Get the detector’s configuration dictionary.

Return type: dict
Returns: The detector’s configuration dictionary.

load_state(filepath)[source]

Load the detector’s state from disk, in order to restart from a checkpoint previously generated with save_state().

Parameters: filepath (Union[str, PathLike]) – The directory to load state from.

predict(x_t, return_test_stat=True)[source]

Predict whether the most recent window of data has drifted from the reference data.

Parameters:

x_t (Union[ndarray, Any]) – A single instance to be added to the test-window.
return_test_stat (bool) – Whether to return the test statistic (LSDD) and threshold.

Return type:

Dict[Dict[str, str], Dict[str, Union[int, float]]]

Returns:

Dictionary containing 'meta' and 'data' dictionaries:

'meta' has the model’s metadata.
'data' contains the drift prediction and optionally the test-statistic and threshold.

reset_state()[source]: Resets the detector to its initial state (t=0). This does not include reconfiguring thresholds.

save_state(filepath)[source]

Save a detector’s state to disk in order to generate a checkpoint.

Parameters: filepath (Union[str, PathLike]) – The directory to save state to.

score(x_t)[source]

Compute the test-statistic (LSDD) between the reference window and test window.

Parameters: x_t (Union[ndarray, Any]) – A single instance to be added to the test-window.
Return type: float
Returns: LSDD estimate between reference window and test window.

property t

property test_stats

property thresholds

class alibi_detect.cd.LearnedKernelDrift(x_ref, kernel, backend='tensorflow', p_val=0.05, x_ref_preprocessed=False, preprocess_at_init=True, update_x_ref=None, preprocess_fn=None, n_permutations=100, batch_size_permutations=1000000, var_reg=1e-05, reg_loss_fn=<function LearnedKernelDrift.<lambda>>, train_size=0.75, retrain_from_scratch=True, optimizer=None, learning_rate=0.001, batch_size=32, batch_size_predict=32, preprocess_batch_fn=None, epochs=3, num_workers=0, verbose=0, train_kwargs=None, device=None, dataset=None, dataloader=None, input_shape=None, data_type=None)[source]

Bases: DriftConfigMixin

__init__(x_ref, kernel, backend='tensorflow', p_val=0.05, x_ref_preprocessed=False, preprocess_at_init=True, update_x_ref=None, preprocess_fn=None, n_permutations=100, batch_size_permutations=1000000, var_reg=1e-05, reg_loss_fn=<function LearnedKernelDrift.<lambda>>, train_size=0.75, retrain_from_scratch=True, optimizer=None, learning_rate=0.001, batch_size=32, batch_size_predict=32, preprocess_batch_fn=None, epochs=3, num_workers=0, verbose=0, train_kwargs=None, device=None, dataset=None, dataloader=None, input_shape=None, data_type=None)[source]

Maximum Mean Discrepancy (MMD) data drift detector where the kernel is trained to maximise an estimate of the test power. The kernel is trained on a split of the reference and test instances and then the MMD is evaluated on held out instances and a permutation test is performed.

For details see Liu et al (2020): Learning Deep Kernels for Non-Parametric Two-Sample Tests (https://arxiv.org/abs/2002.09116)

Parameters:

x_ref (Union[ndarray, list]) – Data used as reference distribution.
kernel (Callable) – Trainable PyTorch or TensorFlow module that returns a similarity between two instances.
backend (str) – Backend used by the kernel and training loop.
p_val (float) – p-value used for the significance of the test.
x_ref_preprocessed (bool) – Whether the given reference data x_ref has been preprocessed yet. If x_ref_preprocessed=True, only the test data x will be preprocessed at prediction time. If x_ref_preprocessed=False, the reference data will also be preprocessed.
preprocess_at_init (bool) – Whether to preprocess the reference data when the detector is instantiated. Otherwise, the reference data will be preprocessed at prediction time. Only applies if x_ref_preprocessed=False.
update_x_ref (Optional[Dict[str, int]]) – Reference data can optionally be updated to the last n instances seen by the detector or via reservoir sampling with size n. For the former, the parameter equals {‘last’: n} while for reservoir sampling {‘reservoir_sampling’: n} is passed.
preprocess_fn (Optional[Callable]) – Function to preprocess the data before applying the kernel.
n_permutations (int) – The number of permutations to use in the permutation test once the MMD has been computed.
batch_size_permutations (int) – KeOps computes the n_permutations of the MMD^2 statistics in chunks of batch_size_permutations. Only relevant for ‘keops’ backend.
var_reg (float) – Constant added to the estimated variance of the MMD for stability.
reg_loss_fn (Callable) – The regularisation term reg_loss_fn(kernel) is added to the loss function being optimized.
train_size (Optional[float]) – Optional fraction (float between 0 and 1) of the dataset used to train the kernel. The drift is detected on 1 - train_size.
retrain_from_scratch (bool) – Whether the kernel should be retrained from scratch for each set of test data or whether it should instead continue training from where it left off on the previous set.
optimizer (Optional[Callable]) – Optimizer used during training of the kernel.
learning_rate (float) – Learning rate used by optimizer.
batch_size (int) – Batch size used during training of the kernel.
batch_size_predict (int) – Batch size used for the trained drift detector predictions.
preprocess_batch_fn (Optional[Callable]) – Optional batch preprocessing function. For example to convert a list of objects to a batch which can be processed by the kernel.
epochs (int) – Number of training epochs for the kernel. One epoch corresponds to a pass over the smaller of the reference and test sets.
num_workers (int) – Number of workers for the dataloader. The default (num_workers=0) means multi-process data loading is disabled. Setting num_workers>0 may be unreliable on Windows.
verbose (int) – Verbosity level during the training of the kernel. 0 is silent, 1 a progress bar.
train_kwargs (Optional[dict]) – Optional additional kwargs when training the kernel.
device (Union[Literal[‘cuda’, ‘gpu’, ‘cpu’], torch.device, None]) – Device type used. The default tries to use the GPU and falls back on CPU if needed. Can be specified by passing either 'cuda', 'gpu', 'cpu' or an instance of torch.device. Relevant for ‘pytorch’ and ‘keops’ backends.
dataset (Optional[Callable]) – Dataset object used during training.
dataloader (Optional[Callable]) – Dataloader object used during training. Relevant for ‘pytorch’ and ‘keops’ backends.
input_shape (Optional[tuple]) – Shape of input data.
data_type (Optional[str]) – Optionally specify the data type (tabular, image or time-series). Added to metadata.

predict(x, return_p_val=True, return_distance=True, return_kernel=True)[source]

Predict whether a batch of data has drifted from the reference data.

Parameters:

x (Union[ndarray, list]) – Batch of instances.
return_p_val (bool) – Whether to return the p-value of the permutation test.
return_distance (bool) – Whether to return the MMD metric between the new batch and reference data.
return_kernel (bool) – Whether to return the updated kernel trained to discriminate reference and test instances.

Return type:

Dict[Dict[str, str], Dict[str, Union[int, float, Callable]]]

Returns:

Dictionary containing 'meta' and 'data' dictionaries:

'meta' has the detector’s metadata.
'data' contains the drift prediction and optionally the p-value, threshold, MMD metric and trained kernel.

class alibi_detect.cd.MMDDrift(x_ref, backend='tensorflow', p_val=0.05, x_ref_preprocessed=False, preprocess_at_init=True, update_x_ref=None, preprocess_fn=None, kernel=None, sigma=None, configure_kernel_from_x_ref=True, n_permutations=100, batch_size_permutations=1000000, device=None, input_shape=None, data_type=None)[source]

Bases: DriftConfigMixin

__init__(x_ref, backend='tensorflow', p_val=0.05, x_ref_preprocessed=False, preprocess_at_init=True, update_x_ref=None, preprocess_fn=None, kernel=None, sigma=None, configure_kernel_from_x_ref=True, n_permutations=100, batch_size_permutations=1000000, device=None, input_shape=None, data_type=None)[source]

Maximum Mean Discrepancy (MMD) data drift detector using a permutation test.

Parameters:

x_ref (Union[ndarray, list]) – Data used as reference distribution.
backend (str) – Backend used for the MMD implementation.
p_val (float) – p-value used for the significance of the permutation test.
x_ref_preprocessed (bool) – Whether the given reference data x_ref has been preprocessed yet. If x_ref_preprocessed=True, only the test data x will be preprocessed at prediction time. If x_ref_preprocessed=False, the reference data will also be preprocessed.
preprocess_at_init (bool) – Whether to preprocess the reference data when the detector is instantiated. Otherwise, the reference data will be preprocessed at prediction time. Only applies if x_ref_preprocessed=False.
update_x_ref (Optional[Dict[str, int]]) – Reference data can optionally be updated to the last n instances seen by the detector or via reservoir sampling with size n. For the former, the parameter equals {‘last’: n} while for reservoir sampling {‘reservoir_sampling’: n} is passed.
preprocess_fn (Optional[Callable]) – Function to preprocess the data before computing the data drift metrics.
kernel (Callable) – Kernel used for the MMD computation, defaults to Gaussian RBF kernel.
sigma (Optional[ndarray]) – Optionally set the GaussianRBF kernel bandwidth. Can also pass multiple bandwidth values as an array. The kernel evaluation is then averaged over those bandwidths.
configure_kernel_from_x_ref (bool) – Whether to already configure the kernel bandwidth from the reference data.
n_permutations (int) – Number of permutations used in the permutation test.
batch_size_permutations (int) – KeOps computes the n_permutations of the MMD^2 statistics in chunks of batch_size_permutations. Only relevant for ‘keops’ backend.
device (Union[Literal[‘cuda’, ‘gpu’, ‘cpu’], torch.device, None]) – Device type used. The default tries to use the GPU and falls back on CPU if needed. Can be specified by passing either 'cuda', 'gpu', 'cpu' or an instance of torch.device. Only relevant for ‘pytorch’ backend.
input_shape (Optional[tuple]) – Shape of input data.
data_type (Optional[str]) – Optionally specify the data type (tabular, image or time-series). Added to metadata.
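The two update_x_ref strategies can be sketched as follows. This is a standard-library illustration of keeping the last n instances versus classic reservoir sampling; the function names and the way the count of seen instances (n_seen) is tracked are assumptions, not the library's internals.

```python
import random


def update_last(x_ref, x, n):
    """{'last': n}: keep the n most recent instances."""
    return (list(x_ref) + list(x))[-n:]


def update_reservoir(x_ref, x, n, n_seen, rng):
    """{'reservoir_sampling': n}: each instance seen so far is kept with equal probability."""
    reservoir = list(x_ref)
    for item in x:
        n_seen += 1
        if len(reservoir) < n:
            reservoir.append(item)  # reservoir not yet full
        else:
            j = rng.randrange(n_seen)  # uniform over all instances seen so far
            if j < n:
                reservoir[j] = item  # replace a random slot
    return reservoir, n_seen


rng = random.Random(0)
x_ref = [0, 1, 2, 3]
last = update_last(x_ref, [4, 5], n=4)
reservoir, n_seen = update_reservoir(x_ref, [4, 5], n=4, n_seen=len(x_ref), rng=rng)
print(last, reservoir, n_seen)
```

The 'last' strategy tracks recent data closely, whereas reservoir sampling keeps the reference set representative of everything seen so far.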

predict(x, return_p_val=True, return_distance=True)[source]

Predict whether a batch of data has drifted from the reference data.

Parameters:

x (Union[ndarray, list]) – Batch of instances.
return_p_val (bool) – Whether to return the p-value of the permutation test.
return_distance (bool) – Whether to return the MMD metric between the new batch and reference data.

Return type:

Dict[Dict[str, str], Dict[str, Union[int, float]]]

Returns:

Dictionary containing 'meta' and 'data' dictionaries:

'meta' has the model’s metadata.
'data' contains the drift prediction and optionally the p-value, threshold and MMD metric.

score(x)[source]

Compute the p-value resulting from a permutation test using the maximum mean discrepancy as a distance measure between the reference data and the data to be tested.

Parameters: x (Union[ndarray, list]) – Batch of instances.
Return type: Tuple[float, float, float]
Returns: p-value obtained from the permutation test, the MMD^2 between the reference and test set, and the MMD^2 threshold above which drift is flagged.
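For intuition about the MMD^2 quantity, the sketch below computes an unbiased estimate with a Gaussian RBF kernel in plain Python. The kernel parametrisation exp(-||a - b||^2 / (2 * sigma^2)) and the function names are assumptions made for illustration; the library's backends may differ in detail.

```python
import math


def gaussian_rbf(a, b, sigma):
    """k(a, b) = exp(-||a - b||^2 / (2 * sigma^2)); this parametrisation is an assumption."""
    return math.exp(-math.dist(a, b) ** 2 / (2.0 * sigma ** 2))


def mmd2_unbiased(x_ref, x, sigma=1.0):
    """Unbiased MMD^2 estimate: within-sample similarity minus cross-sample similarity."""
    n, m = len(x_ref), len(x)
    k_xx = sum(gaussian_rbf(a, b, sigma) for i, a in enumerate(x_ref)
               for j, b in enumerate(x_ref) if i != j) / (n * (n - 1))
    k_yy = sum(gaussian_rbf(a, b, sigma) for i, a in enumerate(x)
               for j, b in enumerate(x) if i != j) / (m * (m - 1))
    k_xy = sum(gaussian_rbf(a, b, sigma) for a in x_ref for b in x) / (n * m)
    return k_xx + k_yy - 2.0 * k_xy


x_ref = [(0.0,), (0.1,), (0.2,), (0.3,)]
near = mmd2_unbiased(x_ref, [(0.05,), (0.15,), (0.25,), (0.35,)])
far = mmd2_unbiased(x_ref, [(5.0,), (5.1,), (5.2,), (5.3,)])
print(near, far)
```

The estimate stays close to zero for overlapping samples (it can be slightly negative because it is unbiased) and grows when the samples are far apart.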

class alibi_detect.cd.MMDDriftOnline(x_ref, ert, window_size, backend='tensorflow', preprocess_fn=None, x_ref_preprocessed=False, kernel=None, sigma=None, n_bootstraps=1000, device=None, verbose=True, input_shape=None, data_type=None)[source]

Bases: DriftConfigMixin

__init__(x_ref, ert, window_size, backend='tensorflow', preprocess_fn=None, x_ref_preprocessed=False, kernel=None, sigma=None, n_bootstraps=1000, device=None, verbose=True, input_shape=None, data_type=None)[source]

Online Maximum Mean Discrepancy (MMD) data drift detector using preconfigured thresholds.

Parameters:

x_ref (Union[ndarray, list]) – Data used as reference distribution.
ert (float) – The expected run-time (ERT) in the absence of drift. For the multivariate detectors, the ERT is defined as the expected run-time from t=0.
window_size (int) – The size of the sliding test-window used to compute the test-statistic. Smaller windows focus on responding quickly to severe drift, larger windows focus on ability to detect slight drift.
backend (str) – Backend used for the MMD implementation and configuration.
preprocess_fn (Optional[Callable]) – Function to preprocess the data before computing the data drift metrics.
x_ref_preprocessed (bool) – Whether the given reference data x_ref has been preprocessed yet. If x_ref_preprocessed=True, only the test data x will be preprocessed at prediction time. If x_ref_preprocessed=False, the reference data will also be preprocessed.
kernel (Optional[Callable]) – Kernel used for the MMD computation, defaults to Gaussian RBF kernel.
sigma (Optional[ndarray]) – Optionally set the GaussianRBF kernel bandwidth. Can also pass multiple bandwidth values as an array. The kernel evaluation is then averaged over those bandwidths. If sigma is not specified, the ‘median heuristic’ is adopted whereby sigma is set as the median pairwise distance between reference samples.
n_bootstraps (int) – The number of bootstrap simulations used to configure the thresholds. The larger this is the more accurately the desired ERT will be targeted. Should ideally be at least an order of magnitude larger than the ERT.
device (Union[Literal[‘cuda’, ‘gpu’, ‘cpu’], torch.device, None]) – Device type used. The default tries to use the GPU and falls back on CPU if needed. Can be specified by passing either 'cuda', 'gpu', 'cpu' or an instance of torch.device. Only relevant for ‘pytorch’ backend.
verbose (bool) – Whether or not to print progress during configuration.
input_shape (Optional[tuple]) – Shape of input data.
data_type (Optional[str]) – Optionally specify the data type (tabular, image or time-series). Added to metadata.

get_config()[source]

Get the detector’s configuration dictionary.

Return type: dict
Returns: The detector’s configuration dictionary.

load_state(filepath)[source]

Load the detector’s state from disk, in order to restart from a checkpoint previously generated with save_state().

Parameters: filepath (Union[str, PathLike]) – The directory to load state from.

predict(x_t, return_test_stat=True)[source]

Predict whether the most recent window of data has drifted from the reference data.

Parameters:

x_t (Union[ndarray, Any]) – A single instance to be added to the test-window.
return_test_stat (bool) – Whether to return the test statistic (squared MMD) and threshold.

Return type:

Dict[Dict[str, str], Dict[str, Union[int, float]]]

Returns:

Dictionary containing 'meta' and 'data' dictionaries:

'meta' has the model’s metadata.
'data' contains the drift prediction and optionally the test-statistic and threshold.

reset_state()[source]: Resets the detector to its initial state (t=0). This does not include reconfiguring thresholds.

save_state(filepath)[source]

Save a detector’s state to disk in order to generate a checkpoint.

Parameters: filepath (Union[str, PathLike]) – The directory to save state to.

score(x_t)[source]

Compute the test-statistic (squared MMD) between the reference window and test window.

Parameters: x_t (Union[ndarray, Any]) – A single instance to be added to the test-window.
Return type: float
Returns: Squared MMD estimate between reference window and test window.

property t

property test_stats

property thresholds

class alibi_detect.cd.RegressorUncertaintyDrift(x_ref, model, p_val=0.05, x_ref_preprocessed=False, backend=None, update_x_ref=None, uncertainty_type='mc_dropout', n_evals=25, batch_size=32, preprocess_batch_fn=None, device=None, tokenizer=None, max_len=None, input_shape=None, data_type=None)[source]

Bases: DriftConfigMixin

__init__(x_ref, model, p_val=0.05, x_ref_preprocessed=False, backend=None, update_x_ref=None, uncertainty_type='mc_dropout', n_evals=25, batch_size=32, preprocess_batch_fn=None, device=None, tokenizer=None, max_len=None, input_shape=None, data_type=None)[source]

Test for a change in the number of instances falling into regions on which the model is uncertain. Performs a K-S test on uncertainties estimated from a predictive ensemble given either explicitly or implicitly as a model with dropout layers.

Parameters:

x_ref (Union[ndarray, list]) – Data used as reference distribution. Should be disjoint from the data the model was trained on for accurate p-values.
model (Callable) – Regression model outputting predictions. The expected output shape depends on uncertainty_type.
backend (Optional[str]) – Backend to use if model requires batch prediction. Options are ‘tensorflow’ or ‘pytorch’.
p_val (float) – p-value used for the significance of the test.
x_ref_preprocessed (bool) – Whether the given reference data x_ref has been preprocessed yet. If x_ref_preprocessed=True, only the test data x will be preprocessed at prediction time. If x_ref_preprocessed=False, the reference data will also be preprocessed.
update_x_ref (Optional[Dict[str, int]]) – Reference data can optionally be updated to the last n instances seen by the detector or via reservoir sampling with size n. For the former, the parameter equals {‘last’: n} while for reservoir sampling {‘reservoir_sampling’: n} is passed.
uncertainty_type (str) – Method for determining the model’s uncertainty for a given instance. Options are ‘mc_dropout’ or ‘ensemble’. The former should output a scalar per instance. The latter should output a vector of predictions per instance.
n_evals (int) – The number of times to evaluate the model under different dropout configurations. Only relevant when using the ‘mc_dropout’ uncertainty type.
batch_size (int) – Batch size used to evaluate model. Only relevant when backend has been specified for batch prediction.
preprocess_batch_fn (Optional[Callable]) – Optional batch preprocessing function. For example to convert a list of objects to a batch which can be processed by the model.
device (Union[Literal[‘cuda’, ‘gpu’, ‘cpu’], torch.device, None]) – Device type used. The default tries to use the GPU and falls back on CPU if needed. Can be specified by passing either 'cuda', 'gpu', 'cpu' or an instance of torch.device. Only relevant for ‘pytorch’ backend.
tokenizer (Optional[Callable]) – Optional tokenizer for NLP models.
max_len (Optional[int]) – Optional max token length for NLP models.
input_shape (Optional[tuple]) – Shape of input data.
data_type (Optional[str]) – Optionally specify the data type (tabular, image or time-series). Added to metadata.

predict(x, return_p_val=True, return_distance=True)[source]

Predict whether a batch of data has drifted from the reference data.

Parameters:

x (Union[ndarray, list]) – Batch of instances.
return_p_val (bool) – Whether to return the p-value of the test.
return_distance (bool) – Whether to return the K-S test statistic.

Return type:

Dict[Dict[str, str], Dict[str, Union[ndarray, int, float]]]

Returns:

Dictionary containing 'meta' and 'data' dictionaries:

'meta' has the model’s metadata.
'data' contains the drift prediction and optionally the p-value, threshold and test statistic.
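How an explicit ensemble might be turned into one uncertainty score per instance can be sketched as below. Using the variance across ensemble predictions as the uncertainty measure is an assumption made for illustration, and the function name is hypothetical; the resulting reference and test scores are what a K-S test would then compare.

```python
from statistics import pvariance


def ensemble_uncertainty(preds_per_instance):
    """One uncertainty score per instance: the variance across ensemble predictions."""
    return [pvariance(preds) for preds in preds_per_instance]


# Each row holds the predictions of a 3-member ensemble for one instance.
ref_preds = [[1.0, 1.0, 1.0], [2.0, 2.1, 1.9]]   # members agree on reference data
test_preds = [[1.0, 3.0, 5.0], [0.0, 4.0, 8.0]]  # members disagree on test data
ref_unc = ensemble_uncertainty(ref_preds)
test_unc = ensemble_uncertainty(test_preds)
print(ref_unc, test_unc)
```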

class alibi_detect.cd.SpotTheDiffDrift(x_ref, backend='tensorflow', p_val=0.05, x_ref_preprocessed=False, preprocess_fn=None, kernel=None, n_diffs=1, initial_diffs=None, l1_reg=0.01, binarize_preds=False, train_size=0.75, n_folds=None, retrain_from_scratch=True, seed=0, optimizer=None, learning_rate=0.001, batch_size=32, preprocess_batch_fn=None, epochs=3, verbose=0, train_kwargs=None, device=None, dataset=None, dataloader=None, input_shape=None, data_type=None)[source]

Bases: DriftConfigMixin

__init__(x_ref, backend='tensorflow', p_val=0.05, x_ref_preprocessed=False, preprocess_fn=None, kernel=None, n_diffs=1, initial_diffs=None, l1_reg=0.01, binarize_preds=False, train_size=0.75, n_folds=None, retrain_from_scratch=True, seed=0, optimizer=None, learning_rate=0.001, batch_size=32, preprocess_batch_fn=None, epochs=3, verbose=0, train_kwargs=None, device=None, dataset=None, dataloader=None, input_shape=None, data_type=None)[source]

Classifier-based drift detector with a classifier of form y = a + b_1*k(x,w_1) + … + b_J*k(x,w_J), where k is a kernel and w_1,…,w_J are learnable test locations. If drift has occurred the test locations learn to be more/less (given by sign of b_i) similar to test instances than reference instances. The test locations are regularised to be close to the average reference instance such that the difference is then interpretable as the transformation required for each feature to make the average instance more/less like a test instance than a reference instance.

The classifier is trained on a fraction of the combined reference and test data and drift is detected on the remaining data. To use all the data to detect drift, a stratified cross-validation scheme can be chosen.
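The classifier form y = a + b_1*k(x,w_1) + … + b_J*k(x,w_J) can be evaluated directly, as in the sketch below. The Gaussian RBF parametrisation, the fixed coefficient values and the function names are illustrative assumptions; in the detector the coefficients and test locations are learned.

```python
import math


def rbf(x, w, sigma=1.0):
    """Gaussian RBF similarity between an instance x and a test location w."""
    return math.exp(-math.dist(x, w) ** 2 / (2.0 * sigma ** 2))


def spot_the_diff_logit(x, a, b, w, kernel=rbf):
    """y = a + sum_j b_j * k(x, w_j) for test locations w_1..w_J."""
    return a + sum(b_j * kernel(x, w_j) for b_j, w_j in zip(b, w))


w = [(1.0, 0.0)]  # a single test location, as with n_diffs=1
b = [2.0]         # positive coefficient: similarity to w_1 raises the score
near_w = spot_the_diff_logit((1.0, 0.0), a=-1.0, b=b, w=w)
far_w = spot_the_diff_logit((9.0, 9.0), a=-1.0, b=b, w=w)
print(near_w, far_w)
```

An instance lying on the test location receives a positive score, while a distant instance falls back to roughly the intercept a.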

Parameters:

x_ref (Union[ndarray, list]) – Data used as reference distribution.
backend (str) – Backend used for the training loop implementation.
p_val (float) – p-value used for the significance of the test.
x_ref_preprocessed (bool) – Whether the given reference data x_ref has been preprocessed yet. If x_ref_preprocessed=True, only the test data x will be preprocessed at prediction time. If x_ref_preprocessed=False, the reference data will also be preprocessed.
preprocess_fn (Optional[Callable]) – Function to preprocess the data before computing the data drift metrics.
kernel (Callable) – Kernel used to define similarity between instances. Defaults to a Gaussian RBF kernel.
n_diffs (int) – The number of test locations to use, each corresponding to an interpretable difference.
initial_diffs (Optional[ndarray]) – Array used to initialise the diffs that will be learned. Defaults to Gaussian for each feature with equal variance to that of reference data.
l1_reg (float) – Strength of l1 regularisation to apply to the differences.
binarize_preds (bool) – Whether to test for discrepancy on soft (e.g. probs/logits) model predictions directly with a K-S test or binarise to 0-1 prediction errors and apply a binomial test.
train_size (Optional[float]) – Optional fraction (float between 0 and 1) of the dataset used to train the classifier. The drift is detected on 1 - train_size. Cannot be used in combination with n_folds.
n_folds (Optional[int]) – Optional number of stratified folds used for training. The model predictions are then calculated on all the out-of-fold instances. This makes it possible to use all the reference and test data for drift detection, at the expense of longer computation. If both train_size and n_folds are specified, n_folds is prioritized.
retrain_from_scratch (bool) – Whether the classifier should be retrained from scratch for each set of test data or whether it should instead continue training from where it left off on the previous set.
seed (int) – Optional random seed for fold selection.
optimizer (Optional[Callable]) – Optimizer used during training of the classifier.
learning_rate (float) – Learning rate used by optimizer.
batch_size (int) – Batch size used during training of the classifier.
preprocess_batch_fn (Optional[Callable]) – Optional batch preprocessing function, for example to convert a list of objects to a batch which can be processed by the model.
epochs (int) – Number of training epochs for the classifier for each (optional) fold.
verbose (int) – Verbosity level during the training of the classifier. 0 is silent, 1 a progress bar.
train_kwargs (Optional[dict]) – Optional additional kwargs when fitting the classifier.
device (Union[Literal[‘cuda’, ‘gpu’, ‘cpu’], torch.device, None]) – Device type used. The default tries to use the GPU and falls back on CPU if needed. Can be specified by passing either 'cuda', 'gpu', 'cpu' or an instance of torch.device. Only relevant for ‘pytorch’ backend.
dataset (Optional[Callable]) – Dataset object used during training.
dataloader (Optional[Callable]) – Dataloader object used during training. Only relevant for ‘pytorch’ backend.
input_shape (Optional[tuple]) – Shape of input data.
data_type (Optional[str]) – Optionally specify the data type (tabular, image or time-series). Added to metadata.

predict(x, return_p_val=True, return_distance=True, return_probs=True, return_model=True)[source]

Predict whether a batch of data has drifted from the reference data.

Parameters:

x (ndarray) – Batch of instances.
return_p_val (bool) – Whether to return the p-value of the test.
return_distance (bool) – Whether to return a notion of strength of the drift. This is the K-S test statistic if binarize_preds=False, otherwise the relative error reduction.
return_probs (bool) – Whether to return the instance level classifier probabilities for the reference and test data (0=reference data, 1=test data).
return_model (bool) – Whether to return the updated model trained to discriminate reference and test instances.

Return type:

Dict[str, Dict[str, Union[str, int, float, Callable]]]

Returns:

Dictionary containing 'meta' and 'data' dictionaries.

'meta' has the detector’s metadata.
'data' contains the drift prediction, the diffs used to distinguish reference from test instances, and optionally the p-value, the performance of the classifier relative to its expectation under the no-change null, the out-of-fold classifier model prediction probabilities on the reference and test data as well as the associated reference and test instances of the out-of-fold predictions, and the trained model.

class alibi_detect.cd.TabularDrift(x_ref, p_val=0.05, categories_per_feature=None, x_ref_preprocessed=False, preprocess_at_init=True, update_x_ref=None, preprocess_fn=None, correction='bonferroni', alternative='two-sided', n_features=None, input_shape=None, data_type=None)[source]

Bases: BaseUnivariateDrift

__init__(x_ref, p_val=0.05, categories_per_feature=None, x_ref_preprocessed=False, preprocess_at_init=True, update_x_ref=None, preprocess_fn=None, correction='bonferroni', alternative='two-sided', n_features=None, input_shape=None, data_type=None)[source]

Mixed-type tabular data drift detector with Bonferroni or False Discovery Rate (FDR) correction for multivariate data. Kolmogorov-Smirnov (K-S) univariate tests are applied to continuous numerical data and Chi-Squared (Chi2) univariate tests to categorical data.
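The difference between the two corrections can be shown on a list of per-feature p-values. This is a sketch under stated assumptions rather than the library's code: the FDR variant shown is the Benjamini-Hochberg step-up rule, and `bonferroni_drift` and `fdr_drift` are hypothetical helper names.

```python
def bonferroni_drift(p_vals, p_val=0.05):
    """Flag drift if any feature p-value is below p_val / n_features."""
    threshold = p_val / len(p_vals)
    return any(p < threshold for p in p_vals)


def fdr_drift(p_vals, q_val=0.05):
    """Flag drift with the Benjamini-Hochberg rule (assumed FDR variant).

    The i-th smallest p-value (1-based) is compared with q_val * i / n.
    """
    n = len(p_vals)
    return any(
        p < q_val * rank / n
        for rank, p in enumerate(sorted(p_vals), start=1)
    )


# Four features; none is individually below 0.05 / 4 = 0.0125, but several
# moderately small p-values together pass the less conservative FDR rule.
p_vals = [0.02, 0.021, 0.03, 0.5]
drift_bonferroni = bonferroni_drift(p_vals, p_val=0.05)
drift_fdr = fdr_drift(p_vals, q_val=0.05)
```

Bonferroni controls the chance of any false alarm across features and is therefore stricter; FDR tolerates a bounded share of false discoveries, which tends to give more power when many features shift slightly.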

Parameters:

x_ref (Union[ndarray, list]) – Data used as reference distribution.
p_val (float) – p-value used for significance of the K-S and Chi2 test for each feature. If the FDR correction method is used, this corresponds to the acceptable q-value.
categories_per_feature (Optional[Dict[int, Optional[int]]]) – Dictionary whose keys are the column indices of the categorical features and whose values are optionally the number of possible categorical values for that feature or a list of the possible values. If you know which features are categorical and simply want to infer the possible values of each categorical feature from the reference data, you can pass a Dict[int, NoneType] such as {0: None, 3: None} if features 0 and 3 are categorical. If you also know how many categories are present for a given feature, you can pass this in the Dict[int, int] format, e.g. {0: 3, 3: 2}. If you pass N categories, the possible values for the feature are assumed to be [0, …, N-1]. You can also explicitly pass the possible categories in the Dict[int, List[int]] format, e.g. {0: [0, 1, 2], 3: [0, 55]}. Note that the categories can be arbitrary int values.
x_ref_preprocessed (bool) – Whether the given reference data x_ref has been preprocessed yet. If x_ref_preprocessed=True, only the test data x will be preprocessed at prediction time. If x_ref_preprocessed=False, the reference data will also be preprocessed.
preprocess_at_init (bool) – Whether to preprocess the reference data when the detector is instantiated. Otherwise, the reference data will be preprocessed at prediction time. Only applies if x_ref_preprocessed=False.
update_x_ref (Optional[Dict[str, int]]) – Reference data can optionally be updated to the last n instances seen by the detector or via reservoir sampling with size n. For the former, the parameter equals {‘last’: n} while for reservoir sampling {‘reservoir_sampling’: n} is passed.
preprocess_fn (Optional[Callable]) – Function to preprocess the data before computing the data drift metrics. Typically a dimensionality reduction technique.
correction (str) – Correction type for multivariate data. Either ‘bonferroni’ or ‘fdr’ (False Discovery Rate).
alternative (str) – Defines the alternative hypothesis for the K-S tests. Options are ‘two-sided’, ‘less’ or ‘greater’.
n_features (Optional[int]) – Number of features used in the combined K-S/Chi-Squared tests. No need to pass it if no preprocessing takes place. In case of a preprocessing step, this can also be inferred automatically but could be more expensive to compute.
input_shape (Optional[tuple]) – Shape of input data.
data_type (Optional[str]) – Optionally specify the data type (tabular, image or time-series). Added to metadata.
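The three value formats accepted by categories_per_feature can be made concrete with a small sketch. It only mirrors the behaviour described above and is not the library's implementation; `resolve_categories` is a hypothetical helper, and the reference data is made up for illustration.

```python
def resolve_categories(categories_per_feature, x_ref):
    """Expand each accepted value format into an explicit category list."""
    resolved = {}
    for col, spec in categories_per_feature.items():
        if spec is None:
            # Infer the categories from the reference data.
            resolved[col] = sorted({row[col] for row in x_ref})
        elif isinstance(spec, int):
            # N categories are assumed to be the values 0, ..., N-1.
            resolved[col] = list(range(spec))
        else:
            # Explicit list of (arbitrary int) category values.
            resolved[col] = list(spec)
    return resolved


# Columns 0 and 3 are categorical; columns 1 and 2 are continuous.
x_ref = [
    [0, 1.5, 2.0, 55],
    [2, 0.3, 1.1, 0],
    [1, 0.9, 0.4, 55],
]
cats = resolve_categories({0: 3, 3: None}, x_ref)
```

Inferring categories from the reference data is convenient, but a category that never appears in x_ref will be missing from the inferred list, so passing the values explicitly is the safer choice when they are known.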

feature_score(x_ref, x)[source]

Compute K-S or Chi-Squared test statistics and p-values per feature.

Parameters:

x_ref (ndarray) – Reference instances to compare distribution with.
x (ndarray) – Batch of instances.

Return type:

Tuple[ndarray, ndarray]

Returns:

Feature level p-values and K-S or Chi-Squared statistics.
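For a continuous feature, the K-S statistic is the largest gap between the empirical distribution functions of the reference and test samples. The following pure-Python sketch computes the two-sided statistic for a single feature; it is an illustration only (the p-value computation is omitted), and `ks_statistic` is a hypothetical helper rather than part of the library.

```python
def ks_statistic(x_ref, x):
    """Two-sided, two-sample K-S statistic for one continuous feature."""
    ref, new = sorted(x_ref), sorted(x)
    n, m = len(ref), len(new)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        v = min(ref[i], new[j])
        # Advance both samples past the current value, which handles ties.
        while i < n and ref[i] <= v:
            i += 1
        while j < m and new[j] <= v:
            j += 1
        # i / n and j / m are the empirical CDFs evaluated at v.
        d = max(d, abs(i / n - j / m))
    return d


same = ks_statistic([1, 2, 3, 4], [1, 2, 3, 4])      # identical samples
shifted = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])   # partially overlapping
disjoint = ks_statistic([1, 2, 3], [4, 5, 6])        # no overlap at all
```

A statistic of 0 means the empirical distributions coincide, while 1 means the samples do not overlap.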
