This page was generated from doc/source/cd/methods/modeluncdrift.ipynb.

# Model Uncertainty

## Overview

Model-uncertainty drift detectors aim to directly detect drift that is likely to affect the performance of a model of interest. The approach is to test for change in the number of instances falling into regions of the input space on which the model is uncertain in its predictions. For each instance in the reference set the detector obtains the model’s prediction and some associated notion of uncertainty. For a classifier this may be the entropy of the predicted label probabilities; for a regressor with dropout layers, Monte Carlo dropout can be used to provide a notion of uncertainty. The same is done for the test set, and if significant differences in uncertainty are detected (via a Kolmogorov-Smirnov test by default) then drift is flagged. The detector’s reference set should be disjoint from the model’s training set (on which the model’s confidence may be higher).
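The approach described above can be sketched with NumPy and SciPy, independently of the detector classes. This is an illustration under stated assumptions, not the library's implementation: the helper names `predictive_entropy` and `uncertainty_drift` and the synthetic Dirichlet "predictions" are made up for the example.

```python
import numpy as np
from scipy.stats import ks_2samp


def predictive_entropy(probs: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Entropy of each row of class probabilities, shape [n_instances, n_classes]."""
    probs = np.clip(probs, eps, 1.0)  # avoid log(0)
    return -(probs * np.log(probs)).sum(axis=1)


def uncertainty_drift(probs_ref: np.ndarray, probs_test: np.ndarray, p_val: float = 0.05) -> dict:
    """Two-sample KS test on the per-instance entropies of the two sets."""
    result = ks_2samp(predictive_entropy(probs_ref), predictive_entropy(probs_test))
    return {
        "is_drift": int(result.pvalue < p_val),
        "p_val": float(result.pvalue),
        "distance": float(result.statistic),
    }


rng = np.random.default_rng(0)
# Hypothetical model outputs: confident on the reference set, much less so on the test set.
probs_ref = rng.dirichlet([10.0, 1.0, 1.0], size=500)
probs_test = rng.dirichlet([3.0, 3.0, 3.0], size=500)
out = uncertainty_drift(probs_ref, probs_test)
```

Here the test-set entropies are systematically larger than the reference entropies, so the KS test rejects the hypothesis that both samples of uncertainty values come from the same distribution.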

`ClassifierUncertaintyDrift` should be used with classification models whereas `RegressorUncertaintyDrift` should be used with regression models. They are used in much the same way.

By default `ClassifierUncertaintyDrift` uses `uncertainty_type='entropy'` as the notion of uncertainty for classifier predictions, and a Kolmogorov-Smirnov two-sample test is performed on these continuous values. Alternatively, `uncertainty_type='margin'` can be specified to deem the classifier’s predictions uncertain if they fall within a margin (e.g. in [0.45, 0.55] for binary classifier probabilities), similar to Sethi and Kantardzic (2017). In that case a Chi-Squared two-sample test is performed on these 0-1 flags of uncertainty.
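The margin variant can be sketched as follows. This is a rough illustration, not the library's code: the helpers `margin_flags` and `margin_drift`, the margin width of 0.1, and the Beta-distributed probabilities are assumptions chosen for the example. An instance is flagged as uncertain when its two highest class probabilities differ by less than the margin width, and the flag counts for the two sets are compared with SciPy's `chi2_contingency`.

```python
import numpy as np
from scipy.stats import chi2_contingency


def margin_flags(probs: np.ndarray, margin_width: float = 0.1) -> np.ndarray:
    """1 where the top two class probabilities differ by less than margin_width, else 0."""
    top_two = np.sort(probs, axis=1)[:, -2:]
    return ((top_two[:, 1] - top_two[:, 0]) < margin_width).astype(int)


def margin_drift(probs_ref, probs_test, margin_width: float = 0.1, p_val: float = 0.05) -> dict:
    flags_ref = margin_flags(probs_ref, margin_width)
    flags_test = margin_flags(probs_test, margin_width)
    # 2x2 contingency table: rows are ref/test, columns are certain/uncertain counts.
    table = np.array([
        [np.sum(flags_ref == 0), np.sum(flags_ref == 1)],
        [np.sum(flags_test == 0), np.sum(flags_test == 1)],
    ])
    chi2, p, _, _ = chi2_contingency(table)
    return {"is_drift": int(p < p_val), "p_val": float(p), "distance": float(chi2)}


rng = np.random.default_rng(0)
# Hypothetical binary-classifier outputs: mostly confident on ref, clustered near 0.5 on test.
p_ref = rng.beta(0.5, 0.5, size=1000)
p_test = rng.beta(5.0, 5.0, size=1000)
out = margin_drift(np.column_stack([p_ref, 1 - p_ref]), np.column_stack([p_test, 1 - p_test]))
```

With a margin width of 0.1, a binary prediction is flagged when its positive-class probability lies strictly between 0.45 and 0.55, which matches the interval used in the text.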

By default `RegressorUncertaintyDrift` uses `uncertainty_type='mc_dropout'` and assumes a PyTorch or TensorFlow model with dropout layers as the regressor. This evaluates the model under multiple dropout configurations and uses the variation as the notion of uncertainty. Alternatively a model that outputs (for each instance) a vector of independent model predictions can be passed and `uncertainty_type='ensemble'` can be specified. Again the variation is taken as the notion of uncertainty, and in both cases a Kolmogorov-Smirnov two-sample test is performed on the continuous notions of uncertainty.
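For the ensemble case, the idea can be sketched as below. The use of the variance across ensemble members as the measure of "variation" is an assumption made for this example, as are the helper name `ensemble_uncertainty` and the synthetic predictions.

```python
import numpy as np
from scipy.stats import ks_2samp


def ensemble_uncertainty(preds: np.ndarray) -> np.ndarray:
    """Variance across ensemble members; preds has shape [n_instances, n_members]."""
    return preds.var(axis=1)


rng = np.random.default_rng(0)
# Hypothetical ensemble outputs: members agree closely on ref data, disagree more on test data.
preds_ref = rng.normal(loc=0.0, scale=0.1, size=(500, 10))
preds_test = rng.normal(loc=0.0, scale=0.5, size=(500, 10))

result = ks_2samp(ensemble_uncertainty(preds_ref), ensemble_uncertainty(preds_test))
is_drift = int(result.pvalue < 0.05)
```

The same comparison applies to Monte Carlo dropout if each column is read as one stochastic forward pass rather than one ensemble member.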

## Usage

### Initialize

Arguments:

- `x_ref`: Data used as reference distribution. Should be disjoint from the model’s training set.
- `model`: The model of interest whose performance we’d like to remain constant.

Keyword arguments:

- `p_val`: p-value used for the significance of the test.
- `update_x_ref`: Reference data can optionally be updated to the last N instances seen by the detector or via reservoir sampling with size N. For the former, the parameter equals `{'last': N}` while for reservoir sampling `{'reservoir_sampling': N}` is passed.
- `data_type`: Optionally specify the data type (e.g. tabular, image or time-series). Added to metadata.
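The two `update_x_ref` strategies can be illustrated in plain Python. This is only a sketch of the general techniques (a sliding window and the classic "Algorithm R" reservoir sampler), not the detector's internal code; the function names and the `n_seen` bookkeeping are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def update_last(x_ref: List, x_new: Sequence, n: int) -> List:
    """'last' strategy: keep only the n most recent instances."""
    return (list(x_ref) + list(x_new))[-n:]


def update_reservoir(
    reservoir: List, x_new: Sequence, n: int, n_seen: int, rng: random.Random
) -> Tuple[List, int]:
    """'reservoir_sampling' strategy: keep a uniform random sample of size n
    over everything seen so far. n_seen counts instances seen before this batch."""
    reservoir = list(reservoir)
    for x in x_new:
        n_seen += 1
        if len(reservoir) < n:
            reservoir.append(x)
        else:
            j = rng.randrange(n_seen)  # uniform integer in [0, n_seen)
            if j < n:
                reservoir[j] = x  # replace a random slot with probability n / n_seen
    return reservoir, n_seen


rng = random.Random(0)
window = update_last([1, 2, 3], [4, 5], n=3)
sample, seen = update_reservoir(list(range(5)), range(5, 100), n=5, n_seen=5, rng=rng)
```

The sliding window tracks recent data closely, while the reservoir keeps a reference set that stays representative of the whole history.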

`ClassifierUncertaintyDrift`-specific keyword arguments:

- `preds_type`: Type of prediction output by the model. Options are `'probs'` (in [0,1]) or `'logits'` (in [-inf,inf]).
- `uncertainty_type`: Method for determining the model’s uncertainty for a given instance. Options are `'entropy'` or `'margin'`.
- `margin_width`: Width of the margin if `uncertainty_type='margin'`. The model is considered uncertain on an instance if the highest two class probabilities it assigns to the instance differ by less than this.
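To make the `preds_type` distinction concrete, the sketch below shows one common way logits can be mapped to probabilities (a softmax over the class axis) before an uncertainty such as entropy or margin is computed. The helper `to_probs` is hypothetical, and whether the detector performs exactly this conversion internally is an assumption.

```python
import numpy as np


def to_probs(preds: np.ndarray, preds_type: str = "probs") -> np.ndarray:
    """Return class probabilities from either probabilities or logits."""
    if preds_type == "probs":
        return preds
    if preds_type == "logits":
        # Subtract the row-wise max for numerical stability before exponentiating.
        shifted = preds - preds.max(axis=-1, keepdims=True)
        exp = np.exp(shifted)
        return exp / exp.sum(axis=-1, keepdims=True)
    raise ValueError("preds_type must be 'probs' or 'logits'")


logits = np.array([[0.0, 0.0], [2.0, -1.0, 0.5][:2]])
probs = to_probs(logits, preds_type="logits")
```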

`RegressorUncertaintyDrift`-specific keyword arguments:

- `uncertainty_type`: Method for determining the model’s uncertainty for a given instance. Options are `'mc_dropout'` or `'ensemble'`. For the former the model should have dropout layers and output a scalar per instance. For the latter the model should output a vector of predictions per instance.
- `n_evals`: The number of times to evaluate the model under different dropout configurations. Only relevant when using the `'mc_dropout'` uncertainty type.

Additional arguments if batch prediction is required:

- `backend`: Framework that was used to define the model. Options are `'tensorflow'` or `'pytorch'`.
- `batch_size`: Batch size to use to evaluate the model. Defaults to 32.
- `device`: Device type to use. The default None tries to use the GPU and falls back on CPU if needed. Can be specified by passing either `'cuda'`, `'gpu'` or `'cpu'`. Only relevant for the `'pytorch'` backend.

Additional arguments for NLP models:

- `tokenizer`: Tokenizer to use before passing data to the model.
- `max_len`: Max length to be used by the tokenizer.

### Examples

Drift detector for a **TensorFlow** classifier outputting probabilities:

```
from alibi_detect.cd import ClassifierUncertaintyDrift

clf = ...  # tensorflow classifier model
cd = ClassifierUncertaintyDrift(x_ref, clf, backend='tensorflow', p_val=.05, preds_type='probs')
```

Drift detector for a **PyTorch** regressor (with dropout layers) outputting scalars:

```
from alibi_detect.cd import RegressorUncertaintyDrift

reg = ...  # pytorch regression model with at least 1 dropout layer
cd = RegressorUncertaintyDrift(x_ref, reg, backend='pytorch', p_val=.05, uncertainty_type='mc_dropout')
```

Note that for the PyTorch `RegressorUncertaintyDrift` detector the dropout layers need to be defined within the `__init__` method of the `nn.Module` to be able to set them to train mode when computing the uncertainty estimates, e.g.:

```
import torch
import torch.nn as nn


class Model(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        # define model
        self.dropout = nn.Dropout(p=.5)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # do forward pass which includes self.dropout
        ...
```
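To show why the dropout layers must be reachable as module attributes, here is a minimal sketch of Monte Carlo dropout in plain PyTorch. It is not the detector's implementation: the `Regressor` architecture, the helper `mc_dropout_uncertainty`, the choice of variance as the uncertainty, and the value `n_evals=25` are all assumptions for illustration.

```python
import torch
import torch.nn as nn


class Regressor(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.fc1 = nn.Linear(4, 16)
        self.dropout = nn.Dropout(p=0.5)  # registered in __init__, so model.modules() finds it
        self.fc2 = nn.Linear(16, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fc2(self.dropout(torch.relu(self.fc1(x))))


def mc_dropout_uncertainty(model: nn.Module, x: torch.Tensor, n_evals: int = 25) -> torch.Tensor:
    """Variance of the prediction over n_evals stochastic forward passes."""
    model.eval()  # put everything in inference mode first
    for module in model.modules():
        if isinstance(module, nn.Dropout):
            module.train()  # re-enable only the dropout layers
    with torch.no_grad():
        preds = torch.stack([model(x) for _ in range(n_evals)], dim=0)  # [n_evals, n, 1]
    return preds.var(dim=0).squeeze(-1)  # one uncertainty value per instance


torch.manual_seed(0)
model = Regressor()
x = torch.randn(8, 4)
uncertainty = mc_dropout_uncertainty(model, x)
```

If dropout were applied only through a functional call inside `forward`, the loop over `model.modules()` would find no `nn.Dropout` instance to switch to train mode.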

### Detect Drift

We detect data drift by simply calling `predict` on a batch of instances `x`. Setting `return_p_val` to *True* will also return the p-value of the test, and setting `return_distance` to *True* will return the test statistic.

The prediction takes the form of a dictionary with `meta` and `data` keys. `meta` contains the detector’s metadata while `data` is also a dictionary which contains the actual predictions stored in the following keys:

- `is_drift`: 1 if the sample tested has drifted from the reference data and 0 otherwise.
- `threshold`: the user-defined threshold defining the significance of the test.
- `p_val`: the p-value of the test if `return_p_val` equals *True*.
- `distance`: the test statistic if `return_distance` equals *True*.

```
preds = cd.predict(x)
```
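The returned dictionary can then be unpacked along the lines of the sketch below. The helper `summarise_prediction` is hypothetical, and the `example` dictionary is a hand-written stand-in with placeholder values, not real detector output.

```python
def summarise_prediction(preds: dict) -> str:
    """Build a one-line summary from the `data` part of a prediction dictionary."""
    data = preds["data"]
    status = "drift detected" if data["is_drift"] == 1 else "no drift detected"
    details = [f"threshold={data['threshold']}"]
    # p_val and distance are only present when requested from predict.
    for key in ("p_val", "distance"):
        if key in data:
            details.append(f"{key}={data[key]}")
    return f"{status} ({', '.join(details)})"


# Placeholder values for illustration only.
example = {"meta": {}, "data": {"is_drift": 1, "threshold": 0.05, "p_val": 0.01, "distance": 0.3}}
summary = summarise_prediction(example)
```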