alibi_detect.od.llr module

class alibi_detect.od.llr.LLR(threshold=None, model=None, model_background=None, log_prob=None, sequential=False, data_type=None)[source]

Bases: BaseDetector, FitMixin, ThresholdMixin

__init__(threshold=None, model=None, model_background=None, log_prob=None, sequential=False, data_type=None)[source]

Likelihood Ratios for Out-of-Distribution Detection. Ren, J. et al. NeurIPS 2019. https://arxiv.org/abs/1906.02845

Parameters:

threshold (Optional[float]) – Threshold used for the likelihood ratio (LLR) to determine outliers.
model (Union[Model, Distribution, PixelCNN, None]) – Generative model, defaults to PixelCNN.
model_background (Union[Model, Distribution, PixelCNN, None]) – Optional model for the background. Only needed if it is different from model.
log_prob (Optional[Callable]) – Function used to evaluate log probabilities under the model if the model does not have a log_prob function.
sequential (bool) – Whether the data is sequential. Used to create targets during training.
data_type (Optional[str]) – Optionally specify the data type (tabular, image or time-series). Added to metadata.

feature_score(X, batch_size=10000000000)[source]

Feature-level negative likelihood ratios.

Return type:

ndarray

fit(X, mutate_fn=<function mutate_categorical>, mutate_fn_kwargs={'feature_range': (0, 255), 'rate': 0.2, 'seed': 0}, mutate_batch_size=10000000000, loss_fn=None, loss_fn_kwargs=None, optimizer=tensorflow.keras.optimizers.Adam, epochs=20, batch_size=64, verbose=True, log_metric=None, callbacks=None)[source]

Train semantic and background generative models.

Parameters:

X – Training batch.
mutate_fn – Mutation function used to generate the background dataset.
mutate_fn_kwargs – Kwargs for the mutation function used to generate the background dataset. Default values set for an image dataset.
mutate_batch_size – Batch size used to generate the mutations for the background dataset.
loss_fn – Loss function used for training.
loss_fn_kwargs – Kwargs for loss function.
optimizer – Optimizer used for training.
epochs – Number of training epochs.
batch_size – Batch size used for training.
verbose – Whether to print training progress.
log_metric – Additional metrics whose progress will be displayed if verbose equals True.
callbacks – Callbacks used during training.
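The background dataset is produced by perturbing the training data with the mutation function. The following is a minimal NumPy sketch of that idea, assuming a mutation that resamples a fraction `rate` of the entries uniformly within `feature_range`. The name `mutate_categorical_sketch` is hypothetical, and the library's own `mutate_categorical` may perturb the data differently.

```python
import numpy as np


def mutate_categorical_sketch(X, rate=0.2, seed=0, feature_range=(0, 255)):
    """Illustrative stand-in for a mutation function (not the library's code).

    Each entry of X is replaced, with probability `rate`, by an integer
    drawn uniformly from feature_range (both ends inclusive).
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X)
    # Boolean mask: True where an entry gets resampled.
    mask = rng.random(X.shape) < rate
    # Candidate replacement values, one per entry.
    noise = rng.integers(feature_range[0], feature_range[1] + 1, size=X.shape)
    return np.where(mask, noise, X).astype(X.dtype)


# Toy "image" batch: 8 instances of shape 4x4x1 with integer pixel values.
X = np.zeros((8, 4, 4, 1), dtype=np.int64)
X_background = mutate_categorical_sketch(X, rate=0.2, seed=0)
```

The keyword arguments mirror the default `mutate_fn_kwargs` shown in the signature, so a custom function with this shape could be swapped in for other data ranges.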

infer_threshold(X, outlier_type='instance', threshold_perc=95.0, batch_size=10000000000)[source]

Update LLR threshold by a value inferred from the percentage of instances considered to be outliers in a sample of the dataset.

Parameters:

X (ndarray) – Batch of instances.
outlier_type (str) – Predict outliers at the ‘feature’ or ‘instance’ level.
threshold_perc (float) – Percentile of the outlier scores computed on X (at the level given by outlier_type) that is used as the new threshold. With the default of 95.0, roughly 5% of the scores in X lie above it.
batch_size (int) – Batch size for the generative model evaluations.

Return type:

None
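As an illustration of how a threshold can be derived from a sample of scores, the sketch below takes a percentile of precomputed outlier scores. It assumes the threshold is a plain percentile of the scores; `infer_threshold_sketch` is a hypothetical helper, not part of the library.

```python
import numpy as np


def infer_threshold_sketch(scores, threshold_perc=95.0):
    """Return the value below which `threshold_perc` percent of scores fall."""
    return float(np.percentile(scores, threshold_perc))


# Toy outlier scores 1, 2, ..., 100 standing in for detector output.
scores = np.arange(1, 101, dtype=float)
threshold = infer_threshold_sketch(scores, threshold_perc=95.0)

# Instances scoring above the threshold would be flagged as outliers.
flagged = scores > threshold
```

With these toy scores the five largest values end up above the threshold, which matches the intuition that `threshold_perc=95.0` treats about 5% of the sample as outliers.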

instance_score(X, batch_size=10000000000)[source]

Instance-level negative likelihood ratios.

Return type:

ndarray

llr(X, return_per_feature, batch_size=10000000000)[source]

Compute likelihood ratios.

Parameters:

X (ndarray) – Batch of instances.
return_per_feature (bool) – Return likelihood ratio per feature.
batch_size (int) – Batch size for the generative model evaluations.

Return type:

ndarray

Returns:

Likelihood ratios.
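To make the quantity concrete, here is a small NumPy sketch of a log-likelihood ratio between a semantic model and a background model. Both models are toy independent Gaussians chosen for illustration (the detector itself uses generative models such as PixelCNN), and the instance-level value is assumed to be the sum of the per-feature values, which holds for this independent toy model.

```python
import numpy as np


def gaussian_logpdf(X, mean, std):
    """Elementwise log density of a univariate Gaussian."""
    return -0.5 * np.log(2 * np.pi * std ** 2) - 0.5 * ((X - mean) / std) ** 2


def llr_sketch(X, return_per_feature):
    """Log-likelihood ratio: log p_semantic(x) - log p_background(x)."""
    logp_semantic = gaussian_logpdf(X, mean=0.0, std=1.0)    # narrow "semantic" model
    logp_background = gaussian_logpdf(X, mean=0.0, std=3.0)  # broad "background" model
    ratio = logp_semantic - logp_background
    if return_per_feature:
        return ratio
    # Aggregate over all non-batch axes to get one value per instance.
    return ratio.sum(axis=tuple(range(1, X.ndim)))


X = np.array([[0.0, 0.1],     # close to the semantic model
              [4.0, -5.0]])   # far from it
per_feature = llr_sketch(X, return_per_feature=True)
per_instance = llr_sketch(X, return_per_feature=False)

# Outlier scores are the negative ratios: a higher score is more anomalous.
instance_scores = -per_instance
```

The first instance gets a positive ratio and the second a negative one, so after negation the second instance receives the higher outlier score.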

logp(dist, X, return_per_feature=False, batch_size=10000000000)[source]

Compute log probability of a batch of instances under the generative model.

Parameters:

dist – Distribution of the model.
X (ndarray) – Batch of instances.
return_per_feature (bool) – Return log probability per feature.
batch_size (int) – Batch size for the generative model evaluations.

Return type:

ndarray

Returns:

Log probabilities.

logp_alt(model, X, return_per_feature=False, batch_size=10000000000)[source]

Compute log probability of a batch of instances using the log_prob function defined by the user.

Parameters:

model (Model) – Trained model.
X (ndarray) – Batch of instances.
return_per_feature (bool) – Return log probability per feature.
batch_size (int) – Batch size for the generative model evaluations.

Return type:

ndarray

Returns:

Log probabilities.
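The `batch_size` argument controls how many instances are evaluated at once. A minimal sketch of batched evaluation with a user-supplied function follows. `batched_log_prob` and `toy_log_prob` are hypothetical names, and the callable signature the library expects for `log_prob` may differ from the single-argument form assumed here.

```python
import numpy as np


def batched_log_prob(log_prob_fn, X, batch_size):
    """Evaluate log_prob_fn on X in chunks and stitch the results together."""
    outputs = []
    for start in range(0, len(X), batch_size):
        batch = X[start:start + batch_size]
        outputs.append(np.asarray(log_prob_fn(batch)))
    return np.concatenate(outputs, axis=0)


def toy_log_prob(batch):
    """Hypothetical user function: standard normal log density per instance."""
    return (-0.5 * batch ** 2 - 0.5 * np.log(2 * np.pi)).sum(axis=1)


X = np.random.default_rng(0).normal(size=(10, 3))
logp = batched_log_prob(toy_log_prob, X, batch_size=4)
```

Batching changes only memory use, not the result: evaluating in chunks of 4 gives the same values as a single pass over all 10 instances.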

predict(X, outlier_type='instance', batch_size=10000000000, return_feature_score=True, return_instance_score=True)[source]

Predict whether instances are outliers or not.

Parameters:

X (ndarray) – Batch of instances.
outlier_type (str) – Predict outliers at the ‘feature’ or ‘instance’ level.
batch_size (int) – Batch size used when making predictions with the generative model.
return_feature_score (bool) – Whether to return feature level outlier scores.
return_instance_score (bool) – Whether to return instance level outlier scores.

Return type:

Dict[Dict[str, str], Dict[ndarray, ndarray]]

Returns:

Dictionary containing 'meta' and 'data' dictionaries.

'meta' has the model’s metadata.
'data' contains the outlier predictions and both feature and instance level outlier scores.
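The shape of the returned dictionary can be sketched as below. This is an illustration built from precomputed scores rather than the detector's own code. The key names inside 'data' and the metadata values are assumptions, and the comparison against the threshold is assumed to be strict.

```python
import numpy as np


def predict_sketch(feature_score, instance_score, threshold, outlier_type="instance"):
    """Threshold precomputed scores and package them like a prediction dict."""
    if outlier_type == "instance":
        outlier_score = instance_score
    elif outlier_type == "feature":
        outlier_score = feature_score
    else:
        raise ValueError("outlier_type must be 'feature' or 'instance'")
    # 1 marks an outlier, 0 an inlier.
    is_outlier = (outlier_score > threshold).astype(int)
    return {
        "meta": {"name": "LLR", "detector_type": "outlier"},  # placeholder metadata
        "data": {
            "is_outlier": is_outlier,          # assumed key name
            "feature_score": feature_score,    # assumed key name
            "instance_score": instance_score,  # assumed key name
        },
    }


feature_score = np.array([[-1.1, -1.0], [6.0, 10.0]])
instance_score = feature_score.sum(axis=1)
preds = predict_sketch(feature_score, instance_score, threshold=0.0)
```

Passing `outlier_type="feature"` to the same helper yields one flag per feature instead of one per instance.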

score(X, batch_size=10000000000)[source]

Feature-level and instance-level outlier scores. The scores are equal to the negative likelihood ratios.

Return type:

Tuple[ndarray, ndarray]

alibi_detect.od.llr.build_model(dist, input_shape=None, filepath=None)[source]

Create tf.keras.Model from TF distribution.

Parameters:

dist (Union[Distribution, PixelCNN]) – TensorFlow distribution.
input_shape (Optional[tuple]) – Input shape of the model.
filepath (Optional[str]) – File to load model weights from.

Return type:

Tuple[Model, Union[Distribution, PixelCNN]]

Returns:

Tuple of the TensorFlow model and the distribution it was built from.