alibi_detect.od.vae module
- class alibi_detect.od.vae.OutlierVAE(threshold=None, score_type='mse', vae=None, encoder_net=None, decoder_net=None, latent_dim=None, samples=10, beta=1.0, data_type=None)[source]
Bases: BaseDetector, FitMixin, ThresholdMixin
- __init__(threshold=None, score_type='mse', vae=None, encoder_net=None, decoder_net=None, latent_dim=None, samples=10, beta=1.0, data_type=None)[source]
VAE-based outlier detector.
- Parameters:
threshold (Optional[float]) – Threshold applied to the outlier score to determine outliers.
score_type (str) – Metric used for outlier scores. Either ‘mse’ (mean squared error) or ‘proba’ (reconstruction probabilities) is supported.
vae (Optional[Model]) – A trained tf.keras model if available.
encoder_net (Optional[Model]) – Layers for the encoder wrapped in a tf.keras.Sequential class if no ‘vae’ is specified.
decoder_net (Optional[Model]) – Layers for the decoder wrapped in a tf.keras.Sequential class if no ‘vae’ is specified.
latent_dim (Optional[int]) – Dimensionality of the latent space.
samples (int) – Number of samples drawn to evaluate each instance.
beta (float) – Beta parameter for KL-divergence loss term.
data_type (Optional[str]) – Optionally specify the data type (tabular, image or time-series). Added to metadata.
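As an illustration of how these parameters fit together, a minimal sketch of building a detector from encoder and decoder networks might look as follows. The helper name build_detector, the layer sizes, and the choice of a tabular input are assumptions made for this example, and it presumes TensorFlow and alibi-detect are installed.

```python
def build_detector(n_features, latent_dim=2, threshold=None):
    """Hypothetical helper: build an OutlierVAE for tabular data."""
    # Imports are deferred so that defining this helper does not itself
    # require TensorFlow or alibi-detect; calling it does.
    import tensorflow as tf
    from tensorflow.keras.layers import Dense, InputLayer
    from alibi_detect.od import OutlierVAE

    # Encoder: maps the input features to a hidden representation.
    # Layer widths are illustrative only.
    encoder_net = tf.keras.Sequential([
        InputLayer(input_shape=(n_features,)),
        Dense(20, activation='relu'),
        Dense(10, activation='relu'),
    ])

    # Decoder: maps a latent vector back to the input space.
    decoder_net = tf.keras.Sequential([
        InputLayer(input_shape=(latent_dim,)),
        Dense(10, activation='relu'),
        Dense(n_features, activation=None),
    ])

    # No pre-trained 'vae' is passed, so encoder_net, decoder_net and
    # latent_dim are supplied instead.
    return OutlierVAE(
        threshold=threshold,
        score_type='mse',
        encoder_net=encoder_net,
        decoder_net=decoder_net,
        latent_dim=latent_dim,
        samples=10,
        beta=1.0,
        data_type='tabular',
    )
```

Leaving threshold as None here is deliberate in this sketch: it can be set later from data with infer_threshold.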
- feature_score(X_orig, X_recon)[source]
Compute feature level outlier scores.
- Parameters:
X_orig (ndarray) – Batch of original instances.
X_recon (ndarray) – Batch of reconstructed instances.
- Return type:
ndarray
- Returns:
Feature level outlier scores.
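To make the idea of a feature level score concrete, the following NumPy sketch computes a per-feature squared reconstruction error, which is one plausible reading of the ‘mse’ score type. The function name mse_feature_score is hypothetical, and the sketch ignores the ‘proba’ score type and any averaging over multiple samples; it is not the library's implementation.

```python
import numpy as np


def mse_feature_score(X_orig, X_recon):
    """Squared reconstruction error for every feature of every instance."""
    X_orig = np.asarray(X_orig, dtype=float)
    X_recon = np.asarray(X_recon, dtype=float)
    # The result has the same shape as the inputs: one score per feature.
    return (X_orig - X_recon) ** 2


X_orig = np.array([[1.0, 2.0], [3.0, 4.0]])
X_recon = np.array([[1.0, 1.0], [5.0, 4.0]])
fscore = mse_feature_score(X_orig, X_recon)
```

Features that are reconstructed exactly score zero, while a poorly reconstructed feature receives a large score.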
- fit(X, loss_fn=<function elbo>, optimizer=tensorflow.keras.optimizers.Adam, cov_elbo={'sim': 0.05}, epochs=20, batch_size=64, verbose=True, log_metric=None, callbacks=None)[source]
Train VAE model.
- Parameters:
X – Training batch.
loss_fn – Loss function used for training.
optimizer – Optimizer used for training.
cov_elbo – Dictionary with covariance matrix options in case the elbo loss function is used. Either use the full covariance matrix inferred from X (dict(cov_full=None)), only the variance (dict(cov_diag=None)) or a float representing the same standard deviation for each feature (e.g. dict(sim=.05)).
epochs – Number of training epochs.
batch_size – Batch size used for training.
verbose – Whether to print training progress.
log_metric – Additional metrics whose progress will be displayed if verbose equals True.
callbacks – Callbacks used during training.
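The three cov_elbo options can be pictured as three ways of obtaining a covariance matrix. The NumPy sketch below expresses each option as an explicit matrix; the helper name covariance_from_option is hypothetical, and the code only illustrates the description above rather than how the library builds its loss.

```python
import numpy as np


def covariance_from_option(X, cov_elbo):
    """Return a covariance matrix matching one of the cov_elbo options."""
    # Flatten each instance to a vector so X has shape (n_instances, n_features).
    X = np.asarray(X, dtype=float)
    X = X.reshape(X.shape[0], -1)
    n_features = X.shape[1]

    if 'cov_full' in cov_elbo:
        # Full covariance matrix inferred from X.
        return np.cov(X, rowvar=False)
    if 'cov_diag' in cov_elbo:
        # Only the per-feature variance, placed on the diagonal.
        return np.diag(np.var(X, axis=0, ddof=1))
    if 'sim' in cov_elbo:
        # The same standard deviation for each feature, so variance sim ** 2.
        return np.eye(n_features) * cov_elbo['sim'] ** 2
    raise ValueError("cov_elbo needs one of 'cov_full', 'cov_diag' or 'sim'")


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
cov_full = covariance_from_option(X, dict(cov_full=None))
cov_diag = covariance_from_option(X, dict(cov_diag=None))
cov_sim = covariance_from_option(X, dict(sim=.05))
```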
- infer_threshold(X, outlier_type='instance', outlier_perc=100.0, threshold_perc=95.0, batch_size=10000000000)[source]
Update threshold by a value inferred from the percentage of instances considered to be outliers in a sample of the dataset.
- Parameters:
X (ndarray) – Batch of instances.
outlier_type (str) – Predict outliers at the ‘feature’ or ‘instance’ level.
outlier_perc (float) – Percentage of sorted feature level outlier scores used to predict instance level outlier.
threshold_perc (float) – Percentage of X considered to be normal based on the outlier score.
batch_size (int) – Batch size used when making predictions with the VAE.
- Return type:
None
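The role of threshold_perc can be sketched with a percentile over precomputed outlier scores: the threshold is placed so that roughly threshold_perc percent of the scores fall at or below it. The function name infer_threshold_from_scores and the synthetic scores are assumptions for this illustration, which operates on plain scores rather than on a fitted detector.

```python
import numpy as np


def infer_threshold_from_scores(scores, threshold_perc=95.0):
    """Pick the score below which threshold_perc percent of instances fall."""
    scores = np.asarray(scores, dtype=float)
    return float(np.percentile(scores, threshold_perc))


# Synthetic outlier scores 0, 1, ..., 100 for 101 instances.
scores = np.arange(101)
threshold = infer_threshold_from_scores(scores, threshold_perc=95.0)

# Instances scoring above the threshold would be flagged as outliers.
flagged_fraction = float(np.mean(scores > threshold))
```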
- instance_score(fscore, outlier_perc=100.0)[source]
Compute instance level outlier scores.
- Parameters:
fscore (ndarray) – Feature level outlier scores.
outlier_perc (float) – Percentage of sorted feature level outlier scores used to predict instance level outlier.
- Return type:
ndarray
- Returns:
Instance level outlier scores.
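One way to read outlier_perc is that only the highest-scoring share of features contributes to an instance's score. The sketch below averages the top outlier_perc percent of sorted feature scores per instance; the function name top_percent_instance_score is hypothetical and the aggregation by mean is an assumption of this example.

```python
import numpy as np


def top_percent_instance_score(fscore, outlier_perc=100.0):
    """Average the largest outlier_perc percent of feature scores per instance."""
    fscore = np.asarray(fscore, dtype=float)
    # Flatten all feature dimensions so each row is one instance.
    flat = fscore.reshape(fscore.shape[0], -1)
    # Number of features to keep, at least one.
    n_top = max(1, int(np.ceil(outlier_perc / 100.0 * flat.shape[1])))
    # Sort ascending and keep the n_top largest scores in each row.
    top = np.sort(flat, axis=1)[:, -n_top:]
    return top.mean(axis=1)


fscore = np.array([[1.0, 2.0, 3.0, 4.0],
                   [0.0, 0.0, 0.0, 8.0]])
score_all = top_percent_instance_score(fscore, outlier_perc=100.0)
score_top_half = top_percent_instance_score(fscore, outlier_perc=50.0)
```

With a lower outlier_perc, an instance whose error is concentrated in a few features (the second row here) scores higher than it would under a plain average.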
- predict(X, outlier_type='instance', outlier_perc=100.0, batch_size=10000000000, return_feature_score=True, return_instance_score=True)[source]
Predict whether instances are outliers or not.
- Parameters:
X (ndarray) – Batch of instances.
outlier_type (str) – Predict outliers at the ‘feature’ or ‘instance’ level.
outlier_perc (float) – Percentage of sorted feature level outlier scores used to predict instance level outlier.
batch_size (int) – Batch size used when making predictions with the VAE.
return_feature_score (bool) – Whether to return feature level outlier scores.
return_instance_score (bool) – Whether to return instance level outlier scores.
- Return type:
- Returns:
Dictionary containing 'meta' and 'data' dictionaries. 'meta' has the model’s metadata. 'data' contains the outlier predictions and both feature and instance level outlier scores.
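The shape of the returned dictionary can be sketched as follows for instance level predictions. The helper assemble_prediction is hypothetical, and the key names inside 'meta' and 'data' are assumptions chosen for illustration; only the top-level 'meta' and 'data' keys come from the description above.

```python
import numpy as np


def assemble_prediction(instance_score, feature_score, threshold):
    """Build a 'meta'/'data' dictionary from precomputed scores."""
    instance_score = np.asarray(instance_score, dtype=float)
    feature_score = np.asarray(feature_score, dtype=float)
    return {
        # Illustrative metadata fields (assumed names).
        'meta': {'name': 'OutlierVAE', 'data_type': None},
        'data': {
            # 1 marks an instance whose score exceeds the threshold.
            'is_outlier': (instance_score > threshold).astype(int),
            'instance_score': instance_score,
            'feature_score': feature_score,
        },
    }


feature_score = np.array([[0.1, 0.1], [0.2, 1.6]])
instance_score = feature_score.mean(axis=1)
preds = assemble_prediction(instance_score, feature_score, threshold=0.5)
```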