alibi_detect.ad package

class alibi_detect.ad.AdversarialVAE(threshold=None, vae=None, model=None, encoder_net=None, decoder_net=None, latent_dim=None, samples=10, beta=0.0, data_type=None)[source]

Bases: alibi_detect.base.BaseDetector, alibi_detect.base.FitMixin, alibi_detect.base.ThresholdMixin

__init__(threshold=None, vae=None, model=None, encoder_net=None, decoder_net=None, latent_dim=None, samples=10, beta=0.0, data_type=None)[source]

VAE-based adversarial detector.

Parameters
  • threshold (Optional[float]) – Threshold used for adversarial score to determine adversarial instances.

  • vae (Optional[tensorflow.keras.Model]) – A trained tf.keras VAE model, if available.

  • model (Optional[tensorflow.keras.Model]) – A trained tf.keras classification model.

  • encoder_net (Optional[tensorflow.keras.Sequential]) – Layers for the encoder wrapped in a tf.keras.Sequential class if no ‘vae’ is specified.

  • decoder_net (Optional[tensorflow.keras.Sequential]) – Layers for the decoder wrapped in a tf.keras.Sequential class if no ‘vae’ is specified.

  • latent_dim (Optional[int]) – Dimensionality of the latent space.

  • samples (int) – Number of samples drawn to evaluate each instance.

  • beta (float) – Beta parameter for KL-divergence loss term.

  • data_type (Optional[str]) – Optionally specify the data type (tabular, image or time-series). Added to metadata.

Return type

None

fit(X, loss_fn=<function loss_adv_vae>, w_model=1.0, w_recon=0.0, optimizer=tensorflow.keras.optimizers.Adam, cov_elbo=None, epochs=20, batch_size=64, verbose=True, log_metric=None, callbacks=None)[source]

Train Adversarial VAE model.

Parameters
  • X (numpy.ndarray) – Training batch.

  • loss_fn (tensorflow.keras.losses) – Loss function used for training.

  • w_model (float) – Weight on model prediction loss term.

  • w_recon (float) – Weight on ELBO loss term.

  • optimizer (tensorflow.keras.optimizers) – Optimizer used for training.

  • cov_elbo (Optional[dict]) – Dictionary with covariance matrix options, used when the ELBO loss is applied. Use the full covariance matrix inferred from X (dict(cov_full=None)), only the variance (dict(cov_diag=None)), or a float representing the same standard deviation for each feature (e.g. dict(sim=.05)).

  • epochs (int) – Number of training epochs.

  • batch_size (int) – Batch size used for training.

  • verbose (bool) – Whether to print training progress.

  • log_metric (Optional[Tuple[str, tensorflow.keras.metrics]]) – Additional metrics whose progress will be displayed if verbose equals True.

  • callbacks (Optional[tensorflow.keras.callbacks]) – Callbacks used during training.

Return type

None
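
The three cov_elbo options described above can be illustrated with plain NumPy. This is only a sketch of what each option means for the covariance matrix, not the library's internal code; the helper name covariance_from_option is hypothetical, and flattening each instance into a feature vector is an assumption.

```python
import numpy as np


def covariance_from_option(X, cov_elbo):
    """Hypothetical helper: map a cov_elbo-style dict to a covariance matrix."""
    # Assumption: each instance is flattened into a single feature vector.
    X_flat = X.reshape(X.shape[0], -1).astype(np.float64)
    n_features = X_flat.shape[1]
    key, value = next(iter(cov_elbo.items()))
    if key == "cov_full":
        # Full covariance matrix inferred from X.
        return np.cov(X_flat, rowvar=False)
    if key == "cov_diag":
        # Only the per-feature variance, placed on the diagonal.
        return np.diag(X_flat.var(axis=0, ddof=1))
    if key == "sim":
        # The same standard deviation for every feature.
        return (value ** 2) * np.eye(n_features)
    raise ValueError(f"Unknown cov_elbo option: {key!r}")


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
full = covariance_from_option(X, dict(cov_full=None))
diag = covariance_from_option(X, dict(cov_diag=None))
sim = covariance_from_option(X, dict(sim=.05))
```

The diagonal option discards the correlations between features that the full matrix keeps, while sim ignores X altogether and uses one fixed scale.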

infer_threshold(X, threshold_perc=95.0)[source]

Update the threshold to a value inferred from a sample of the dataset, given the percentage of instances in that sample considered to be normal.

Parameters
  • X (numpy.ndarray) – Batch of instances.

  • threshold_perc (float) – Percentage of X considered to be normal based on the adversarial score.

Return type

None
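
A percentile-based threshold of this kind can be sketched with NumPy. The function name infer_threshold_from_scores is hypothetical, and the sketch assumes the threshold is simply the threshold_perc-th percentile of the adversarial scores computed on X; the detector's own implementation may differ in detail.

```python
import numpy as np


def infer_threshold_from_scores(scores, threshold_perc=95.0):
    """Hypothetical helper: threshold below which threshold_perc % of scores fall."""
    return float(np.percentile(scores, threshold_perc))


# Stand-in adversarial scores for 100 instances (1.0, 2.0, ..., 100.0).
scores = np.arange(1, 101, dtype=float)
threshold = infer_threshold_from_scores(scores, threshold_perc=95.0)

# Instances scoring above the threshold would be flagged as adversarial.
flagged = scores > threshold
```

With threshold_perc=95.0, roughly the highest-scoring 5% of the sample ends up above the threshold.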

predict(X, return_instance_score=True)[source]

Predict whether instances are adversarial.

Parameters
  • X (numpy.ndarray) – Batch of instances.

  • return_instance_score (bool) – Whether to return instance level adversarial scores.

Return type

Dict[Dict[str, str], Dict[numpy.ndarray, numpy.ndarray]]

Returns

  • Dictionary containing ‘meta’ and ‘data’ dictionaries.

  • ‘meta’ has the model’s metadata.

  • ‘data’ contains the adversarial predictions and instance level adversarial scores.
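
The shape of the returned dictionary can be sketched as follows. The key names 'is_adversarial' and 'instance_score', the contents of the metadata dictionary and the helper build_prediction are all assumptions made for illustration; only the top-level ‘meta’ and ‘data’ entries are stated above.

```python
import numpy as np


def build_prediction(scores, threshold, meta, return_instance_score=True):
    """Hypothetical helper: assemble a predict()-style result from scores."""
    # Assumed key name: 1 marks an instance whose score exceeds the threshold.
    data = {"is_adversarial": (scores > threshold).astype(int)}
    if return_instance_score:
        # Assumed key name for the instance-level adversarial scores.
        data["instance_score"] = scores
    return {"meta": meta, "data": data}


scores = np.array([0.01, 0.20, 0.03, 0.90])
meta = {"name": "AdversarialVAE", "data_type": "image"}  # illustrative metadata
preds = build_prediction(scores, threshold=0.1, meta=meta)
preds_no_scores = build_prediction(
    scores, threshold=0.1, meta=meta, return_instance_score=False
)
```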

score(X)[source]

Compute adversarial scores.

Parameters

X (numpy.ndarray) – Batch of instances to analyze.

Return type

numpy.ndarray

Returns

Array with adversarial scores for each instance in the batch.
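
How the score is computed is not spelled out above. One plausible reading, given the classification model and the samples parameter, is a divergence between the model's predictions on the original instances and on their VAE reconstructions, averaged over the sampled reconstructions. The sketch below assumes exactly that, using a KL divergence; both function names are hypothetical and the prediction arrays are made-up stand-ins for model outputs.

```python
import numpy as np


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) along the last axis, clipped for numerical stability."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return np.sum(p * np.log(p / q), axis=-1)


def adversarial_score_sketch(preds_orig, preds_recon):
    """Hypothetical score: mean KL between original and reconstructed predictions.

    preds_orig:  (n_instances, n_classes) class probabilities on X.
    preds_recon: (samples, n_instances, n_classes) class probabilities on
                 sampled reconstructions of X.
    """
    kl = kl_divergence(preds_orig[None, :, :], preds_recon)  # (samples, n_instances)
    return kl.mean(axis=0)


preds_orig = np.array([[0.9, 0.1], [0.5, 0.5]])
preds_recon = np.array([
    [[0.9, 0.1], [0.5, 0.5]],  # sample 1: predictions unchanged
    [[0.9, 0.1], [0.1, 0.9]],  # sample 2: second instance's prediction shifts
])
scores = adversarial_score_sketch(preds_orig, preds_recon)
```

Under this assumption, an instance whose prediction survives reconstruction scores close to zero, and one whose prediction shifts scores higher.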