alibi_detect.od.seq2seq module

class alibi_detect.od.seq2seq.OutlierSeq2Seq(n_features, seq_len, threshold=None, seq2seq=None, threshold_net=None, latent_dim=None, output_activation=None, beta=1.0)[source]

Bases: alibi_detect.base.BaseDetector, alibi_detect.base.FitMixin, alibi_detect.base.ThresholdMixin

__init__(n_features, seq_len, threshold=None, seq2seq=None, threshold_net=None, latent_dim=None, output_activation=None, beta=1.0)[source]

Seq2Seq-based outlier detector.

Parameters
  • n_features (int) – Number of features in the time series.

  • seq_len (int) – Sequence length fed into the Seq2Seq model.

  • threshold (Union[float, ndarray, None]) – Threshold used for outlier detection. Can be a float or feature-wise array.

  • seq2seq (Optional[Model]) – A trained seq2seq model if available.

  • threshold_net (Optional[Sequential]) – Layers of the threshold estimation network, wrapped in a tf.keras.Sequential class. Used if no ‘seq2seq’ is specified.

  • latent_dim (Optional[int]) – Latent dimension of the encoder and decoder.

  • output_activation (Optional[str]) – Activation used in the Dense output layer of the decoder.

  • beta (float) – Weight on the threshold estimation loss term.

Return type

None

feature_score(X_orig, X_recon, threshold_est)[source]

Compute feature level outlier scores.

Parameters
  • X_orig (ndarray) – Original time series.

  • X_recon (ndarray) – Reconstructed time series.

  • threshold_est (ndarray) – Estimated threshold from the decoder’s latent space.

Return type

ndarray

Returns

Feature level outlier scores. Scores above 0 are outliers.
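
Because scores above 0 count as outliers, the score must be measured relative to a threshold. The following is a minimal NumPy sketch of one way to produce such a score. It assumes a squared reconstruction error reduced by the estimated threshold and by a fixed detector threshold. The name `feature_score_sketch` and the exact formula are illustrative assumptions, not taken from the library source.

```python
import numpy as np

def feature_score_sketch(X_orig, X_recon, threshold_est, threshold=0.0):
    """Hypothetical stand-in: squared error minus estimated and fixed thresholds."""
    sq_err = (X_orig - X_recon) ** 2
    # Values above 0 mean the error exceeds the combined threshold.
    return sq_err - threshold_est - threshold

X_orig = np.array([[1.0, 2.0], [3.0, 10.0]])
X_recon = np.array([[1.0, 2.5], [3.0, 4.0]])
threshold_est = np.full_like(X_orig, 0.5)   # made-up estimated thresholds

fscore = feature_score_sketch(X_orig, X_recon, threshold_est)
outlier_mask = fscore > 0   # only the badly reconstructed value is flagged
```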

fit(X, loss_fn=tensorflow.keras.losses.mse, optimizer=tensorflow.keras.optimizers.Adam, epochs=20, batch_size=64, verbose=True, log_metric=None, callbacks=None)[source]

Train Seq2Seq model.

Parameters
  • X – Univariate or multivariate time series. Shape equals (batch, features) or (batch, sequence length, features).

  • loss_fn – Loss function used for training.

  • optimizer – Optimizer used for training.

  • epochs – Number of training epochs.

  • batch_size – Batch size used for training.

  • verbose – Whether to print training progress.

  • log_metric – Additional metrics whose progress will be displayed if verbose equals True.

  • callbacks – Callbacks used during training.
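
The two accepted shapes for X are related through the sequence length. As a rough illustration, a 2-D series can be cut into 3-D sequences as sketched below. The helper `to_sequences` is hypothetical, and the choice of consecutive, non-overlapping windows is an assumption. How the detector itself handles 2-D input is not stated here.

```python
import numpy as np

def to_sequences(X, seq_len):
    """Hypothetical helper: cut a (steps, features) array into (n_seq, seq_len, features)."""
    n_steps, n_features = X.shape
    n_seq = n_steps // seq_len              # an incomplete trailing window is dropped
    return X[: n_seq * seq_len].reshape(n_seq, seq_len, n_features)

X = np.arange(14, dtype=float).reshape(7, 2)   # 7 steps, 2 features
X_seq = to_sequences(X, seq_len=3)             # 2 sequences of length 3
```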

infer_threshold(X, outlier_perc=100.0, threshold_perc=95.0, batch_size=10000000000)[source]

Update the outlier threshold using a sequence of instances from the dataset for which the fraction of outlier features is known. This fraction can be specified across all features or per feature.

Parameters
  • X (ndarray) – Univariate or multivariate time series.

  • outlier_perc (Union[int, float]) – Percentage of sorted feature level outlier scores used to predict instance level outliers.

  • threshold_perc (Union[int, float, ndarray, list]) – Percentage of X considered to be normal based on the outlier score. Overall (float) or feature-wise (array or list).

  • batch_size (int) – Batch size used when making predictions with the seq2seq model.

Return type

None
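
The difference between an overall and a feature-wise `threshold_perc` can be illustrated with a percentile computation over feature level scores. This is only a sketch under stated assumptions: the helper `threshold_from_scores` is hypothetical, and the way the detector folds the resulting percentile into its stored threshold is not shown.

```python
import numpy as np

def threshold_from_scores(fscore, threshold_perc=95.0):
    """Hypothetical helper: percentile of scores, overall (scalar) or per feature (sequence)."""
    flat = fscore.reshape(-1, fscore.shape[-1])      # (n_points, n_features)
    if np.isscalar(threshold_perc):
        # One percentile over all features pooled together.
        return np.percentile(flat, threshold_perc)
    perc = np.asarray(threshold_perc, dtype=float)
    # One percentile per feature column.
    return np.array([np.percentile(flat[:, f], perc[f]) for f in range(flat.shape[1])])

# Made-up scores: feature 0 runs 0..100, feature 1 runs 0..200 in steps of 2.
fscore = np.stack([np.arange(101.0), 2 * np.arange(101.0)], axis=1)

overall = threshold_from_scores(fscore, 95.0)
per_feature = threshold_from_scores(fscore, [95.0, 50.0])
```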

instance_score(fscore, outlier_perc=100.0)[source]

Compute instance level outlier scores. Here, an instance is the data along the first axis of the original time series passed to the predictor.

Parameters
  • fscore (ndarray) – Feature level outlier scores.

  • outlier_perc (float) – Percentage of sorted feature level outlier scores used to predict instance level outliers.

Return type

ndarray

Returns

Instance level outlier scores.
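
One plausible reading of `outlier_perc` is that each instance's feature level scores are sorted and only the highest `outlier_perc` percent are aggregated. The sketch below follows that reading and uses the mean as the aggregate. Both the function name `instance_score_sketch` and the use of the mean are assumptions made for illustration.

```python
import numpy as np

def instance_score_sketch(fscore, outlier_perc=100.0):
    """Hypothetical stand-in: mean of the top `outlier_perc` percent of scores per instance."""
    flat = fscore.reshape(fscore.shape[0], -1)       # one row per instance
    n_keep = max(1, int(np.ceil(0.01 * outlier_perc * flat.shape[1])))
    top = np.sort(flat, axis=1)[:, -n_keep:]         # largest scores of each row
    return top.mean(axis=1)

fscore = np.array([[0.0, 1.0, 2.0, 9.0],
                   [-1.0, -1.0, -1.0, -1.0]])

iscore_all = instance_score_sketch(fscore, 100.0)    # uses every score
iscore_top = instance_score_sketch(fscore, 50.0)     # uses the top half only
```

With a lower percentage, a few strongly anomalous features are not averaged away by the many normal ones.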

predict(X, outlier_type='instance', outlier_perc=100.0, batch_size=10000000000, return_feature_score=True, return_instance_score=True)[source]

Compute outlier scores and transform into outlier predictions.

Parameters
  • X (ndarray) – Univariate or multivariate time series.

  • outlier_type (str) – Predict outliers at the ‘feature’ or ‘instance’ level.

  • outlier_perc (float) – Percentage of sorted feature level outlier scores used to predict instance level outliers.

  • batch_size (int) – Batch size used when making predictions with the seq2seq model.

  • return_feature_score (bool) – Whether to return feature level outlier scores.

  • return_instance_score (bool) – Whether to return instance level outlier scores.

Return type

Dict[Dict[str, str], Dict[ndarray, ndarray]]

Returns

  • Dictionary containing ‘meta’ and ‘data’ dictionaries.

  • ‘meta’ has the model’s metadata.

  • ‘data’ contains the outlier predictions and both feature and instance level outlier scores.
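
The step from scores to predictions, and the nested dictionary layout, can be sketched as follows. Thresholding at 0 follows the statement that scores above 0 are outliers. The key names inside ‘data’ (`is_outlier`, `feature_score`, `instance_score`) and the placeholder metadata are assumptions made for illustration, not confirmed by this page.

```python
import numpy as np

def predict_sketch(fscore, iscore, outlier_type="instance"):
    """Hypothetical stand-in: threshold scores at 0 and wrap them in 'meta'/'data' dicts."""
    if outlier_type not in ("instance", "feature"):
        raise ValueError("outlier_type must be 'instance' or 'feature'")
    scores = iscore if outlier_type == "instance" else fscore
    is_outlier = (scores > 0).astype(int)            # 1 marks an outlier
    return {
        "meta": {"name": "sketch"},                  # placeholder metadata
        "data": {
            "is_outlier": is_outlier,
            "feature_score": fscore,
            "instance_score": iscore,
        },
    }

fscore = np.array([[-0.5, 2.0], [-1.0, -0.2]])
iscore = fscore.mean(axis=1)                         # simple aggregate for the demo

by_instance = predict_sketch(fscore, iscore, "instance")
by_feature = predict_sketch(fscore, iscore, "feature")
```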

score(X, outlier_perc=100.0, batch_size=10000000000)[source]

Compute feature and instance level outlier scores.

Parameters
  • X (ndarray) – Univariate or multivariate time series.

  • outlier_perc (float) – Percentage of sorted feature level outlier scores used to predict instance level outliers.

  • batch_size (int) – Batch size used when making predictions with the seq2seq model.

Return type

Tuple[ndarray, ndarray]

Returns

Feature and instance level outlier scores.