alibi_detect.od.sr module
- class alibi_detect.od.sr.Padding(value)
An enumeration.
- CONSTANT = 'constant'
- REFLECT = 'reflect'
- REPLICATE = 'replicate'
- __format__(format_spec)
Returns format using actual value type unless __str__ has been overridden.
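The string values listed above suggest that the padding options can be passed either as enum members or as plain strings. A minimal sketch of that pattern follows; it assumes a `str` mixin and uses a hypothetical class name `PaddingSketch`, so it mirrors the documented members without being the library's own definition.

```python
from enum import Enum


class PaddingSketch(str, Enum):
    """Hypothetical stand-in mirroring the documented Padding members."""
    CONSTANT = 'constant'
    REFLECT = 'reflect'
    REPLICATE = 'replicate'


# Looking a member up by value turns a user-supplied string into an enum member.
method = PaddingSketch('reflect')

# With the (assumed) str mixin, members also compare equal to their string value.
is_same = PaddingSketch.REPLICATE == 'replicate'
```

An invalid string such as `PaddingSketch('mirror')` raises `ValueError`, which gives cheap validation of the `padding_*_method` arguments.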
- class alibi_detect.od.sr.Side(value)
An enumeration.
- BILATERAL = 'bilateral'
- LEFT = 'left'
- RIGHT = 'right'
- __format__(format_spec)
Returns format using actual value type unless __str__ has been overridden.
- class alibi_detect.od.sr.SpectralResidual(threshold=None, window_amp=None, window_local=None, padding_amp_method='reflect', padding_local_method='reflect', padding_amp_side='bilateral', n_est_points=None, n_grad_points=5)
Bases: BaseDetector, ThresholdMixin
- __init__(threshold=None, window_amp=None, window_local=None, padding_amp_method='reflect', padding_local_method='reflect', padding_amp_side='bilateral', n_est_points=None, n_grad_points=5)
Outlier detector for time-series data using the spectral residual algorithm. Based on “Time-Series Anomaly Detection Service at Microsoft” (Ren et al., 2019) https://arxiv.org/abs/1906.03821.
- Parameters:
threshold (Optional[float]) – Threshold used to classify outliers. Relative saliency map distance from the moving average.
window_amp (Optional[int]) – Window for the average log amplitude.
window_local (Optional[int]) – Window for the local average of the saliency map. The average is taken over the preceding window_local data points, i.e. it is a trailing average relative to the current index.
padding_amp_method (Literal['constant', 'replicate', 'reflect']) – Padding method to be used prior to each convolution over the log amplitude. Possible values: constant | replicate | reflect. Default value: reflect.
constant - padding with constant 0.
replicate - repeats the last/extreme value.
reflect - reflects the time series.
padding_local_method (Literal['constant', 'replicate', 'reflect']) – Padding method to be used prior to each convolution over the saliency map. Possible values: constant | replicate | reflect. Default value: reflect.
constant - padding with constant 0.
replicate - repeats the last/extreme value.
reflect - reflects the time series.
padding_amp_side (Literal['bilateral', 'left', 'right']) – Whether to pad the amplitudes on both sides or only on one side. Possible values: bilateral | left | right.
n_est_points (Optional[int]) – Number of estimated points padded to the end of the sequence.
n_grad_points (int) – Number of points used for the gradient estimation of the additional points padded to the end of the sequence.
- add_est_points(X, t)
Pad the time series with additional points since the method works better if the anomaly point is towards the center of the sliding window.
- Parameters:
X (ndarray) – Uniformly sampled time series instances.
t (ndarray) – Equidistant timestamps corresponding to each input instance (i.e., the array should contain numerical values in increasing order).
- Return type:
ndarray
- Returns:
Padded version of X.
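The padding step can be sketched as follows. This follows the extrapolation described in the referenced paper (Ren et al., 2019): one estimated value is computed from a gradient and appended `n_est_points` times. The function name `add_est_points_sketch` and the explicit `grad` argument are assumptions made for illustration; the library's implementation may differ in detail.

```python
def add_est_points_sketch(x, grad, n_grad_points, n_est_points):
    """Append n_est_points copies of one extrapolated value to x.

    grad is assumed to be the average per-step increment estimated from
    the last n_grad_points points (hypothetical argument, computed elsewhere).
    """
    if len(x) < n_grad_points:
        raise ValueError("series is shorter than n_grad_points")
    # Paper-style estimate: start n_grad_points from the end and
    # advance by n_grad_points average steps.
    estimate = x[-n_grad_points] + grad * n_grad_points
    return list(x) + [estimate] * n_est_points


# A perfectly linear series with step 2 is extended with the next value on the line.
padded = add_est_points_sketch([0.0, 2.0, 4.0, 6.0, 8.0, 10.0],
                               grad=2.0, n_grad_points=3, n_est_points=4)
```

Repeating the estimate moves the most recent real observation away from the edge of the window, which is the motivation given in the description above.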
- compute_grads(X, t)
Slope of the straight line between different points of the time series multiplied by the average time step size.
- Parameters:
X (ndarray) – Uniformly sampled time series instances.
t (ndarray) – Equidistant timestamps corresponding to each input instance (i.e., the array should contain numerical values in increasing order).
- Return type:
ndarray
- Returns:
Array with slope values.
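One possible reading of this computation is sketched below: slopes are taken between the last point and each of the preceding `n_grad_points` points, averaged, and rescaled by the mean timestamp spacing so the result is an increment per sample. The name `mean_step_gradient` and the exact choice of point pairs are assumptions, not the library's code.

```python
def mean_step_gradient(x, t, n_grad_points):
    """Average slope towards the last point, scaled by the mean time step."""
    if len(x) != len(t) or len(x) <= n_grad_points:
        raise ValueError("need matching x and t with more than n_grad_points entries")
    slopes = []
    for i in range(1, n_grad_points + 1):
        # Slope of the straight line from the i-th previous point to the last point.
        slopes.append((x[-1] - x[-1 - i]) / (t[-1] - t[-1 - i]))
    mean_slope = sum(slopes) / len(slopes)
    # Mean spacing of the (assumed equidistant) timestamps.
    mean_step = (t[-1] - t[0]) / (len(t) - 1)
    return mean_slope * mean_step


# Slope of 4 per time unit sampled every 0.5 time units: increment of 2 per sample.
x = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
t = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
grad = mean_step_gradient(x, t, n_grad_points=3)
```

Multiplying by the step size makes the result independent of the timestamp units, so the same value can be used to extrapolate by whole samples.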
- infer_threshold(X, t=None, threshold_perc=95.0)
Update threshold by a value inferred from the percentage of instances considered to be outliers in a sample of the dataset.
- Parameters:
X (ndarray) – Uniformly sampled time series instances.
t (Optional[ndarray]) – Equidistant timestamps corresponding to each input instance (i.e., the array should contain numerical values in increasing order). If not provided, the timestamps will be replaced by an array of integers [0, 1, … , N - 1], where N is the size of the input time series.
threshold_perc (float) – Percentage of X considered to be normal based on the outlier score.
- Return type:
None
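Inferring a threshold from `threshold_perc` amounts to taking a percentile of the outlier scores. The sketch below uses a hand-written linear-interpolation percentile so that it runs with the standard library only; the helper name `percentile` is an assumption and the library may compute the value differently.

```python
def percentile(values, perc):
    """Percentile with linear interpolation between the closest ranks."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    pos = (len(ordered) - 1) * perc / 100.0
    lower = int(pos)                        # floor, since pos >= 0
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


# With threshold_perc=95, roughly 5% of the scores end up above the threshold.
scores = [float(i) for i in range(101)]
threshold = percentile(scores, 95.0)
n_flagged = sum(s > threshold for s in scores)
```

A higher `threshold_perc` therefore yields a stricter detector that flags fewer instances.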
- static pad_same(X, W, method='replicate', side='bilateral')
Adds padding to the time series X such that, after applying a valid convolution with a kernel/filter W, the resulting time series has the same shape as the input X.
- Parameters:
X (ndarray) – Time series to be padded.
W (ndarray) – Convolution kernel/filter.
method (str) – Padding method to be used. Possible values:
constant - padding with constant 0.
replicate - repeats the last/extreme value.
reflect - reflects the time series.
side (str) – Whether to pad the time series on both sides (bilateral) or only on one side. Possible values:
bilateral - time series is padded on both sides.
left - time series is padded only on the left hand side.
right - time series is padded only on the right hand side.
- Return type:
ndarray
- Returns:
Padded time series.
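The idea of "same" padding can be sketched in plain Python: a valid convolution with a kernel of length k shortens the series by k - 1 points, so k - 1 points are added beforehand. The names `pad_same_sketch` and `valid_moving_average` are hypothetical, and two details are assumptions: how an odd total is split between the two sides, and that reflect mirrors around the edge value without repeating it.

```python
def pad_same_sketch(x, kernel_size, method="replicate", side="bilateral"):
    """Pad x with kernel_size - 1 points so a valid convolution keeps len(x)."""
    total = kernel_size - 1
    if side == "bilateral":
        left = total // 2            # assumed split; extra point goes right
        right = total - left
    elif side == "left":
        left, right = total, 0
    elif side == "right":
        left, right = 0, total
    else:
        raise ValueError(f"unknown side: {side}")
    if max(left, right) >= len(x):
        raise ValueError("padding width must be smaller than the series")

    if method == "constant":
        head, tail = [0.0] * left, [0.0] * right
    elif method == "replicate":
        head, tail = [x[0]] * left, [x[-1]] * right
    elif method == "reflect":
        # Mirror around the edge value without repeating it (assumption).
        head = [x[i] for i in range(left, 0, -1)]
        tail = [x[-1 - i] for i in range(1, right + 1)]
    else:
        raise ValueError(f"unknown method: {method}")
    return head + list(x) + tail


def valid_moving_average(x, kernel_size):
    """Valid-mode convolution with a uniform kernel of length kernel_size."""
    return [sum(x[i:i + kernel_size]) / kernel_size
            for i in range(len(x) - kernel_size + 1)]


series = [1.0, 2.0, 3.0, 4.0, 5.0]
smoothed = valid_moving_average(pad_same_sketch(series, 3, "reflect"), 3)
```

Left-only padding makes the moving average causal, since each output then depends only on the current and earlier points.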
- predict(X, t=None, return_instance_score=True)
Compute outlier scores and transform into outlier predictions.
- Parameters:
X (ndarray) – Uniformly sampled time series instances.
t (Optional[ndarray]) – Equidistant timestamps corresponding to each input instance (i.e., the array should contain numerical values in increasing order). If not provided, the timestamps will be replaced by an array of integers [0, 1, … , N - 1], where N is the size of the input time series.
return_instance_score (bool) – Whether to return instance level outlier scores.
- Return type:
Dict[Dict[str, str], Dict[ndarray, ndarray]]
- Returns:
Dictionary containing meta and data dictionaries:
meta - has the model’s metadata.
data - contains the outlier predictions and instance level outlier scores.
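Turning scores into predictions is a simple comparison against the threshold. The sketch below shows the general shape of such a result; the key names `is_outlier` and `instance_score`, the contents of `meta`, and the function name `predict_sketch` are assumptions for illustration and are not taken from this page.

```python
def predict_sketch(scores, threshold, return_instance_score=True):
    """Flag every score above the threshold and wrap the result in a dict."""
    data = {
        # 1 marks an outlier, 0 a normal instance (assumed encoding).
        "is_outlier": [int(s > threshold) for s in scores],
    }
    if return_instance_score:
        data["instance_score"] = list(scores)
    # Placeholder metadata; the real detector's metadata is not shown on this page.
    meta = {"name": "SpectralResidual"}
    return {"meta": meta, "data": data}


result = predict_sketch([0.1, 0.2, 3.5, 0.0], threshold=1.0)
flags = result["data"]["is_outlier"]
```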
- saliency_map(X)
Compute saliency map.
- Parameters:
X (ndarray) – Uniformly sampled time series instances.
- Return type:
ndarray
- Returns:
Array with saliency map values.
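The spectral residual transform from the referenced paper can be sketched with the standard library alone. The example uses a naive O(n^2) DFT in place of an FFT and a centered moving average with replicated edges for the average log amplitude; these choices, the helper names and the `eps` constant are assumptions for illustration rather than the library's implementation.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for f in range(n):
        acc = sum(x[j] * cmath.exp(sign * 2j * cmath.pi * f * j / n)
                  for j in range(n))
        out.append(acc / n if inverse else acc)
    return out


def moving_average_same(values, window):
    """Centered moving average; edges are padded by repeating the end values."""
    left = (window - 1) // 2
    right = window - 1 - left
    padded = [values[0]] * left + list(values) + [values[-1]] * right
    return [sum(padded[i:i + window]) / window for i in range(len(values))]


def saliency_map_sketch(x, window_amp=3, eps=1e-8):
    spectrum = dft(x)
    amplitude = [abs(c) for c in spectrum]
    phase = [cmath.phase(c) for c in spectrum]
    log_amp = [math.log(a + eps) for a in amplitude]   # eps avoids log(0)
    avg_log_amp = moving_average_same(log_amp, window_amp)
    # Spectral residual: log amplitude minus its local average.
    residual = [l - m for l, m in zip(log_amp, avg_log_amp)]
    # Back to the time domain, keeping the original phase.
    restored = [cmath.exp(complex(r, p)) for r, p in zip(residual, phase)]
    return [abs(c) for c in dft(restored, inverse=True)]


# A flat series with a single spike: the saliency map peaks at the spike.
signal = [1.0] * 32
signal[20] = 10.0
saliency = saliency_map_sketch(signal)
peak = max(range(len(saliency)), key=lambda i: saliency[i])
```

Subtracting the smoothed log amplitude removes the predictable part of the spectrum, so what remains in the time domain concentrates on the unexpected points.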
- score(X, t=None)
Compute outlier scores.
- Parameters:
X (ndarray) – Uniformly sampled time series instances.
t (Optional[ndarray]) – Equidistant timestamps corresponding to each input instance (i.e., the array should contain numerical values in increasing order). If not provided, the timestamps will be replaced by an array of integers [0, 1, … , N - 1], where N is the size of the input time series.
- Return type:
ndarray
- Returns:
Array with outlier scores for each instance in the batch.
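The threshold is described as a relative distance of the saliency map from its moving average, which suggests a score of the following form. This sketch assumes a trailing average over the preceding `window_local` points and a small `eps` to avoid division by zero; the function name `relative_scores` and the handling of the first point are assumptions.

```python
def relative_scores(saliency, window_local, eps=1e-8):
    """Relative distance of each saliency value from the trailing local average."""
    scores = []
    for i, value in enumerate(saliency):
        previous = saliency[max(0, i - window_local):i]
        # No history for the first point: compare it with itself (score 0).
        average = sum(previous) / len(previous) if previous else value
        scores.append((value - average) / (average + eps))
    return scores


# The jump from 1 to 5 is four times the local average, so its score is about 4.
outlier_scores = relative_scores([1.0, 1.0, 1.0, 1.0, 5.0, 1.0], window_local=3)
```

Because the score is relative, a threshold such as 1.0 reads as "more than twice the recent average saliency", independent of the scale of the series.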