alibi_detect.od.sr module

class alibi_detect.od.sr.Padding(value)[source]

Bases: str, Enum

An enumeration.

CONSTANT = 'constant'

REFLECT = 'reflect'

REPLICATE = 'replicate'

__format__(format_spec): Returns format using actual value type unless __str__ has been overridden.

class alibi_detect.od.sr.Side(value)[source]

Bases: str, Enum

An enumeration.

BILATERAL = 'bilateral'

LEFT = 'left'

RIGHT = 'right'

__format__(format_spec): Returns format using actual value type unless __str__ has been overridden.
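
Because both Padding and Side derive from str as well as Enum, their members compare equal to the plain strings accepted by the detector's arguments. The sketch below re-declares Padding locally (a stand-in mirroring the members listed above, so it runs without the library installed) to show that behaviour.

```python
from enum import Enum


class Padding(str, Enum):
    """Local stand-in mirroring the documented members (illustration only)."""
    CONSTANT = "constant"
    REFLECT = "reflect"
    REPLICATE = "replicate"


# Looking a member up by value returns the singleton member.
member = Padding("reflect")

# Mixing in str means a member compares equal to its raw string value.
is_equal_to_string = member == "reflect"

# Collecting the values gives the list of accepted option strings.
allowed = [p.value for p in Padding]
```

The same pattern applies to Side with the values 'bilateral', 'left' and 'right'.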

class alibi_detect.od.sr.SpectralResidual(threshold=None, window_amp=None, window_local=None, padding_amp_method='reflect', padding_local_method='reflect', padding_amp_side='bilateral', n_est_points=None, n_grad_points=5)[source]

Bases: BaseDetector, ThresholdMixin

__init__(threshold=None, window_amp=None, window_local=None, padding_amp_method='reflect', padding_local_method='reflect', padding_amp_side='bilateral', n_est_points=None, n_grad_points=5)[source]

Outlier detector for time-series data using the spectral residual algorithm. Based on “Time-Series Anomaly Detection Service at Microsoft” (Ren et al., 2019) https://arxiv.org/abs/1906.03821.

Parameters:

threshold (Optional[float]) – Threshold used to classify outliers, expressed as the relative distance of the saliency map from its moving average.
window_amp (Optional[int]) – Window for the average log amplitude.
window_local (Optional[int]) – Window for the local average of the saliency map. Note that the average is taken over the window_local data points preceding the current index.
padding_amp_method (Literal['constant', 'replicate', 'reflect']) –
Padding method to be used prior to each convolution over log amplitude. Possible values: constant | replicate | reflect. Default value: reflect.
- constant - padding with constant 0.
- replicate - repeats the last/extreme value.
- reflect - reflects the time series.
padding_local_method (Literal['constant', 'replicate', 'reflect']) –
Padding method to be used prior to each convolution over saliency map. Possible values: constant | replicate | reflect. Default value: reflect.
- constant - padding with constant 0.
- replicate - repeats the last/extreme value.
- reflect - reflects the time series.
padding_amp_side (Literal['bilateral', 'left', 'right']) – Whether to pad the amplitudes on both sides or only on one side. Possible values: bilateral | left | right.
n_est_points (Optional[int]) – Number of estimated points padded to the end of the sequence.
n_grad_points (int) – Number of points used for the gradient estimation of the additional points padded to the end of the sequence.

add_est_points(X, t)[source]

Pad the time series with additional points since the method works better if the anomaly point is towards the center of the sliding window.

Parameters:

X (ndarray) – Uniformly sampled time series instances.
t (ndarray) – Equidistant timestamps corresponding to each input instance (i.e., the array should contain numerical values in increasing order).

Return type:

ndarray

Returns:

Padded version of X.

compute_grads(X, t)[source]

Slope of the straight line between different points of the time series multiplied by the average time step size.

Parameters:

X (ndarray) – Uniformly sampled time series instances.
t (ndarray) – Equidistant timestamps corresponding to each input instance (i.e., the array should contain numerical values in increasing order).

Return type:

ndarray

Returns:

Array with slope values.
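
The gradient and end-padding helpers above can be sketched with NumPy as follows. This is a minimal illustration, assuming the extrapolation rule from Ren et al. (2019): average the slopes between the last point and each of the preceding n_grad_points points, extrapolate one point from that average, and repeat it n_est_points times. The function names are hypothetical and the library's implementation may differ in details.

```python
import numpy as np


def mean_slope_sketch(x, t, n_grad_points=5):
    """Average slope from the last point to each of the preceding
    n_grad_points points, scaled by the mean time step (hypothetical helper)."""
    dt = np.mean(np.diff(t))
    back = np.arange(1, n_grad_points + 1)
    slopes = (x[-1] - x[-1 - back]) / (t[-1] - t[-1 - back])
    return np.mean(slopes * dt)


def add_est_points_sketch(x, t, n_est_points=3, n_grad_points=5):
    """Append n_est_points copies of one extrapolated value (assumed rule)."""
    g = mean_slope_sketch(x, t, n_grad_points)
    # Extrapolate from the point n_grad_points - 1 steps before the last one.
    est = x[-n_grad_points] + g * n_grad_points
    return np.concatenate([x, np.full(n_est_points, est)])


t = np.arange(10, dtype=float)
x = 2.0 * t  # a straight line, so the extrapolated value is easy to check
padded = add_est_points_sketch(x, t)
```

For a perfectly linear series the estimated point is simply the next value on the line, which makes the sketch easy to sanity-check.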

infer_threshold(X, t=None, threshold_perc=95.0)[source]

Update threshold by a value inferred from the percentage of instances considered to be outliers in a sample of the dataset.

Parameters:

X (ndarray) – Uniformly sampled time series instances.
t (Optional[ndarray]) – Equidistant timestamps corresponding to each input instance (i.e., the array should contain numerical values in increasing order). If not provided, the timestamps will be replaced by an array of integers [0, 1, … , N - 1], where N is the size of the input time series.
threshold_perc (float) – Percentage of X considered to be normal based on the outlier score.

Return type:

None
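
Conceptually, inferring the threshold amounts to taking a percentile of the outlier scores. The sketch below assumes the threshold is the threshold_perc-th percentile of the scores, so that roughly 100 - threshold_perc percent of the sample is flagged; the score array is made up for illustration.

```python
import numpy as np

# Hypothetical outlier scores for a sample of 100 instances.
scores = np.arange(100, dtype=float)
threshold_perc = 95.0

# Assumed rule: the threshold is the given percentile of the scores.
threshold = np.percentile(scores, threshold_perc)

# Instances scoring above the threshold would be flagged as outliers.
n_flagged = int((scores > threshold).sum())
```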

static pad_same(X, W, method='replicate', side='bilateral')[source]

Adds padding to the time series X such that after applying a valid convolution with a kernel/filter W, the resulting time series has the same shape as the input X.

Parameters:

X (ndarray) – Time series to be padded.
W (ndarray) – Convolution kernel/filter.
method (str) –
Padding method to be used. Possible values:
- constant - padding with constant 0.
- replicate - repeats the last/extreme value.
- reflect - reflects the time series.
side (str) –
Whether to pad the time series on both sides or only on one side. Possible values:
- bilateral - time series is padded on both sides.
- left - time series is padded only on the left hand side.
- right - time series is padded only on the right hand side.

Return type:

ndarray

Returns:

Padded time series.
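
The idea behind "same" padding can be sketched with numpy.pad. This is an illustration rather than the library's code: it assumes a total padding of len(W) - 1 points and, for bilateral padding with an odd total, places the extra point on the left. The function name is hypothetical.

```python
import numpy as np


def pad_same_sketch(x, w, method="replicate", side="bilateral"):
    """Pad 1-D x so that a valid convolution with w keeps len(x) (sketch)."""
    # Map the documented option names onto numpy.pad modes.
    modes = {"constant": "constant", "replicate": "edge", "reflect": "reflect"}
    if method not in modes:
        raise ValueError(f"Unknown padding method: {method}")

    total = len(w) - 1
    if side == "bilateral":
        right = total // 2
        left = total - right  # assumption: extra point goes on the left
    elif side == "left":
        left, right = total, 0
    elif side == "right":
        left, right = 0, total
    else:
        raise ValueError(f"Unknown padding side: {side}")

    # numpy.pad fills with 0 by default in 'constant' mode.
    return np.pad(x, (left, right), mode=modes[method])


x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
w = np.ones(3) / 3

padded = pad_same_sketch(x, w, method="reflect")
smoothed = np.convolve(padded, w, mode="valid")
left_replicated = pad_same_sketch(x, w, method="replicate", side="left")
```

After padding, the valid convolution returns one value per original sample, which is the property the method is designed to guarantee.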

predict(X, t=None, return_instance_score=True)[source]

Compute outlier scores and transform into outlier predictions.

Parameters:

X (ndarray) – Uniformly sampled time series instances.
t (Optional[ndarray]) – Equidistant timestamps corresponding to each input instance (i.e., the array should contain numerical values in increasing order). If not provided, the timestamps will be replaced by an array of integers [0, 1, … , N - 1], where N is the size of the input time series.
return_instance_score (bool) – Whether to return instance level outlier scores.

Return type:

Dict[Dict[str, str], Dict[ndarray, ndarray]]

Returns:

Dictionary containing meta and data dictionaries.
- meta - has the model’s metadata.
- data - contains the outlier predictions and instance level outlier scores.
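
The shape of the returned dictionary can be pictured with the sketch below. The key names 'is_outlier' and 'instance_score', the contents of meta, and the "score above threshold" rule are assumptions made for illustration and are not confirmed by this page.

```python
import numpy as np

# Hypothetical instance-level outlier scores and threshold.
scores = np.array([0.1, 0.3, 4.2, 0.2])
threshold = 1.0

# Assumed layout: a 'meta' dictionary plus a 'data' dictionary holding
# binary predictions and, optionally, the raw scores.
result = {
    "meta": {"name": "SpectralResidual"},
    "data": {
        "is_outlier": (scores > threshold).astype(int),
        "instance_score": scores,
    },
}

outlier_indices = np.flatnonzero(result["data"]["is_outlier"])
```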

saliency_map(X)[source]

Compute saliency map.

Parameters:

X (ndarray) – Uniformly sampled time series instances.

Return type:

ndarray

Returns:

Array with saliency map values.
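
For orientation, the spectral residual transform described in Ren et al. (2019) can be sketched in a few lines of NumPy. This is a simplified illustration, not the library's implementation: the function name, the fixed reflect padding and the small epsilon inside the logarithm are assumptions.

```python
import numpy as np


def saliency_map_sketch(x, window_amp=3):
    """Spectral residual saliency map of a 1-D series (illustrative sketch)."""
    spectrum = np.fft.fft(x)
    log_amp = np.log(np.abs(spectrum) + 1e-8)  # epsilon avoids log(0)
    phase = np.angle(spectrum)

    # Moving average of the log amplitude, padded so the length is preserved.
    total = window_amp - 1
    right = total // 2
    left = total - right
    kernel = np.ones(window_amp) / window_amp
    padded = np.pad(log_amp, (left, right), mode="reflect")
    avg_log_amp = np.convolve(padded, kernel, mode="valid")

    # The spectral residual is what remains after removing the smooth part.
    residual = log_amp - avg_log_amp

    # Back to the time domain, keeping the original phase.
    return np.abs(np.fft.ifft(np.exp(residual + 1j * phase)))


# A sine wave with four full periods and a single injected spike.
n = np.arange(128)
x = np.sin(2 * np.pi * 4 * n / 128)
x[60] += 5.0

sal = saliency_map_sketch(x)
spike_index = int(np.argmax(sal))
```

In this toy series the saliency map peaks at the injected spike, while the regular sine component is largely suppressed.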

score(X, t=None)[source]

Compute outlier scores.

Parameters:

X (ndarray) – Uniformly sampled time series instances.
t (Optional[ndarray]) – Equidistant timestamps corresponding to each input instance (i.e., the array should contain numerical values in increasing order). If not provided, the timestamps will be replaced by an array of integers [0, 1, … , N - 1], where N is the size of the input time series.

Return type:

ndarray

Returns:

Array with outlier scores for each instance in the batch.
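
One way to read the score, consistent with the threshold and window_local descriptions in the constructor, is as the relative distance between each saliency value and the average of the window_local values preceding it. The sketch below assumes exactly that; the function name, the reflect padding at the start and the epsilon in the denominator are illustrative choices.

```python
import numpy as np


def relative_score_sketch(sal, window_local=4, eps=1e-10):
    """Relative distance of each saliency value from the mean of the
    preceding window_local values (illustrative sketch)."""
    kernel = np.ones(window_local) / window_local
    # Pad on the left only and drop the last point, so that position i
    # averages sal[i - window_local:i], i.e. strictly preceding values.
    padded = np.pad(sal, (window_local, 0), mode="reflect")[:-1]
    local_avg = np.convolve(padded, kernel, mode="valid")
    return (sal - local_avg) / (local_avg + eps)


# Flat saliency map with one salient point at index 10.
sal = np.ones(20)
sal[10] = 5.0

scores = relative_score_sketch(sal, window_local=4)
top = int(np.argmax(scores))
```

Here the salient point scores about 4 (it is five times the preceding average of 1), and the points right after it score below zero because the spike inflates their local average.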