alibi_detect.od.prophet module

class alibi_detect.od.prophet.OutlierProphet(threshold=0.8, growth='linear', cap=None, holidays=None, holidays_prior_scale=10.0, country_holidays=None, changepoint_prior_scale=0.05, changepoint_range=0.8, seasonality_mode='additive', daily_seasonality='auto', weekly_seasonality='auto', yearly_seasonality='auto', add_seasonality=None, seasonality_prior_scale=10.0, uncertainty_samples=1000, mcmc_samples=0)

Bases: BaseDetector, FitMixin

__init__(threshold=0.8, growth='linear', cap=None, holidays=None, holidays_prior_scale=10.0, country_holidays=None, changepoint_prior_scale=0.05, changepoint_range=0.8, seasonality_mode='additive', daily_seasonality='auto', weekly_seasonality='auto', yearly_seasonality='auto', add_seasonality=None, seasonality_prior_scale=10.0, uncertainty_samples=1000, mcmc_samples=0)

Outlier detector for time series data using fbprophet. See https://facebook.github.io/prophet/ for more details.

Parameters:

threshold (float) – Width of the forecast's uncertainty interval, used as the outlier threshold. Equivalent to Prophet's interval_width. An instance that lies outside the uncertainty interval is flagged as an outlier. If mcmc_samples equals 0, the interval covers only the uncertainty in the trend, using the MAP estimate of the extrapolated model. If mcmc_samples > 0, uncertainty over all parameters is used.
growth (str) – ‘linear’ or ‘logistic’ to specify a linear or logistic trend.
cap (Optional[float]) – Growth cap in case growth equals ‘logistic’.
holidays (Optional[DataFrame]) – pandas DataFrame with columns holiday (string) and ds (dates) and optionally columns lower_window and upper_window which specify a range of days around the date to be included as holidays.
holidays_prior_scale (float) – Parameter controlling the strength of the holiday components model. Higher values let the holiday effects fit more flexibly, which makes the model more prone to overfitting.
country_holidays (Optional[str]) – Include country-specific holidays via country abbreviations. The holidays for each country are provided by the holidays package in Python. A list of available countries and the country name to use is available at https://github.com/dr-prodigy/python-holidays. Additionally, Prophet includes holidays for: Brazil (BR), Indonesia (ID), India (IN), Malaysia (MY), Vietnam (VN), Thailand (TH), Philippines (PH), Turkey (TU), Pakistan (PK), Bangladesh (BD), Egypt (EG), China (CN) and Russia (RU).
changepoint_prior_scale (float) – Parameter controlling the flexibility of the automatic changepoint selection. Large values will allow many changepoints, potentially leading to overfitting.
changepoint_range (float) – Proportion of the history in which trend changepoints will be estimated. Higher values mean more changepoints, potentially leading to overfitting.
seasonality_mode (str) – Either ‘additive’ or ‘multiplicative’.
daily_seasonality (Union[str, bool, int]) – Can be ‘auto’, True, False, or a number of Fourier terms to generate.
weekly_seasonality (Union[str, bool, int]) – Can be ‘auto’, True, False, or a number of Fourier terms to generate.
yearly_seasonality (Union[str, bool, int]) – Can be ‘auto’, True, False, or a number of Fourier terms to generate.
add_seasonality (Optional[List]) – Manually add one or more seasonality components. Pass a list of dicts containing the keys name, period, fourier_order (obligatory), prior_scale and mode (optional).
seasonality_prior_scale (float) – Parameter controlling the strength of the seasonality model. Larger values allow the model to fit larger seasonal fluctuations, potentially leading to overfitting.
uncertainty_samples (int) – Number of simulated draws used to estimate uncertainty intervals.
mcmc_samples (int) – If >0, will do full Bayesian inference with the specified number of MCMC samples. If 0, will do MAP estimation.
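The add_seasonality argument takes plain dictionaries, so a misspelled key is easy to miss. The following is a minimal sketch of assembling the constructor arguments before creating the detector. The helper check_seasonality, the "monthly" component and all numeric values are illustrative assumptions, not part of alibi_detect.

```python
# Keys taken from the add_seasonality description above.
REQUIRED_KEYS = {"name", "period", "fourier_order"}
OPTIONAL_KEYS = {"prior_scale", "mode"}


def check_seasonality(spec):
    """Hypothetical helper: validate one add_seasonality entry and return it."""
    keys = set(spec)
    missing = REQUIRED_KEYS - keys
    unknown = keys - REQUIRED_KEYS - OPTIONAL_KEYS
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if unknown:
        raise ValueError(f"unknown keys: {sorted(unknown)}")
    return spec


# Illustrative values only; tune them for your own data.
detector_kwargs = {
    "threshold": 0.9,
    "growth": "linear",
    "seasonality_mode": "additive",
    "add_seasonality": [
        check_seasonality({"name": "monthly", "period": 30.5, "fourier_order": 5}),
    ],
    "mcmc_samples": 0,  # 0 selects MAP estimation
}
```

With alibi_detect and Prophet installed, these arguments could then be passed as OutlierProphet(**detector_kwargs).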

fit(df)

Fit Prophet model on normal (inlier) data.

Parameters: df (DataFrame) – DataFrame with a ds column of timestamps and a y column of target values.
Return type: None
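As a rough illustration of the expected input layout, the sketch below generates synthetic daily records with a ds timestamp and a y value. The helper make_series and the sine-shaped weekly cycle are assumptions made up for this example; real training data should contain only normal (inlier) observations.

```python
import math
from datetime import date, timedelta


def make_series(start, n_days, level=10.0, weekly_amplitude=2.0):
    """Hypothetical helper: synthetic daily records with a weekly cycle."""
    records = []
    for i in range(n_days):
        day = start + timedelta(days=i)
        # Smooth 7-day cycle around a constant level.
        y = level + weekly_amplitude * math.sin(2 * math.pi * i / 7)
        records.append({"ds": day.isoformat(), "y": y})
    return records


train_records = make_series(date(2023, 1, 1), n_days=90)
```

Wrapping the list in pandas.DataFrame(train_records) gives a frame with the ds and y columns, which can then be passed to fit.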

predict(df, return_instance_score=True, return_forecast=True)

Compute outlier scores and transform into outlier predictions.

Parameters:

df (DataFrame) – DataFrame with a ds column of timestamps and a y column of values to be checked for outliers.
return_instance_score (bool) – Whether to return instance level outlier scores.
return_forecast (bool) – Whether to return the model forecast.

Return type:

Dict[Dict[str, str], Dict[DataFrame, DataFrame]]

Returns:

Dictionary containing 'meta' and 'data' dictionaries:

'meta' has the model’s metadata.
'data' contains the outlier predictions, instance level outlier scores and the model forecast.
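To make the shape of such a result concrete, here is a small sketch that packs scores into a 'meta'/'data' dictionary and reads the flagged timestamps back out. It is not the library's code: pack_predictions is hypothetical, the key names is_outlier and instance_score are assumptions about the layout, plain lists stand in for DataFrames, and the rule "positive score means outlier" is likewise assumed.

```python
def pack_predictions(timestamps, scores, meta, return_instance_score=True):
    """Hypothetical sketch of a 'meta'/'data' result; not the library's code."""
    data = {
        # Assumed rule: a positive score means the value left the interval.
        "is_outlier": [
            {"ds": ds, "is_outlier": int(s > 0)} for ds, s in zip(timestamps, scores)
        ],
    }
    if return_instance_score:
        data["instance_score"] = [
            {"ds": ds, "instance_score": s} for ds, s in zip(timestamps, scores)
        ]
    return {"meta": meta, "data": data}


# Made-up timestamps, scores and metadata for illustration.
result = pack_predictions(
    ["2023-04-01", "2023-04-02"], [-0.4, 1.3], meta={"name": "OutlierProphet"}
)
flagged = [row["ds"] for row in result["data"]["is_outlier"] if row["is_outlier"]]
```

Checking which keys are present in the 'data' dictionary of a real result is the safest way to confirm the actual layout for the installed version.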

score(df)

Compute outlier scores.

Parameters: df (DataFrame) – DataFrame with a ds column of timestamps and a y column of values to be checked for outliers.
Return type: DataFrame
Returns: DataFrame with outlier scores for each instance in the batch.
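This page does not spell out how the score is computed. Since an instance outside the uncertainty interval is flagged as an outlier, one plausible reading is a signed distance to the nearest interval bound, positive outside the interval and negative inside it. The sketch below illustrates that assumption only; interval_score and is_outlier are hypothetical names, while yhat, yhat_lower and yhat_upper follow Prophet's forecast column names.

```python
def interval_score(y, yhat, yhat_lower, yhat_upper):
    """Signed distance of y to the interval bound on its side of the forecast.

    Assumed definition for illustration; positive means outside the interval.
    """
    if y >= yhat:
        return y - yhat_upper  # above the forecast: compare with the upper bound
    return yhat_lower - y      # below the forecast: compare with the lower bound


def is_outlier(score):
    """Flag an instance whose score is strictly positive."""
    return int(score > 0)


above = interval_score(y=12.0, yhat=10.0, yhat_lower=8.0, yhat_upper=11.0)
inside = interval_score(y=9.0, yhat=10.0, yhat_lower=8.0, yhat_upper=11.0)
below = interval_score(y=7.0, yhat=10.0, yhat_lower=8.0, yhat_upper=11.0)
```

Under this reading, a wider interval (a larger threshold) lowers every score by pushing both bounds outward, so fewer instances are flagged.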