alibi_detect.od.prophet module

class alibi_detect.od.prophet.OutlierProphet(threshold=0.8, growth='linear', cap=None, holidays=None, holidays_prior_scale=10.0, country_holidays=None, changepoint_prior_scale=0.05, changepoint_range=0.8, seasonality_mode='additive', daily_seasonality='auto', weekly_seasonality='auto', yearly_seasonality='auto', add_seasonality=None, seasonality_prior_scale=10.0, uncertainty_samples=1000, mcmc_samples=0)[source]

Bases: BaseDetector, FitMixin

__init__(threshold=0.8, growth='linear', cap=None, holidays=None, holidays_prior_scale=10.0, country_holidays=None, changepoint_prior_scale=0.05, changepoint_range=0.8, seasonality_mode='additive', daily_seasonality='auto', weekly_seasonality='auto', yearly_seasonality='auto', add_seasonality=None, seasonality_prior_scale=10.0, uncertainty_samples=1000, mcmc_samples=0)[source]

Outlier detector for time series data using fbprophet. See https://facebook.github.io/prophet/ for more details.

Parameters:
  • threshold (float) – Width of the forecast's uncertainty interval, used as the outlier threshold. Equivalent to Prophet's interval_width. An instance that lies outside the uncertainty interval is flagged as an outlier. If mcmc_samples equals 0, the interval reflects only the uncertainty in the trend, using the MAP estimate of the extrapolated model. If mcmc_samples > 0, uncertainty over all parameters is used.

  • growth (str) – ‘linear’ or ‘logistic’ to specify a linear or logistic trend.

  • cap (Optional[float]) – Growth cap in case growth equals ‘logistic’.

  • holidays (Optional[DataFrame]) – pandas DataFrame with columns holiday (string) and ds (dates) and optionally columns lower_window and upper_window which specify a range of days around the date to be included as holidays.

  • holidays_prior_scale (float) – Parameter controlling the strength of the holiday components model. Higher values imply a more flexible model that is more prone to overfitting.

  • country_holidays (Optional[str]) – Include country-specific holidays via country abbreviations. The holidays for each country are provided by the holidays package in Python. A list of available countries and the country name to use is available at https://github.com/dr-prodigy/python-holidays. Additionally, Prophet includes holidays for: Brazil (BR), Indonesia (ID), India (IN), Malaysia (MY), Vietnam (VN), Thailand (TH), Philippines (PH), Turkey (TU), Pakistan (PK), Bangladesh (BD), Egypt (EG), China (CN) and Russia (RU).

  • changepoint_prior_scale (float) – Parameter controlling the flexibility of the automatic changepoint selection. Large values will allow many changepoints, potentially leading to overfitting.

  • changepoint_range (float) – Proportion of history in which trend changepoints will be estimated. Higher values mean more changepoints, potentially leading to overfitting.

  • seasonality_mode (str) – Either ‘additive’ or ‘multiplicative’.

  • daily_seasonality (Union[str, bool, int]) – Can be ‘auto’, True, False, or a number of Fourier terms to generate.

  • weekly_seasonality (Union[str, bool, int]) – Can be ‘auto’, True, False, or a number of Fourier terms to generate.

  • yearly_seasonality (Union[str, bool, int]) – Can be ‘auto’, True, False, or a number of Fourier terms to generate.

  • add_seasonality (Optional[List]) – Manually add one or more seasonality components. Pass a list of dicts with the required keys name, period and fourier_order, and the optional keys prior_scale and mode.

  • seasonality_prior_scale (float) – Parameter controlling the strength of the seasonality model. Larger values allow the model to fit larger seasonal fluctuations, potentially leading to overfitting.

  • uncertainty_samples (int) – Number of simulated draws used to estimate uncertainty intervals.

  • mcmc_samples (int) – If >0, will do full Bayesian inference with the specified number of MCMC samples. If 0, will do MAP estimation.
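As an illustration of how these arguments fit together, the sketch below assembles a keyword dictionary with a logistic trend and one custom seasonality. The helper names check_seasonality and build_detector, and the specific values (cap, period, Fourier order), are hypothetical choices for the example. The import is deferred into build_detector so the dictionary can be inspected without alibi-detect or Prophet installed.

```python
REQUIRED_KEYS = {"name", "period", "fourier_order"}
OPTIONAL_KEYS = {"prior_scale", "mode"}


def check_seasonality(entries):
    """Validate add_seasonality entries against the documented keys."""
    for entry in entries:
        keys = set(entry)
        missing = REQUIRED_KEYS - keys
        unknown = keys - REQUIRED_KEYS - OPTIONAL_KEYS
        if missing or unknown:
            raise ValueError(
                f"bad seasonality entry {entry!r}: "
                f"missing={sorted(missing)}, unknown={sorted(unknown)}"
            )
    return entries


# Illustrative values only; tune them for your own series.
detector_kwargs = {
    "threshold": 0.9,            # width of the uncertainty interval
    "growth": "logistic",        # logistic trend, so a cap is supplied
    "cap": 100.0,
    "seasonality_mode": "additive",
    "add_seasonality": check_seasonality([
        {"name": "monthly", "period": 30.5, "fourier_order": 5},
    ]),
}


def build_detector(kwargs):
    """Construct the detector; requires alibi-detect with Prophet support."""
    from alibi_detect.od import OutlierProphet
    return OutlierProphet(**kwargs)
```

With the optional dependencies installed, build_detector(detector_kwargs) would return an unfitted detector.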

fit(df)[source]

Fit Prophet model on normal (inlier) data.

Parameters:

df (DataFrame) – Dataframe with columns ds with timestamps and y with target values.

Return type:

None
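A minimal sketch of preparing the training data, assuming daily observations with a weekly pattern. The synthetic series, the helper names make_rows and fit_detector, and the chosen threshold are illustrative assumptions. The pandas and alibi-detect imports are deferred into fit_detector so the row construction can be checked on its own.

```python
import math
from datetime import datetime, timedelta


def make_rows(n_days=60, start=datetime(2021, 1, 1)):
    """Build records with the two columns fit expects: ds and y."""
    return [
        {
            "ds": start + timedelta(days=i),
            "y": 10.0 + 2.0 * math.sin(2 * math.pi * i / 7),
        }
        for i in range(n_days)
    ]


def fit_detector(rows, threshold=0.9):
    """Fit on inlier data; requires pandas and alibi-detect with Prophet."""
    import pandas as pd
    from alibi_detect.od import OutlierProphet

    df = pd.DataFrame(rows)   # columns: ds (timestamps), y (target values)
    od = OutlierProphet(threshold=threshold)
    od.fit(df)                # returns None; od is fitted in place
    return od


rows = make_rows()
```

Calling fit_detector(rows) would then return a detector ready for predict or score.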

predict(df, return_instance_score=True, return_forecast=True)[source]

Compute outlier scores and transform into outlier predictions.

Parameters:
  • df (DataFrame) – DataFrame with columns ds with timestamps and y with values which need to be flagged as outlier or not.

  • return_instance_score (bool) – Whether to return instance level outlier scores.

  • return_forecast (bool) – Whether to return the model forecast.

Return type:

Dict[Dict[str, str], Dict[DataFrame, DataFrame]]

Returns:

Dictionary containing 'meta' and 'data' dictionaries:

  • 'meta' has the model’s metadata.

  • 'data' contains the outlier predictions, instance level outlier scores and the model forecast.
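The helper below sketches one way to read the returned dictionary. It assumes that 'data' exposes an 'is_outlier' entry with ds and is_outlier columns, which this page does not spell out, so verify the key and column names against your installed version. The name flagged_timestamps is hypothetical, and the stand-in result uses plain lists in place of DataFrames so the example runs without a fitted model.

```python
def flagged_timestamps(preds):
    """Return the timestamps whose outlier flag is set."""
    frame = preds["data"]["is_outlier"]
    # Column-style indexing works for a DataFrame and for a dict of lists.
    return [ds for ds, flag in zip(frame["ds"], frame["is_outlier"]) if flag]


# Stand-in for a predict() result (assumed layout, see the note above).
fake_preds = {
    "meta": {"name": "OutlierProphet"},
    "data": {
        "is_outlier": {
            "ds": ["2021-03-01", "2021-03-02", "2021-03-03"],
            "is_outlier": [0, 1, 0],
        },
    },
}

print(flagged_timestamps(fake_preds))  # ['2021-03-02']
```

With a fitted detector od, the same helper would be called as flagged_timestamps(od.predict(df)).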

score(df)[source]

Compute outlier scores.

Parameters:

df (DataFrame) – DataFrame with columns ds with timestamps and y with values which need to be flagged as outlier or not.

Return type:

DataFrame

Returns:

DataFrame with outlier scores for each instance in the batch.
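To make the role of the uncertainty interval concrete, here is a sketch of an interval-based score. It assumes the score is the signed distance by which an observation exceeds the interval bound on its side of the forecast, so that positive values fall outside the interval. This page does not state the formula, so treat the sketch as an illustration rather than the library's definition. The function name and the sample numbers are hypothetical.

```python
def interval_score(y, yhat, yhat_lower, yhat_upper):
    """Signed distance beyond the interval bound on y's side of the forecast."""
    if y >= yhat:
        return y - yhat_upper   # positive only when y is above the upper bound
    return yhat_lower - y       # positive only when y is below the lower bound


# (y, yhat, yhat_lower, yhat_upper) tuples with made-up values
observations = [
    (10.2, 10.0, 9.0, 11.0),   # inside the interval
    (12.5, 10.0, 9.0, 11.0),   # 1.5 above the upper bound
    (8.0, 10.0, 9.0, 11.0),    # 1.0 below the lower bound
]

scores = [interval_score(*obs) for obs in observations]
is_outlier = [score > 0 for score in scores]
```

Under this assumption, a wider interval (a larger threshold) lowers every score and therefore flags fewer instances.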