alibi_detect.od.isolationforest module

class alibi_detect.od.isolationforest.IForest(threshold=None, n_estimators=100, max_samples='auto', max_features=1.0, bootstrap=False, n_jobs=1, data_type='tabular')

Bases: BaseDetector, FitMixin, ThresholdMixin

__init__(threshold=None, n_estimators=100, max_samples='auto', max_features=1.0, bootstrap=False, n_jobs=1, data_type='tabular')

Outlier detector for tabular data using isolation forests.

Parameters:
  • threshold (Optional[float]) – Threshold on the outlier score above which an instance is flagged as an outlier.

  • n_estimators (int) – Number of base estimators in the ensemble.

  • max_samples (Union[str, int, float]) – Number of samples to draw from the training data to train each base estimator. If int, draw ‘max_samples’ samples. If float, draw ‘max_samples * number of samples’ samples. If ‘auto’, max_samples = min(256, number of samples).

  • max_features (Union[int, float]) – Number of features to draw from the training data to train each base estimator. If int, draw ‘max_features’ features. If float, draw ‘max_features * number of features’ features.

  • bootstrap (bool) – Whether to fit individual trees on random subsets of the training data, sampled with replacement.

  • n_jobs (int) – Number of jobs to run in parallel for ‘fit’ and ‘predict’.

  • data_type (str) – Optionally specify the data type (tabular, image or time-series). Added to metadata.
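
The int/float/‘auto’ rules for max_samples and max_features can be sketched as below. This is a minimal illustration of the parameter descriptions above, not the library's code: resolve_max_samples and resolve_max_features are hypothetical helper names, and truncating fractional results with int() is an assumption.

```python
from typing import Union


def resolve_max_samples(max_samples: Union[str, int, float], n_samples: int) -> int:
    """Hypothetical helper: number of samples drawn for each base estimator."""
    if max_samples == "auto":
        return min(256, n_samples)
    if isinstance(max_samples, int):
        return max_samples
    # float: a fraction of the number of training samples (truncation is assumed)
    return int(max_samples * n_samples)


def resolve_max_features(max_features: Union[int, float], n_features: int) -> int:
    """Hypothetical helper: number of features drawn for each base estimator."""
    if isinstance(max_features, int):
        return max_features
    # float: a fraction of the number of features, at least one (assumed)
    return max(1, int(max_features * n_features))


print(resolve_max_samples("auto", 1000))   # capped at 256
print(resolve_max_samples(0.5, 1000))      # half of the training samples
print(resolve_max_features(1.0, 10))       # float 1.0 means all features
print(resolve_max_features(1, 10))         # int 1 means a single feature
```

Note the difference between max_features=1.0 (the default, all features) and max_features=1 (one feature per estimator).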

fit(X, sample_weight=None)

Fit isolation forest.

Parameters:
  • X (ndarray) – Training batch.

  • sample_weight (Optional[ndarray]) – Sample weights.

Return type:

None

infer_threshold(X, threshold_perc=95.0)

Update the threshold to a value inferred from the outlier scores of X, chosen so that threshold_perc percent of the instances are considered normal and the remainder are considered outliers.

Parameters:
  • X (ndarray) – Batch of instances.

  • threshold_perc (float) – Percentage of X considered to be normal based on the outlier score.

Return type:

None
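
One way to picture the threshold inference is as a percentile of the outlier scores. The sketch below uses NumPy on synthetic scores and is an assumption about the mechanics, not a copy of the library's implementation; the scores array stands in for the outlier scores of X.

```python
import numpy as np

# Synthetic stand-in for outlier scores of a batch X (assumption for illustration)
rng = np.random.default_rng(0)
scores = rng.normal(size=1000)

threshold_perc = 95.0

# The threshold is the score below which threshold_perc percent of instances fall
threshold = np.percentile(scores, threshold_perc)

# Instances scoring above the threshold would be treated as outliers
frac_flagged = float(np.mean(scores > threshold))
print(f"threshold={threshold:.3f}, fraction flagged={frac_flagged:.3f}")
```

With threshold_perc=95.0, roughly 5% of the batch ends up above the inferred threshold.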

predict(X, return_instance_score=True)

Compute outlier scores and transform into outlier predictions.

Parameters:
  • X (ndarray) – Batch of instances.

  • return_instance_score (bool) – Whether to return instance level outlier scores.

Return type:

Dict[Dict[str, str], Dict[ndarray, ndarray]]

Returns:

Dictionary containing 'meta' and 'data' dictionaries:

  • 'meta' has the model’s metadata.

  • 'data' contains the outlier predictions and instance level outlier scores.
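
The layout of the returned dictionary can be sketched as follows. The helper build_prediction is hypothetical, and the key names 'is_outlier' and 'instance_score', the 'meta' contents, and the "score above threshold" rule are assumptions made for illustration; check the keys on an actual prediction before relying on them.

```python
import numpy as np


def build_prediction(scores: np.ndarray, threshold: float,
                     return_instance_score: bool = True) -> dict:
    """Hypothetical helper mimicking the 'meta' / 'data' layout described above."""
    data = {
        # 1 where the score exceeds the threshold, 0 otherwise (assumed rule)
        "is_outlier": (scores > threshold).astype(int),
        # scores are only included when requested
        "instance_score": scores if return_instance_score else None,
    }
    meta = {"name": "IForest", "data_type": "tabular"}  # illustrative metadata
    return {"meta": meta, "data": data}


preds = build_prediction(np.array([0.1, 0.9, 0.4]), threshold=0.5)
print(preds["data"]["is_outlier"])
print(preds["data"]["instance_score"])
```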

score(X)

Compute outlier scores.

Parameters:

X (ndarray) – Batch of instances to analyze.

Return type:

ndarray

Returns:

Array with outlier scores for each instance in the batch.