alibi_detect.cd.fet module
- class alibi_detect.cd.fet.FETDrift(x_ref, p_val=0.05, x_ref_preprocessed=False, preprocess_at_init=True, update_x_ref=None, preprocess_fn=None, correction='bonferroni', alternative='greater', n_features=None, input_shape=None, data_type=None)
Bases: BaseUnivariateDrift
- __init__(x_ref, p_val=0.05, x_ref_preprocessed=False, preprocess_at_init=True, update_x_ref=None, preprocess_fn=None, correction='bonferroni', alternative='greater', n_features=None, input_shape=None, data_type=None)
Fisher exact test (FET) data drift detector, which tests for a change in the mean of binary univariate data. For multivariate data, a separate FET test is applied to each feature, and the obtained p-values are aggregated via the Bonferroni or False Discovery Rate (FDR) corrections.
- Parameters:
x_ref (Union[ndarray, list]) – Data used as reference distribution. Data must consist of either [True, False]’s, or [0, 1]’s.
p_val (float) – p-value used for significance of the FET test. If the FDR correction method is used, this corresponds to the acceptable q-value.
x_ref_preprocessed (bool) – Whether the given reference data x_ref has been preprocessed yet. If x_ref_preprocessed=True, only the test data x will be preprocessed at prediction time. If x_ref_preprocessed=False, the reference data will also be preprocessed.
preprocess_at_init (bool) – Whether to preprocess the reference data when the detector is instantiated. Otherwise, the reference data will be preprocessed at prediction time. Only applies if x_ref_preprocessed=False.
update_x_ref (Optional[Dict[str, int]]) – Reference data can optionally be updated to the last n instances seen by the detector or via reservoir sampling with size n. For the former, the parameter equals {‘last’: n} while for reservoir sampling {‘reservoir_sampling’: n} is passed.
preprocess_fn (Optional[Callable]) – Function to preprocess the data before computing the data drift metrics.
correction (str) – Correction type for multivariate data. Either ‘bonferroni’ or ‘fdr’ (False Discovery Rate).
alternative (str) – Defines the alternative hypothesis. Options are ‘greater’, ‘less’ or ‘two-sided’. These correspond to an increase, decrease, or any change in the mean of the Bernoulli data.
n_features (Optional[int]) – Number of features used in the FET test. No need to pass it if no preprocessing takes place. In case of a preprocessing step, this can also be inferred automatically but could be more expensive to compute.
data_type (Optional[str]) – Optionally specify the data type (tabular, image or time-series). Added to metadata.
- feature_score(x_ref, x)
Performs Fisher exact test(s), computing the p-value per feature.
- Parameters:
x_ref (ndarray) – Reference instances to compare distribution with. Data must consist of either [True, False]’s, or [0, 1]’s.
x (ndarray) – Batch of instances. Data must consist of either [True, False]’s, or [0, 1]’s.
- Return type:
Tuple[ndarray, ndarray]
- Returns:
Feature level p-values and odds ratios.
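To make the returned quantities concrete, here is a minimal standard-library sketch of a per-feature, one-sided ('greater') Fisher exact test on binary data. It is not the library's implementation: the helper names `fisher_greater` and `feature_scores`, the layout of the 2x2 table and the handling of a zero denominator in the odds ratio are all assumptions made for illustration.

```python
from math import comb


def fisher_greater(k_test, n_test, k_ref, n_ref):
    """One-sided ('greater') Fisher exact test for one binary feature.

    Hypothetical helper. Uses the 2x2 table (an assumed layout)
        [[k_test,          k_ref        ],
         [n_test - k_test, n_ref - k_ref]]
    and returns (p_value, sample_odds_ratio).
    """
    total = n_test + n_ref
    total_ones = k_test + k_ref
    # With the margins fixed, the number of ones in the test batch is
    # hypergeometric; the p-value is the upper tail starting at k_test.
    upper = min(n_test, total_ones)
    tail = sum(
        comb(total_ones, k) * comb(total - total_ones, n_test - k)
        for k in range(k_test, upper + 1)
    )
    p_value = tail / comb(total, n_test)
    # Sample odds ratio; reported as infinity when the denominator is zero.
    num = k_test * (n_ref - k_ref)
    den = k_ref * (n_test - k_test)
    odds_ratio = num / den if den > 0 else float('inf')
    return p_value, odds_ratio


def feature_scores(x_ref, x):
    """Apply the test to each feature (column) of two binary datasets."""
    n_ref, n_test = len(x_ref), len(x)
    p_vals, odds_ratios = [], []
    for f in range(len(x_ref[0])):
        k_ref = sum(int(row[f]) for row in x_ref)
        k_test = sum(int(row[f]) for row in x)
        p, o = fisher_greater(k_test, n_test, k_ref, n_ref)
        p_vals.append(p)
        odds_ratios.append(o)
    return p_vals, odds_ratios


# Toy data: feature 0 rises from 2/10 ones to 8/10, feature 1 stays at 5/10.
x_ref = [[1, 1]] * 2 + [[0, 1]] * 3 + [[0, 0]] * 5
x = [[1, 1]] * 5 + [[1, 0]] * 3 + [[0, 0]] * 2

p_vals, odds_ratios = feature_scores(x_ref, x)

# Bonferroni correction: compare each p-value against p_val / n_features.
threshold = 0.05 / len(p_vals)
drift_per_feature = [p < threshold for p in p_vals]
print(p_vals, odds_ratios, drift_per_feature)
```

In this toy case only the first feature falls below the Bonferroni threshold, which matches the intuition that its mean increased while the second feature's did not.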