alibi_detect.datasets module
- alibi_detect.datasets.corruption_types_cifar10c()[source]
Retrieve list with corruption types used in CIFAR-10-C.
- alibi_detect.datasets.fetch_attack(dataset, model, attack, return_X_y=False)[source]
Load adversarial instances for a given dataset, model and attack type.
- alibi_detect.datasets.fetch_cifar10c(corruption, severity, return_X_y=False)[source]
Fetch CIFAR-10-C data. Originally obtained from https://zenodo.org/record/2535967#.XkKh2XX7Qts and introduced in “Hendrycks, D. and Dietterich, T.G. Benchmarking Neural Network Robustness to Common Corruptions and Perturbations. In 7th International Conference on Learning Representations, 2019.”
- Parameters:
corruption (Union[str, List[str]]) – Corruption type. Options can be checked with corruption_types_cifar10c(). Alternatively, specify ‘all’ for all corruptions at a severity level.
severity (int) – Severity level of corruption (1-5).
return_X_y (bool) – Bool, whether to only return the data and target values or a Bunch object.
- Return type:
- Returns:
Bunch – Corrupted dataset with labels.
(corrupted data, target) – Tuple if ‘return_X_y’ equals True.
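The call pattern described above can be sketched as follows. This is a minimal illustration, not part of the library: the wrapper name load_cifar10c and the demo function are hypothetical, and the early check simply mirrors the documented 1-5 severity range. Running demo() assumes alibi-detect is installed and that the data can be downloaded.

```python
from typing import List, Union


def load_cifar10c(corruption: Union[str, List[str]], severity: int):
    """Hypothetical wrapper: fetch CIFAR-10-C as a (data, target) tuple."""
    # The reference documents severity levels 1-5, so reject anything else early.
    if not isinstance(severity, int) or not 1 <= severity <= 5:
        raise ValueError(f"severity must be an int in 1..5, got {severity!r}")
    # Imported lazily so the check above works even without the package installed.
    from alibi_detect.datasets import fetch_cifar10c

    return fetch_cifar10c(corruption=corruption, severity=severity, return_X_y=True)


def demo():
    """Not called automatically: needs alibi-detect and network access."""
    from alibi_detect.datasets import corruption_types_cifar10c

    available = corruption_types_cifar10c()  # list of corruption names
    X, y = load_cifar10c(available[:2], severity=3)
    print(len(X), len(y))
```

Passing return_X_y=True keeps the example short; with the default, the same data comes back wrapped in a Bunch object instead.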
- alibi_detect.datasets.fetch_ecg(return_X_y=False)[source]
Fetch ECG5000 data. The dataset contains 5000 ECGs, originally obtained from Physionet (https://archive.physionet.org/cgi-bin/atm/ATM) under the name “BIDMC Congestive Heart Failure Database (chfdb)”, record “chf07”.
- Parameters:
return_X_y (bool) – Bool, whether to only return the data and target values or a Bunch object.
- Return type:
Union[Bunch, Tuple[Tuple[ndarray, ndarray], Tuple[ndarray, ndarray]]]
- Returns:
Bunch – Train and test datasets with labels.
(train data, train target), (test data, test target) – Tuple of tuples if ‘return_X_y’ equals True.
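The nested tuple returned when ‘return_X_y’ equals True can be unpacked as in the sketch below. The helper split_sizes and the demo function are hypothetical names introduced for illustration; only fetch_ecg and its return_X_y argument come from the reference above. Running demo() assumes alibi-detect is installed and the data can be downloaded.

```python
def split_sizes(splits):
    """Return (n_train, n_test) for a ((X_train, y_train), (X_test, y_test)) tuple."""
    (X_train, y_train), (X_test, y_test) = splits
    # Each split should pair exactly one target value with each instance.
    if len(X_train) != len(y_train) or len(X_test) != len(y_test):
        raise ValueError("data and target lengths differ within a split")
    return len(X_train), len(X_test)


def demo():
    """Not called automatically: needs alibi-detect and network access."""
    from alibi_detect.datasets import fetch_ecg

    splits = fetch_ecg(return_X_y=True)
    print(split_sizes(splits))
```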
- alibi_detect.datasets.fetch_genome(return_X_y=False, return_labels=False)[source]
Load genome data including their labels and whether they are outliers or not. More details about the data can be found in the readme on https://console.cloud.google.com/storage/browser/seldon-datasets/genome/. The original data can be found here: https://drive.google.com/drive/folders/1Ht9xmzyYPbDouUTl_KQdLTJQYX2CuclR.
- Parameters:
- Return type:
- Returns:
Bunch – Training, validation and test data, whether they are outliers and optionally including the genome labels which are specified in the label_json key as a dictionary.
(data, outlier) or (data, outlier, target) – Tuple for the train, validation and test set with either the data and whether they are outliers or the data, outlier flag and labels for the genomes if ‘return_X_y’ equals True.
- alibi_detect.datasets.fetch_kdd(target=['dos', 'r2l', 'u2r', 'probe'], keep_cols=['srv_count', 'serror_rate', 'srv_serror_rate', 'rerror_rate', 'srv_rerror_rate', 'same_srv_rate', 'diff_srv_rate', 'srv_diff_host_rate', 'dst_host_count', 'dst_host_srv_count', 'dst_host_same_srv_rate', 'dst_host_diff_srv_rate', 'dst_host_same_src_port_rate', 'dst_host_srv_diff_host_rate', 'dst_host_serror_rate', 'dst_host_srv_serror_rate', 'dst_host_rerror_rate', 'dst_host_srv_rerror_rate'], percent10=True, return_X_y=False)[source]
KDD Cup ‘99 dataset. Detect computer network intrusions.
- Parameters:
- Return type:
- Returns:
Bunch – Dataset and outlier labels (0 means ‘normal’ and 1 means ‘outlier’).
(data, target) – Tuple if ‘return_X_y’ equals True.
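Because the target uses 0 for ‘normal’ and 1 for ‘outlier’, the share of outliers is just the mean of the labels. The sketch below is illustrative only: outlier_fraction and demo are hypothetical helpers, and the argument values passed to fetch_kdd are arbitrary subsets of the defaults shown in the signature. Running demo() assumes alibi-detect is installed and the data can be downloaded.

```python
def outlier_fraction(target):
    """Fraction of instances labelled 1 (outlier) in a sequence of 0/1 labels."""
    n = len(target)
    if n == 0:
        raise ValueError("target is empty")
    # int() also accepts NumPy integer scalars, so arrays work as well as lists.
    return sum(int(t) for t in target) / n


def demo():
    """Not called automatically: needs alibi-detect and network access."""
    from alibi_detect.datasets import fetch_kdd

    X, y = fetch_kdd(
        target=["dos", "probe"],  # subset of the default list
        keep_cols=["srv_count", "serror_rate", "dst_host_count"],
        percent10=True,
        return_X_y=True,
    )
    print(len(X), outlier_fraction(y))
```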
- alibi_detect.datasets.fetch_nab(ts, return_X_y=False)[source]
Get time series in a DataFrame from the Numenta Anomaly Benchmark: https://github.com/numenta/NAB.
- Parameters:
- Return type:
- Returns:
Bunch – Dataset and outlier labels (0 means ‘normal’ and 1 means ‘outlier’) in DataFrames with timestamps.
(data, target) – Tuple if ‘return_X_y’ equals True.
- alibi_detect.datasets.get_list_nab()[source]
Get list of possible time series to retrieve from the Numenta Anomaly Benchmark: https://github.com/numenta/NAB.
- Return type:
- Returns:
List with time series names.
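A typical workflow is to list the available series with get_list_nab() and pass one of the names to fetch_nab() as ts. The sketch below assumes, without confirmation from this reference, that names take a "group/series" form such as those in the NAB repository; series_in_group, demo and the group name "realKnownCause" are illustrative choices. Running demo() assumes alibi-detect is installed and the data can be downloaded.

```python
def series_in_group(names, group):
    """Keep names whose first '/'-separated component equals `group`.

    Assumes a 'group/series' naming scheme; names without '/' match only
    if they equal `group` exactly.
    """
    return [n for n in names if n.split("/", 1)[0] == group]


def demo():
    """Not called automatically: needs alibi-detect and network access."""
    from alibi_detect.datasets import fetch_nab, get_list_nab

    names = get_list_nab()
    candidates = series_in_group(names, "realKnownCause")
    # Fall back to the first listed series if the assumed group is absent.
    ts = candidates[0] if candidates else names[0]
    X, y = fetch_nab(ts, return_X_y=True)
    print(ts, len(X), len(y))
```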
- alibi_detect.datasets.google_bucket_list(url, folder, filetype=None, full_path=False)[source]
Retrieve list with items in google bucket folder.
- alibi_detect.datasets.load_url_arff(url, dtype=<class 'numpy.float32'>)[source]
Load arff files from url.
- Parameters:
url (str) – Address of arff file.
- Return type:
ndarray
- Returns:
Arrays with data and labels.
- alibi_detect.datasets.logger = <Logger alibi_detect.datasets (WARNING)>
Number of seconds to wait for URL requests before raising an error.