alibi.explainers.anchor_text module

class alibi.explainers.anchor_text.AnchorText(nlp, predictor, seed=None)[source]

Bases: alibi.api.interfaces.Explainer

__init__(nlp, predictor, seed=None)[source]

Initialize anchor text explainer.

  • nlp (spacy.language.Language) – spaCy object.

  • predictor (Callable) – A callable that takes a tensor of N data points as inputs and returns N outputs.

  • seed (int) – If set, ensures identical random streams.
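
As a minimal sketch of the predictor contract described above, the callable below accepts N texts and returns N labels. The name toy_predictor and its keyword vocabulary are hypothetical. It returns a plain list to stay dependency-free, whereas a real predictor would typically wrap a trained model and return an array.

```python
from typing import List, Sequence

# Hypothetical vocabulary, used only for illustration.
POSITIVE_WORDS = {"good", "great", "excellent"}


def toy_predictor(texts: Sequence[str]) -> List[int]:
    """Return one label per input text: 1 if a positive word occurs, else 0."""
    return [
        int(any(word in POSITIVE_WORDS for word in text.lower().split()))
        for text in texts
    ]


# N inputs in, N outputs out.
predictions = toy_predictor(["a good film", "a dull film"])
```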

Return type


build_explanation(text, result, predicted_label, params)[source]

Uses the metadata returned by the anchor search algorithm together with the instance to be explained to build an explanation object.

  • text (str) – Instance to be explained.

  • result (dict) – Dictionary containing the search result and metadata.

  • predicted_label (int) – Label of the instance to be explained. Inferred if not received.

  • params (dict) – Parameters passed to explain

Return type



Compute the agreement between the classifier's prediction on the instance to be explained and its predictions on a set of samples in which a subset of features is fixed to given values (i.e. compute the precision of anchors).


samples (ndarray) – Samples whose labels are to be compared with the instance label.

Return type



A boolean array indicating whether the prediction was the same as the instance label.
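
The agreement check above can be sketched in plain Python. The helper names agreement_with_instance and anchor_precision, and the keyword_predictor stand-in, are assumptions made for illustration and are not part of the library. The sketch only shows how a boolean agreement array leads to a precision estimate.

```python
from typing import Callable, List, Sequence


def agreement_with_instance(
    predictor: Callable[[Sequence[str]], List[int]],
    samples: Sequence[str],
    instance_label: int,
) -> List[bool]:
    """True where the prediction on a sample equals the instance label."""
    return [label == instance_label for label in predictor(samples)]


def anchor_precision(agreement: Sequence[bool]) -> float:
    """Fraction of samples on which the prediction is unchanged."""
    return sum(agreement) / len(agreement)


def keyword_predictor(texts: Sequence[str]) -> List[int]:
    """Hypothetical classifier: label 1 if the word 'good' is present."""
    return [int("good" in text.split()) for text in texts]


samples = ["a good film", "a good UNK", "UNK good film", "a UNK film"]
agreement = agreement_with_instance(keyword_predictor, samples, instance_label=1)
precision = anchor_precision(agreement)
```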

explain(text, use_unk=True, use_similarity_proba=False, sample_proba=0.5, top_n=100, temperature=1.0, threshold=0.95, delta=0.1, tau=0.15, batch_size=100, coverage_samples=10000, beam_size=1, stop_on_first=True, max_anchor_size=None, min_samples_start=100, n_covered_ex=10, binary_cache_size=10000, cache_margin=1000, verbose=False, verbose_every=1, **kwargs)[source]

Explain instance and return anchor with metadata.

  • text (str) – Text instance to be explained.

  • use_unk (bool) – If True, perturbation distribution will replace words randomly with UNKs. If False, words will be replaced by similar words using word embeddings.

  • use_similarity_proba (bool) – Sample according to a similarity score with the corpus embeddings. use_unk needs to be False for this to be used.

  • sample_proba (float) – Sample probability if use_similarity_proba is False.

  • top_n (int) – Number of similar words to sample for perturbations. Only used when words are replaced by similar words (use_unk=False).

  • temperature (float) – Sample weight hyperparameter if use_similarity_proba equals True.

  • threshold (float) – Minimum precision threshold.

  • delta (float) – Used to compute beta.

  • tau (float) – Margin between lower confidence bound and minimum precision or upper bound.

  • batch_size (int) – Batch size used for sampling.

  • coverage_samples (int) – Number of samples used to estimate coverage during anchor search.

  • beam_size (int) – Number of options kept after each stage of anchor building.

  • stop_on_first (bool) – If True, the beam search algorithm will return the first anchor that satisfies the probability constraint.

  • max_anchor_size (Optional[int]) – Maximum number of features to include in an anchor.

  • min_samples_start (int) – Number of samples used for anchor search initialisation.

  • n_covered_ex (int) – Number of examples where the anchor applies to store for each anchor sampled during search. Examples where the prediction on the sample agrees with the predicted label and examples where it disagrees are both stored.

  • binary_cache_size (int) – The anchor search pre-allocates binary_cache_size batches for storing the boolean arrays returned during sampling.

  • cache_margin (int) – When only max(cache_margin, batch_size) positions in the binary cache remain empty, a new cache of the same size is pre-allocated to continue buffering samples.

  • kwargs (Any) – Other keyword arguments passed to the anchor beam search and the text sampling and perturbation functions.

  • verbose (bool) – Display updates during the anchor search iterations.

  • verbose_every (int) – Frequency of displayed iterations during anchor search process.

Return type



explanation – Dictionary containing the anchor explaining the instance with additional metadata.
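
The interaction between binary_cache_size, cache_margin and batch_size can be illustrated with a small buffer that grows when it is nearly full. This is a sketch under stated assumptions and not the library's implementation: the class name GrowingBinaryCache is hypothetical, the buffer is measured in positions rather than batches, and the check is assumed to happen before each write.

```python
from typing import List


class GrowingBinaryCache:
    """Hypothetical buffer for the boolean values produced during sampling."""

    def __init__(self, cache_size: int, cache_margin: int, batch_size: int) -> None:
        self.cache_size = cache_size
        # Number of free positions at or below which the buffer is extended.
        self.reserve = max(cache_margin, batch_size)
        self.buffer: List[bool] = [False] * cache_size
        self.used = 0

    def append(self, batch: List[bool]) -> None:
        # Assumes len(batch) <= batch_size and cache_size > batch_size.
        if len(self.buffer) - self.used <= self.reserve:
            # Pre-allocate another block of the same size.
            self.buffer.extend([False] * self.cache_size)
        self.buffer[self.used:self.used + len(batch)] = batch
        self.used += len(batch)


cache = GrowingBinaryCache(cache_size=10, cache_margin=2, batch_size=3)
for _ in range(4):
    cache.append([True, True, True])
```

With these toy sizes the fourth batch finds a single free position, which is below max(cache_margin, batch_size), so a second block is allocated before writing.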


This function queries a spaCy nlp model to find n similar words with the same part of speech for each word in the instance to be explained. For each word the search procedure returns a dictionary containing an np.array of words (‘words’) and an np.array of word similarities (‘similarities’).

Return type


perturb_sentence(present, n, sample_proba=0.5, forbidden=frozenset({}), forbidden_tags=frozenset({'PRP$'}), forbidden_words=frozenset({'be'}), temperature=1.0, pos=frozenset({'ADJ', 'ADP', 'ADV', 'DET', 'NOUN', 'VERB'}), use_similarity_proba=True, **kwargs)[source]

Perturb the text instance to be explained.

  • present (tuple) – Indices in the text of the words in the proposed anchor.

  • n (int) – Number of samples used when sampling from the corpus.

  • sample_proba (float) – Sample probability for a word if use_similarity_proba is False.

  • forbidden (frozenset) – Forbidden lemmas.

  • forbidden_tags (frozenset) – Forbidden POS tags.

  • forbidden_words (frozenset) – Forbidden words.

  • pos (frozenset) – POS that can be changed during perturbation.

  • use_similarity_proba (bool) – Whether to sample according to a similarity score with the corpus embeddings.

  • temperature (float) – Sample weight hyperparameter if use_similarity_proba equals True.

Return type

Tuple[ndarray, ndarray]


  • raw_data – Array of perturbed text instances.

  • data – Matrix with 1s and 0s indicating whether a word in the text has not been perturbed for each sample.
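
A rough, dependency-free sketch of this kind of perturbation is shown below. It is an illustration and not the library's code: perturb_words and the neighbours dictionary are hypothetical, replacements are drawn uniformly, and each word is perturbed independently per sample with probability sample_proba. It returns the two outputs described above, using lists in place of arrays.

```python
import random
from typing import Dict, List, Sequence, Tuple


def perturb_words(
    words: Sequence[str],
    present: Tuple[int, ...],
    n: int,
    neighbours: Dict[str, List[str]],
    sample_proba: float = 0.5,
    seed: int = 0,
) -> Tuple[List[str], List[List[int]]]:
    """Return n perturbed sentences and a 0/1 mask (1 = word left unchanged)."""
    rng = random.Random(seed)
    raw_data, data = [], []
    for _ in range(n):
        sample = list(words)
        mask = [1] * len(words)
        for i, word in enumerate(words):
            # Anchor words and words without known neighbours are never changed.
            if i in present or word not in neighbours:
                continue
            if rng.random() < sample_proba:
                sample[i] = rng.choice(neighbours[word])
                mask[i] = 0
        raw_data.append(" ".join(sample))
        data.append(mask)
    return raw_data, data


words = ["this", "is", "a", "good", "film"]
neighbours = {"film": ["movie", "picture"], "this": ["that"]}
raw_data, data = perturb_words(words, present=(3,), n=20, neighbours=neighbours)
```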

sampler(anchor, num_samples, compute_labels=True)[source]

Generate perturbed samples while maintaining features in positions specified in anchor unchanged.

  • anchor (Tuple[int, tuple]) – A pair whose first element (int) is the position of the anchor in the input batch and whose second element (tuple) is the anchor itself, a list of words to be kept unchanged.

  • num_samples (int) – Number of generated perturbed samples.

  • compute_labels (bool) – If True, an array of comparisons between predictions on perturbed samples and instance to be explained is returned.

Return type

Union[List[Union[ndarray, float, int]], List[ndarray]]


  • If compute_labels=True, a list containing the following is returned:

    • covered_true: perturbed examples where the anchor applies and the model prediction on the perturbation is the same as the instance prediction

    • covered_false: perturbed examples where the anchor applies and the model prediction is NOT the same as the instance prediction

    • labels: num_samples ints indicating whether the prediction on the perturbed sample matches (1) the label of the instance to be explained or not (0)

    • data: Matrix with 1s and 0s indicating whether a word in the text has not been perturbed for each sample

    • 1.0: indicates exact coverage is not computed for this algorithm

    • anchor[0]: position of anchor in the batch request

  • Otherwise, a list containing the data matrix only is returned.
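
The shape of the returned list can be illustrated with a simplified UNK-based sampler. Everything here is a sketch under assumptions: unk_sampler and keyword_predictor are hypothetical names, the anchor is taken to hold word positions, and plain lists stand in for arrays.

```python
import random
from typing import Callable, List, Sequence, Tuple

UNK = "UNK"


def keyword_predictor(texts: Sequence[str]) -> List[int]:
    """Hypothetical classifier: label 1 if the word 'good' is present."""
    return [int("good" in text.split()) for text in texts]


def unk_sampler(
    anchor: Tuple[int, Tuple[int, ...]],
    num_samples: int,
    words: Sequence[str],
    predictor: Callable[[Sequence[str]], List[int]],
    instance_label: int,
    sample_proba: float = 0.5,
    compute_labels: bool = True,
    seed: int = 0,
) -> list:
    position, kept = anchor
    rng = random.Random(seed)
    raw, data = [], []
    for _ in range(num_samples):
        # 1 = word kept, 0 = word replaced by UNK; anchor positions are always kept.
        mask = [
            1 if i in kept or rng.random() >= sample_proba else 0
            for i in range(len(words))
        ]
        raw.append(" ".join(w if m else UNK for w, m in zip(words, mask)))
        data.append(mask)
    if not compute_labels:
        return [data]
    labels = [int(pred == instance_label) for pred in predictor(raw)]
    covered_true = [text for text, ok in zip(raw, labels) if ok]
    covered_false = [text for text, ok in zip(raw, labels) if not ok]
    # 1.0 is a placeholder: exact coverage is not computed here.
    return [covered_true, covered_false, labels, data, 1.0, position]


result = unk_sampler((0, (1,)), 10, ["a", "good", "film"], keyword_predictor, 1)
```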


Working with numpy arrays of strings requires setting the data type to avoid truncating examples. This function estimates the longest sentence expected during the sampling process, which is used to set the number of characters for the samples and examples arrays. This depends on the perturbation method used for sampling.


use_unk (bool) – See explain method.
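
One way to picture the length estimate is sketched below. It is an assumption-laden illustration and not the library's routine: max_sample_length and string_dtype are hypothetical helpers, the placeholder token is assumed to be "UNK", and samples are assumed to be words joined by single spaces. The result is formatted as a fixed-width numpy unicode dtype string such as "<U13".

```python
from typing import Dict, List, Optional, Sequence

UNK = "UNK"  # assumed placeholder token


def max_sample_length(
    words: Sequence[str],
    use_unk: bool = True,
    neighbours: Optional[Dict[str, List[str]]] = None,
) -> int:
    """Upper bound on the character length of any perturbed sentence."""
    neighbours = neighbours or {}
    total = 0
    for word in words:
        candidates = [word]
        if use_unk:
            candidates.append(UNK)
        else:
            candidates.extend(neighbours.get(word, []))
        # The longest candidate at each position bounds that position's width.
        total += max(len(candidate) for candidate in candidates)
    # Add one space between consecutive words.
    return total + max(len(words) - 1, 0)


def string_dtype(length: int) -> str:
    """Fixed-width unicode dtype string wide enough for `length` characters."""
    return f"<U{length}"


dtype_unk = string_dtype(max_sample_length(["a", "good", "film"]))
```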

Return type


set_sampler_perturbation(use_unk, perturb_opts)[source]

Initialises the explainer by setting the perturbation function and parameters necessary to sample according to the perturbation method.

  • use_unk (bool) – See explain method.

  • perturb_opts (dict) –

    A dict with keys:

    ‘top_n’: the maximum number of alternatives to sample from for replacement

    ‘use_similarity_proba’: if True, the probability of selecting a replacement word is proportional to the similarity between the word and the word to be replaced

    ‘sample_proba’: given a feature and n sentences, this parameter is the mean of a Bernoulli distribution used to decide how many sentences will have that feature perturbed

    ‘temperature’: a temperature used to calibrate the softmax distribution over the sampling weights
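
The role of the temperature can be sketched with a plain softmax over similarity scores. This is an illustrative assumption about how the weights are formed, not the library's exact formula, and sampling_weights is a hypothetical helper. Lower temperatures concentrate the probability mass on the most similar replacement, while higher temperatures flatten the distribution.

```python
import math
import random
from typing import List, Sequence


def sampling_weights(similarities: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Softmax of similarity / temperature, computed in a numerically stable way."""
    scaled = [s / temperature for s in similarities]
    largest = max(scaled)
    exps = [math.exp(s - largest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


candidates = ["great", "fine", "bad"]
similarities = [0.9, 0.6, 0.1]

sharp = sampling_weights(similarities, temperature=0.1)
flat = sampling_weights(similarities, temperature=10.0)

# Draw one replacement word according to the weights.
rng = random.Random(0)
replacement = rng.choices(candidates, weights=sharp, k=1)[0]
```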

Return type



Process the sentence to be explained into spaCy token objects, a list of words and punctuation marks, and a list of their positions in the input sentence.


text (str) – The instance to be explained.

Return type


class alibi.explainers.anchor_text.Neighbors(nlp_obj, n_similar=500, w_prob=-15.0)[source]

Bases: object

__init__(nlp_obj, n_similar=500, w_prob=-15.0)[source]

Initialize class identifying neighbouring words from the embedding for a given word.

  • nlp_obj (spacy.language.Language) – spaCy model

  • n_similar (int) – Number of similar words to return.

  • w_prob (float) – Smoothed log probability estimate of token’s type.

Return type


neighbors(word, tag, top_n)[source]

Find similar words for a certain word in the vocabulary.

  • word (str) – Word for which we need to find similar words.

  • tag (str) – Part of speech tag for the words.

  • top_n (int) – Return only top_n neighbors.

Return type



A dict with two fields. The ‘words’ field contains a numpy array of the top_n most similar words, whereas the ‘similarities’ field is a numpy array with the corresponding word similarities.
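
The structure of the returned dictionary can be sketched with a toy embedding table. The table EMBEDDINGS, the helper similar_words and the use of cosine similarity are assumptions for illustration only. The real class queries a spaCy vocabulary and returns numpy arrays, whereas this sketch uses lists.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Hypothetical vocabulary: word -> (part-of-speech tag, embedding vector).
EMBEDDINGS: Dict[str, Tuple[str, Tuple[float, float]]] = {
    "good": ("ADJ", (1.0, 0.1)),
    "great": ("ADJ", (0.9, 0.2)),
    "bad": ("ADJ", (-1.0, 0.1)),
    "film": ("NOUN", (0.1, 1.0)),
}


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similar_words(word: str, tag: str, top_n: int) -> Dict[str, List]:
    """Return the top_n most similar words that share the given tag."""
    query = EMBEDDINGS[word][1]
    scored = sorted(
        (
            (cosine(query, vector), other)
            for other, (other_tag, vector) in EMBEDDINGS.items()
            if other != word and other_tag == tag
        ),
        reverse=True,
    )[:top_n]
    return {
        "words": [other for _, other in scored],
        "similarities": [score for score, _ in scored],
    }


result = similar_words("good", "ADJ", top_n=2)
```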