alibi_detect.models.tensorflow.embedding module
- class alibi_detect.models.tensorflow.embedding.TransformerEmbedding(model_name_or_path, embedding_type, layers=None)[source]
Bases: Model
- __init__(model_name_or_path, embedding_type, layers=None)[source]
Extract text embeddings from transformer models.
- Parameters:
- model_name_or_path (str) – Name of or path to the model.
- embedding_type (str) – Type of embedding to extract. Needs to be one of pooler_output, last_hidden_state, hidden_state or hidden_state_cls.
From the HuggingFace documentation:
- pooler_output
Last layer hidden-state of the first token of the sequence (classification token) further processed by a Linear layer and a Tanh activation function. The Linear layer weights are trained from the next sentence prediction (classification) objective during pre-training. This output is usually not a good summary of the semantic content of the input, you’re often better with averaging or pooling the sequence of hidden-states for the whole input sequence.
- last_hidden_state
Sequence of hidden-states at the output of the last layer of the model.
- hidden_state
Hidden states of the model at the output of each layer.
- hidden_state_cls
See hidden_state but use the CLS token output.
- layers (Optional[List[int]]) – If “hidden_state” or “hidden_state_cls” is used as embedding type, layers has to be a list of ints referring to the hidden layers used to extract the embedding.
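As a usage illustration, a minimal construction sketch; 'bert-base-cased' is an assumed example model name and the layer selection is an arbitrary choice, not a recommendation:

```python
from alibi_detect.models.tensorflow import TransformerEmbedding

# Assumed example model; any HuggingFace model name or local path
# that transformers can load should work here.
model_name = 'bert-base-cased'

# Embedding from the hidden states of the last 5 layers
# (`layers` is required since embedding_type is 'hidden_state').
emb_hidden = TransformerEmbedding(
    model_name, embedding_type='hidden_state', layers=[-5, -4, -3, -2, -1]
)

# Pooler output of the CLS token; `layers` is not needed here.
emb_pooler = TransformerEmbedding(model_name, embedding_type='pooler_output')
```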
- call(tokens)[source]
Extract embeddings from hidden attention state layers.
- Parameters:
- tokens (Dict[str, Tensor]) – Tokenized input.
- Return type: Tensor
- Returns: Tensor with embeddings.
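A hedged end-to-end sketch of calling the embedding on tokenized text; the example sentence, tokenizer settings (padding, max_length) and model name are illustrative assumptions:

```python
from transformers import AutoTokenizer
from alibi_detect.models.tensorflow import TransformerEmbedding

model_name = 'bert-base-cased'  # assumed example model
tokenizer = AutoTokenizer.from_pretrained(model_name)
embedding = TransformerEmbedding(
    model_name, embedding_type='hidden_state', layers=[-5, -4, -3, -2, -1]
)

# Tokenize a toy batch; padding and max_length are arbitrary choices.
tokens = tokenizer(
    ['The quick brown fox jumps over the lazy dog.'],
    padding='max_length', max_length=100, return_tensors='tf'
)
emb = embedding(tokens)  # Tensor of shape (batch_size, embedding_dim)
```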