hidden_states dimensionality
Hey, I'm playing around with your model and trying to figure out whether I can use the hidden_states for semantic search.
Can you explain why, for an empty input, the dimensionality of the hidden_states is torch.Size([25, 1, 2, 1024])?
As far as I can see, the encoder (RobertaEncoder) has 24 RobertaLayer modules, so where is the 25 coming from?
Shouldn't the dimensionality be num_hidden_layers * 1 * tokens * hidden_size?
Hey,
Thank you for using this model! Could you please provide a code snippet so I can see what you are trying to do?
Sure. I changed the config to "output_hidden_states": true:
import numpy as np
from transformers import (
    TokenClassificationPipeline,
    AutoModelForTokenClassification,
    AutoTokenizer,
)
from transformers.pipelines import AggregationStrategy

# Define keyphrase extraction pipeline
class KeyphraseExtractionPipeline(TokenClassificationPipeline):
    def __init__(self, model, *args, **kwargs):
        super().__init__(
            model=AutoModelForTokenClassification.from_pretrained(model),
            tokenizer=AutoTokenizer.from_pretrained(model),
            *args,
            **kwargs
        )

    def _forward(self, model_inputs):
        # Forward pass
        special_tokens_mask = model_inputs.pop("special_tokens_mask")
        offset_mapping = model_inputs.pop("offset_mapping", None)
        sentence = model_inputs.pop("sentence")
        outputs = self.model(**model_inputs)
        logits = outputs[0]
        # I am talking about these outputs; they now contain the hidden states
        embedding = get_embeddings(outputs[1])  # get_embeddings is my own helper, defined elsewhere
        return {
            "logits": logits,
            "special_tokens_mask": special_tokens_mask,
            "offset_mapping": offset_mapping,
            "sentence": sentence,
            "hidden_state": outputs[1],
            "embedding": embedding,
            **model_inputs,
        }

    def postprocess(self, model_outputs):
        results = super().postprocess(
            model_outputs=model_outputs,
            aggregation_strategy=AggregationStrategy.SIMPLE,
        )
        return {
            **model_outputs,
            "keywords": np.unique([result.get("word").strip() for result in results]).tolist(),
        }

model_path = "keyphrase-extraction-kbir-inspec"
extractor = KeyphraseExtractionPipeline(model=model_path)
Hey @Janni ,
There is a reason why there are 25 entries: the output of the embedding layer is also included. The hidden states consist of the output of each of the 24 encoder layers plus the output of the embeddings, which gives 24 + 1 = 25.
You can find more information in this GitHub thread: https://github.com/huggingface/transformers/issues/1332.
You can also find this in the Hugging Face RoBERTa documentation.
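If it helps, here is a minimal sketch that reproduces the 25 entries and shows one common way to pool the last layer for semantic search. It assumes the same local model_path as in your snippet, so adjust it to wherever your copy of the model lives:

import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

model_path = "keyphrase-extraction-kbir-inspec"  # same local path as in your snippet
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForTokenClassification.from_pretrained(model_path, output_hidden_states=True)

inputs = tokenizer("", return_tensors="pt")  # an empty string still yields <s> and </s>, i.e. 2 tokens
with torch.no_grad():
    outputs = model(**inputs)

hidden_states = outputs.hidden_states
print(len(hidden_states))        # 25 = num_hidden_layers (24) + 1 embedding output
print(hidden_states[0].shape)    # embedding output: torch.Size([1, 2, 1024])
print(hidden_states[-1].shape)   # last RobertaLayer output: torch.Size([1, 2, 1024])

# Stacking the tuple reproduces the single shape you saw:
# torch.stack(hidden_states).shape == torch.Size([25, 1, 2, 1024])

# For semantic search, a common choice is to mean-pool the last layer over the tokens:
sentence_embedding = hidden_states[-1].mean(dim=1)  # torch.Size([1, 1024])

Each entry in the tuple has shape (batch_size, sequence_length, hidden_size); the 25 only appears once you stack them into a single tensor.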
Hope it helps!
Kind regards,
Thomas De Decker