Mistral doesn't have a `pad_token_id`?
#66 · by ingo-m · opened
According to the documentation, the `pad_token_id` is optional?
As confirmed by:
from transformers import AutoTokenizer
base_model_name = "mistralai/Mistral-7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
print(tokenizer.pad_token_id)
# None
I don't understand why; surely a padding token must have been used during training?
I encountered the same issue when I tried to fine-tune it, and wondered how it is supposed to be set? Thanks!
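For what it's worth, a common workaround (not an official recommendation from Mistral) is to either reuse the EOS token as the pad token or to add a dedicated [PAD] token and resize the embeddings. A minimal sketch, assuming the standard transformers API:

from transformers import AutoModelForCausalLM, AutoTokenizer

base_model_name = "mistralai/Mistral-7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
model = AutoModelForCausalLM.from_pretrained(base_model_name)

# Option 1: reuse the EOS token as the pad token
tokenizer.pad_token = tokenizer.eos_token
model.config.pad_token_id = tokenizer.pad_token_id

# Option 2: add a dedicated [PAD] token and resize the embedding matrix
# tokenizer.add_special_tokens({"pad_token": "[PAD]"})
# model.resize_token_embeddings(len(tokenizer))
# model.config.pad_token_id = tokenizer.pad_token_id

When fine-tuning, it also helps to set the labels at padded positions to -100 so they are ignored by the loss.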
There's also a question about it on reddit:
https://www.reddit.com/r/LocalLLaMA/comments/184g120/mistral_fine_tuning_eos_and_padding/
I'm wondering whether it even matters what the padding token is, as long as it is masked out by the attention mask?
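As a quick check (a minimal sketch, assuming the EOS-as-pad workaround above), batch-encoding two prompts of different lengths shows that the padded positions get attention_mask = 0, so they should not contribute to attention:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
tokenizer.pad_token = tokenizer.eos_token  # assumed workaround from above

batch = tokenizer(
    ["Hello", "Hello, how are you today?"],
    padding=True,
    return_tensors="pt",
)
print(batch["input_ids"])       # pad token id fills out the shorter sequence
print(batch["attention_mask"])  # 0 at padded positions, 1 elsewhere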