Mistral doesn't have a `pad_token_id`?
#66 · by ingo-m · opened
According to the documentation, the `pad_token_id` is optional?
As confirmed by:
from transformers import AutoTokenizer
base_model_name = "mistralai/Mistral-7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
print(tokenizer.pad_token_id)
# None
I don't understand why; surely a padding token must have been used during training?
I encountered the same issue when I tried to fine-tune it, and wondered how it is supposed to be set? Thanks!
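For what it's worth, a common workaround (not an official recommendation from Mistral) is to either reuse the EOS token as the pad token or to add a dedicated [PAD] token and resize the embeddings. A minimal sketch, assuming the standard transformers API:

from transformers import AutoModelForCausalLM, AutoTokenizer

base_model_name = "mistralai/Mistral-7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
model = AutoModelForCausalLM.from_pretrained(base_model_name)

# Option 1: reuse the EOS token as the pad token
tokenizer.pad_token = tokenizer.eos_token
model.config.pad_token_id = tokenizer.pad_token_id

# Option 2: add a dedicated [PAD] token and resize the embedding matrix
# tokenizer.add_special_tokens({"pad_token": "[PAD]"})
# model.resize_token_embeddings(len(tokenizer))
# model.config.pad_token_id = tokenizer.pad_token_id

When fine-tuning, it also helps to set the labels at padded positions to -100 so they are ignored by the loss.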
There's also a question about it on reddit:
https://www.reddit.com/r/LocalLLaMA/comments/184g120/mistral_fine_tuning_eos_and_padding/
I'm wondering whether it even matters what the padding token is, as long as it is masked out by the attention mask?
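As a quick check (a minimal sketch, assuming the EOS-as-pad workaround above), batch-encoding two prompts of different lengths shows that the padded positions get attention_mask = 0, so they should not contribute to attention:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
tokenizer.pad_token = tokenizer.eos_token  # assumed workaround from above

batch = tokenizer(
    ["Hello", "Hello, how are you today?"],
    padding=True,
    return_tensors="pt",
)
print(batch["input_ids"])       # pad token id fills out the shorter sequence
print(batch["attention_mask"])  # 0 at padded positions, 1 elsewhere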