Vocab size mismatch
config.json says vocab_size is 32000, while the Instruct model uses 32768, and this model doesn't load properly. I don't think the vocab size should differ; I suspect it should be set to 32768, just like in the Instruct model (which does load).
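A quick way to see the discrepancy (a minimal sketch, assuming the Hub repo ids `mistralai/Mixtral-8x22B-v0.1` and `mistralai/Mixtral-8x22B-Instruct-v0.1`) is to compare the two configs:

```python
from transformers import AutoConfig

base = AutoConfig.from_pretrained("mistralai/Mixtral-8x22B-v0.1")
instruct = AutoConfig.from_pretrained("mistralai/Mixtral-8x22B-Instruct-v0.1")

print(base.vocab_size)      # 32000 at the time of writing
print(instruct.vocab_size)  # 32768
```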
The problem is that the vocab sizes differ: the model's state dict really does have a 32000-entry embedding, matching the config.
Hi Mistral team, when loading Mixtral-8x22B with AutoTokenizer, the tokenizer's vocab_size is 32768, but config.json shows vocab_size as 32000 and the embedding layer has a shape of 32000x6144. Why is there a mismatch, and how should it be handled?
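For reference, this is roughly how I'm observing the mismatch (a sketch; the repo id is assumed to be `mistralai/Mixtral-8x22B-v0.1`):

```python
from transformers import AutoConfig, AutoTokenizer

model_id = "mistralai/Mixtral-8x22B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
config = AutoConfig.from_pretrained(model_id)

print(len(tokenizer))      # 32768 entries in the tokenizer
print(config.vocab_size)   # 32000, the size of the 32000 x 6144 embedding layer

if len(tokenizer) != config.vocab_size:
    print("mismatch: token ids >= vocab_size would index past the embedding table")
```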
Hi Mistral team! The current Mixtral-8x22B-v0.1 has a tokenizer mismatch issue. I can load the tokenizer from Mixtral-8x7B-v0.1 to work around it, but when will this repo be updated like the current Mixtral-8x22B-Instruct-v0.1, which now has the correct tokenizer? Thanks!
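For anyone hitting the same thing, here is the workaround I mean (a sketch; it assumes the Mixtral-8x7B-v0.1 tokenizer matches the one the 8x22B weights expect, and `device_map="auto"` needs accelerate installed):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Borrow the tokenizer from the 8x7B repo until this repo's tokenizer is fixed.
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mixtral-8x7B-v0.1")

# Load the 8x22B weights as usual.
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x22B-v0.1",
    device_map="auto",    # requires accelerate; shards across available devices
    torch_dtype="auto",
)
```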
I've merged a revision that should fix this issue!