Model embedding and vocab size mismatch

#7
by pg20sanger - opened

The model's config.json (https://huggingface.co/InstaDeepAI/nucleotide-transformer-2.5b-multi-species/blob/main/config.json#L27) says the vocab size is 4105, but vocab.txt contains 4107 tokens, a difference of 2. Is this expected?
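
For anyone who wants to reproduce the comparison, here is a minimal sketch. It assumes both files have already been downloaded locally, that config.json stores the size under a `vocab_size` key, and that vocab.txt holds one token per line. The function names are hypothetical helpers, not part of any library.

```python
import json


def config_vocab_size(config_path):
    """Return the declared vocab size from a local config.json.

    Assumption: the size is stored under the "vocab_size" key.
    """
    with open(config_path, encoding="utf-8") as f:
        return json.load(f)["vocab_size"]


def count_vocab_tokens(vocab_path):
    """Count tokens in a local vocab.txt.

    Assumption: one token per line. Blank lines are skipped, so a token
    made only of whitespace would not be counted.
    """
    with open(vocab_path, encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())


def vocab_mismatch(config_path, vocab_path):
    """Return (declared size, counted tokens, counted minus declared)."""
    declared = config_vocab_size(config_path)
    counted = count_vocab_tokens(vocab_path)
    return declared, counted, counted - declared
```

Calling `vocab_mismatch("config.json", "vocab.txt")` on the downloaded files should return the two sizes and their difference; with the numbers quoted above that difference would be 2.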
