metadata
license: llama2
This is a slightly modified version of the original WizardLM/WizardLM-13B-V1.2 checkpoint that fixes a few bugs:
- In the original checkpoint, the BOS token is set to the EOS token (`</s>`, token ID 2). In this version, the BOS token is reverted to `<s>` (token ID 1).
- The original has a mismatch between the size of the tokenizer vocab and the model embedding vocab. This is because the tokenizer includes an added `[PAD]` token, making the vocab 32,001 tokens. This discrepancy can cause index errors. This version simply removes the added `[PAD]` in favor of using `<unk>` (token ID 0) for padding, so the tokenizer's vocab is reverted to a size of 32,000 to match the model's vocab size.
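
The two fixes above can be expressed as a small sanity check. The following is a minimal sketch, not part of the original checkpoint: `find_token_issues` is a hypothetical helper, and the `[PAD]` ID of 32000 used for the original checkpoint is an assumption (the first ID past the 32,000 base tokens).

```python
def find_token_issues(vocab_size, embedding_rows, bos_id, eos_id, pad_id):
    """Return a list of problems with a tokenizer/model pair (hypothetical helper)."""
    issues = []
    # The tokenizer must not know more tokens than the embedding matrix has rows.
    if vocab_size != embedding_rows:
        issues.append(
            f"tokenizer vocab ({vocab_size}) != embedding rows ({embedding_rows})"
        )
    # BOS and EOS should be distinct tokens.
    if bos_id == eos_id:
        issues.append(f"BOS and EOS share token ID {bos_id}")
    # A padding ID past the last embedding row causes an index error on lookup.
    if pad_id is not None and pad_id >= embedding_rows:
        issues.append(f"pad token ID {pad_id} is outside the embedding matrix")
    return issues


# Original checkpoint as described above; pad_id=32000 is an assumption.
original = find_token_issues(
    vocab_size=32001, embedding_rows=32000, bos_id=2, eos_id=2, pad_id=32000
)

# This version: BOS is <s> (ID 1), padding reuses <unk> (ID 0).
fixed = find_token_issues(
    vocab_size=32000, embedding_rows=32000, bos_id=1, eos_id=2, pad_id=0
)

print(original)
print(fixed)
```

With the `transformers` library, the inputs could likely be read from a loaded pair as `len(tokenizer)`, `model.get_input_embeddings().weight.shape[0]`, `tokenizer.bos_token_id`, `tokenizer.eos_token_id`, and `tokenizer.pad_token_id`.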
For all other information about this model, refer to the original WizardLM/WizardLM-13B-V1.2 checkpoint.