with load_in_4bit it just generates <pad> tokens
#16 by NePe
I used the example from the model card with the latest version of transformers, but with load_in_4bit=True the model only generates <pad> tokens.
You should pass torch_dtype=torch.bfloat16 when loading the model for it to work.
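For reference, a minimal sketch of loading in 4-bit with a bfloat16 compute dtype; the model ID and prompt here are placeholders, substitute the ones from the model card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "model-id-from-the-model-card"  # placeholder, not the actual repo name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    load_in_4bit=True,           # quantize weights to 4-bit via bitsandbytes
    torch_dtype=torch.bfloat16,  # compute dtype; without this, generation can emit only <pad> tokens
    device_map="auto",
)

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```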
Thanks, this fixed my issue!
NePe changed discussion status to closed