Original repo: https://huggingface.co/openlm-research/open_llama_3b
This repo only exists so that loading the tokenizer with `use_fast = True`
works, which can speed up batched tokenization dramatically.
This repo DOES NOT host OpenLLaMA's model weights. For those, use OpenLLaMA's original repo.
For example:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Fast tokenizer from this repo; model weights from the original OpenLLaMA repo
tokenizer = AutoTokenizer.from_pretrained("danielhanchen/open_llama_3b", use_fast = True)
model = AutoModelForCausalLM.from_pretrained("openlm-research/open_llama_3b")
```
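For reference, here is a minimal sketch of the batched tokenization that the fast tokenizer speeds up. The sentences are placeholders, and reusing the EOS token as the pad token is an assumption (LLaMA tokenizers typically ship without a pad token):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("danielhanchen/open_llama_3b", use_fast = True)
# Assumption: reuse EOS as the pad token, since LLaMA tokenizers have none by default
tokenizer.pad_token = tokenizer.eos_token

batch = ["OpenLLaMA is an open reproduction of LLaMA.",
         "Fast tokenizers process batches in parallel."]
encoded = tokenizer(batch, padding = True, return_tensors = "pt")
print(encoded["input_ids"].shape)  # (2, longest sequence length in the batch)
```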