How are you quantizing?

#1
by moona99 - opened

I'm trying to convert and quantize a Llama-3 model in llama.cpp. convert.py is no longer working with many models, and convert-hf-to-gguf.py isn't working because not every safetensors repo has a tokenizer.model file. So I'm curious how this is getting done.

The current convert-hf-to-gguf.py is the only correct method for Llama-3 and generally does work (it does not require a tokenizer.model file). Unfortunately, this specific model is not supported because the tokenizer does not match, but support can be added via convert-hf-to-gguf-update.py, which is (kind of) what I did.

Specifically, this is the script I used for this model: http://data.plan9.de/convert-hfhfix-to-gguf.py
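For context on why an update script is needed at all: convert-hf-to-gguf-update.py recognizes a tokenizer by hashing the token IDs it produces on a fixed test string and looking that hash up in a table of known pre-tokenizer configurations. A minimal sketch of that fingerprinting idea is below; note that `toy_encode` and the `KNOWN` table are illustrative stand-ins, not llama.cpp's actual tokenizer or hash values:

```python
import hashlib

def toy_encode(text):
    # Stand-in for a real tokenizer's encode(); an actual run would call
    # the model's Hugging Face tokenizer here.
    return [ord(c) for c in text]

def tokenizer_fingerprint(encode, test_text):
    # Hash the token IDs produced on a fixed test string. Two tokenizers
    # that split the text identically yield the same fingerprint.
    ids = encode(test_text)
    return hashlib.sha256(str(ids).encode()).hexdigest()

# Hypothetical table mapping fingerprints to known pre-tokenizer configs.
KNOWN = {
    tokenizer_fingerprint(toy_encode, "test string"): "toy-pretokenizer",
}

fp = tokenizer_fingerprint(toy_encode, "test string")
print(KNOWN.get(fp, "unknown - add support via the update script"))
```

An unrecognized fingerprint is exactly the "tokenizer does not match" situation above: the conversion script refuses to guess, and the update script adds the new hash to the table.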

mradermacher changed discussion status to closed

Ah, thank you for taking the time to enlighten me! I missed that there was an update script. I will give that a try. Thank you!

You are welcome, good luck!
