More quants?
#5 by MoonRide - opened
Could you please provide more quants, such as Q6_K and Q5_K_M? Those offer better quality than Q4 (sample perplexity results in https://github.com/ggerganov/llama.cpp/blob/master/examples/perplexity/README.md).
For anyone interested: quants created from F16 via `quantize Phi-3-mini-4k-instruct-fp16.gguf Phi-3-mini-4k-instruct-Q6_K.gguf Q6_K` work fine (tested using llama.cpp b2714).
What quantize command are you using there?
Figured it out, it was this one: https://github.com/ggerganov/llama.cpp/blob/master/examples/quantize/quantize.cpp
@simonw It's just the standard `quantize` executable included in llama.cpp, which I mentioned earlier (you can also grab the binary releases of llama.cpp, published at https://github.com/ggerganov/llama.cpp/releases).
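To spell out the workflow above: the `quantize` binary takes the source GGUF, the output path, and the target quant type as its last argument. A sketch assuming an F16 GGUF is already in the current directory (file names are examples taken from this thread):

```shell
# Sketch: produce Q6_K and Q5_K_M quants from an existing F16 GGUF.
# The `quantize` binary ships with llama.cpp (b2714 in this thread);
# build it from source or use a published binary release.
./quantize Phi-3-mini-4k-instruct-fp16.gguf Phi-3-mini-4k-instruct-Q6_K.gguf Q6_K
./quantize Phi-3-mini-4k-instruct-fp16.gguf Phi-3-mini-4k-instruct-Q5_K_M.gguf Q5_K_M
```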
gugarosa changed discussion status to closed