IQ3_M quant not working
#1, opened by lightning-missile
Hello
I have downloaded the IQ3_M quant, but it is not working. I am using koboldcpp v1.68. Looking at the command-line output, there seems to be an error:
llm_load_vocab: bad special token: 'tokenizer.ggml.padding_token_id' = 32000d, using default id -1
During inference, koboldcpp just stops with the following message:
Processing Prompt [BLAS] (512 / 8092 tokens)
GGML_ASSERT: ggml-opencl.cpp:1815: to_fp32_cl != nullptr
What should I do?
Thanks.
The IQ3_M quant works fine here with both koboldcpp 1.68 and llama.cpp, so this is likely either a bug in koboldcpp or a misconfiguration (e.g. not enough VRAM). The assertion is raised in ggml-opencl.cpp, so very likely this is a bug in the OpenCL backend, which has since been removed from llama.cpp (and was never really supported, afaik).
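Since the assertion message names its own source file, a log can be checked mechanically to see which backend failed. Below is a minimal sketch in Python, assuming the log line has the same `GGML_ASSERT: file:line: expression` shape as the message quoted in the question. `parse_ggml_assert` is a hypothetical helper written for illustration, not part of koboldcpp or llama.cpp.

```python
import re

# Matches lines shaped like "GGML_ASSERT: <file>:<line>: <expression>",
# which is the shape of the message quoted in this thread (an assumption
# about other builds, whose format may differ).
ASSERT_RE = re.compile(
    r"GGML_ASSERT:\s*(?P<file>[^:\s]+):(?P<line>\d+):\s*(?P<expr>.+)"
)


def parse_ggml_assert(log_line):
    """Return (file, line, expression) for a GGML_ASSERT line, or None."""
    match = ASSERT_RE.search(log_line)
    if match is None:
        return None
    return (
        match.group("file"),
        int(match.group("line")),
        match.group("expr").strip(),
    )


if __name__ == "__main__":
    sample = "GGML_ASSERT: ggml-opencl.cpp:1815: to_fp32_cl != nullptr"
    print(parse_ggml_assert(sample))
```

For the line from this thread, the file field comes back as `ggml-opencl.cpp`, which points at the OpenCL backend rather than at the quant file itself.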
mradermacher changed discussion status to closed