IQ3_M quant not working

#1
by lightning-missile - opened

Hello

I downloaded the IQ3_M quant, but it is not working. I am using koboldcpp V1.68. Looking at the command line, there seems to be an error:

llm_load_vocab: bad special token: 'tokenizer.ggml.padding_token_id' = 32000d, using default id -1

During inference, koboldcpp just stops with the following message:

Processing Prompt [BLAS] (512 / 8092 tokens)
GGML_ASSERT: ggml-opencl.cpp:1815: to_fp32_cl != nullptr

What should I do?

Thanks.

The IQ3_M works fine here with koboldcpp 1.68 and llama.cpp, so this is likely either a bug in koboldcpp or a misconfiguration (e.g. not enough VRAM). Most likely it is a bug in the OpenCL backend, which has since been removed from llama.cpp (and was never really supported, afaik).
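For anyone trying to work out which backend a crash like this comes from, here is a rough sketch that pulls the source file, line number, and failed condition out of a GGML_ASSERT line. The function names and the file-name heuristic are my own assumptions for illustration, not part of koboldcpp or llama.cpp, and the pattern only covers the message format shown in this thread.

```python
import re

# Matches lines such as:
#   GGML_ASSERT: ggml-opencl.cpp:1815: to_fp32_cl != nullptr
# Assumption: the "GGML_ASSERT: <file>:<line>: <condition>" layout seen in this thread.
ASSERT_RE = re.compile(
    r"GGML_ASSERT:\s*(?P<file>[^:\s]+):(?P<line>\d+):\s*(?P<cond>.+)"
)


def parse_ggml_assert(log_line):
    """Return (source_file, line_number, condition), or None if no assertion is found."""
    m = ASSERT_RE.search(log_line)
    if m is None:
        return None
    return m.group("file"), int(m.group("line")), m.group("cond").strip()


def guess_backend(source_file):
    """Hypothetical heuristic: take the part between 'ggml-' and the file extension."""
    m = re.fullmatch(r"ggml-(?P<name>[\w-]+)\.\w+", source_file)
    return m.group("name") if m else None


if __name__ == "__main__":
    line = "GGML_ASSERT: ggml-opencl.cpp:1815: to_fp32_cl != nullptr"
    parsed = parse_ggml_assert(line)
    if parsed is not None:
        source_file, line_number, condition = parsed
        print("file:", source_file)
        print("line:", line_number)
        print("condition:", condition)
        print("backend guess:", guess_backend(source_file))
```

If the guess comes back as the OpenCL backend, as it does for the line above, that fits the explanation given here; trying a different backend would be the obvious next step.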

mradermacher changed discussion status to closed
