IQ3_M quant not working

by lightning-missile - opened Jun 27

Jun 27

Hello

I have downloaded the IQ3_M quant, but it is not working. I am using koboldcpp V1.68. Looking at the command line, there seems to be an error:

llm_load_vocab: bad special token: 'tokenizer.ggml.padding_token_id' = 32000d, using default id -1

During inference, koboldcpp just stops with the following message:

Processing Prompt [BLAS] (512 / 8092 tokens)
GGML_ASSERT: ggml-opencl.cpp:1815: to_fp32_cl != nullptr

What should I do?

Thanks.

mradermacher

Owner Jun 28

The IQ3_M works fine here with koboldcpp 1.68 and llama.cpp, so this is likely either a bug in koboldcpp or a misconfiguration (e.g. not enough VRAM). Very likely this is a bug in the OpenCL backend, which has since been removed from llama.cpp (and was never really supported, afaik).

mradermacher changed discussion status to closed Jun 28
