GPU acceleration working in oobabooga WebUI
I don't know if there has been an update since you wrote the model card for the airoboros 70b GGML model you posted, but GPU acceleration is working fine in oobabooga with this model, llama-2-70b-guanaco-QLoRA-GGML.
error loading model: llama.cpp: tensor 'layers.0.attention.wk.weight' has wrong shape; expected 8192 x 8192, got 8192 x 1024
I got this error when loading the model with llama-cpp-python on Linux (Python 3.11, llama-cpp-python 0.1.77).
@rombodawg thank you, I've updated my READMEs
@ThamaluM this is probably because you've not passed the -gqa 8 parameter. I don't know how you do that with llama-cpp-python, but there must be a way, as text-generation-webui works OK with these models and it uses llama-cpp-python. Check the llama-cpp-python repo and see if there are instructions there.
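For anyone hitting the same error, here is a rough sketch of why the shapes differ and how the setting might be passed from Python. The head count (64) is taken from the public Llama-2-70B config rather than from this thread, `kv_width` and `load_70b` are hypothetical helper names, and the `n_gqa` keyword on `Llama()` is an assumption about GGML-era llama-cpp-python releases such as 0.1.77, so check the repo for your installed version.

```python
# Llama-2-70B hyperparameters. N_EMBD matches the 8192 in the error message;
# N_HEAD = 64 comes from the public model config (assumption, not from this thread).
N_EMBD = 8192
N_HEAD = 64


def kv_width(n_embd: int, n_head: int, n_gqa: int) -> int:
    """Width of the K/V projection when n_gqa query heads share one KV head."""
    head_dim = n_embd // n_head    # size of one attention head
    n_kv_head = n_head // n_gqa    # number of key/value heads
    return head_dim * n_kv_head


# Without grouped-query attention the loader expects wk to be 8192 x 8192.
print(kv_width(N_EMBD, N_HEAD, n_gqa=1))
# With -gqa 8 the expected width matches the 8192 x 1024 tensor in the file.
print(kv_width(N_EMBD, N_HEAD, n_gqa=8))


def load_70b(model_path: str):
    """Hypothetical helper: load a 70B GGML file with grouped-query attention."""
    # Imported lazily so the arithmetic above runs without llama-cpp-python installed.
    from llama_cpp import Llama

    # ASSUMPTION: this llama-cpp-python version accepts n_gqa as a keyword argument.
    return Llama(model_path=model_path, n_gqa=8)
```

Calling `load_70b` with the path to your own model file would be the Python counterpart of passing -gqa 8 on the llama.cpp command line, provided the installed version supports that keyword.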
Thanks for the answer, it worked.