Run on M2 MacBook?
First, thanks for all the hard work from everyone involved.
I'm a bit of a noob with all of this stuff. I do have llama.cpp already, but which version do I use for an M2 MacBook Pro? From what I found, I think I have to use the GGML variant from this repo, but what about q4_0, q5_0, etc.? And are there any params I have to mess with in llama.cpp once I add it to the /model folder? Thanks!
READ the model card ...
And how do I install this? I don't have the slightest idea, help.
If you read it and can't understand it ... you have to learn from the beginning...
The instructions in the model card are very straightforward.
And how do I install this? I don't have the slightest idea, help.
I'll give you a simple way to get started. No prior knowledge needed, and it uses a simple open-source interface (I don't own this program or have any affiliation with it, but it's awesome). Download Faraday from https://faraday.dev/. Once it's installed, you can download a character. For example, I downloaded a character that helps me the way ChatGPT might. But before you can use it, you have to download a model. There are models in the app that you can download with one click, but WizardLM is not one of them. So basically, you go to the download page of this Hugging Face repo, press the download button, and wait for the model to download. In Faraday, you can open the model download folder (where it would normally save your models) and paste the model in.
A brief description of the different models: if the model file ends with .bin or .gguf, it probably works (WizardLM is .bin). Generally, the bigger the model, the better it is, but it's usually recommended to start with a "Q4" model, and "K_M" if there are K options ("M" stands for medium, so it's medium-heavy). Paste it in the folder, make your character, and select your model. Now you're done, and it works if your computer can handle it. Hope this helps!
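If it helps, here is that rule of thumb as a rough Python sketch. It only looks at filenames: it keeps files ending in .bin or .gguf and prefers a Q4 file, ideally a K_M one. The function names and the example filenames are made up for illustration, and the regex assumes the quantization tag appears in the filename the way it usually does in these repos (q4_0, q5_1, q4_K_M, and so on).

```python
import re
from pathlib import Path
from typing import Iterable, Optional

# Extensions the post above says "probably work".
SUPPORTED_SUFFIXES = {".bin", ".gguf"}


def quant_tag(filename: str) -> Optional[str]:
    """Return the lowercased quantization tag in a filename, e.g. 'q4_k_m' or 'q4_0'."""
    match = re.search(r"q\d_(k_[sml]|k|\d)", filename.lower())
    return match.group(0) if match else None


def pick_starter_model(filenames: Iterable[str]) -> Optional[str]:
    """Pick a starter file: Q4 first, and K_M among the Q4 files if present."""
    candidates = [
        name for name in filenames
        if Path(name).suffix.lower() in SUPPORTED_SUFFIXES and quant_tag(name)
    ]
    if not candidates:
        return None

    def score(name: str):
        tag = quant_tag(name)
        # False sorts before True, so Q4 and K_M files come first.
        return (not tag.startswith("q4"), not tag.endswith("k_m"), name)

    return min(candidates, key=score)


if __name__ == "__main__":
    # Hypothetical filenames, not the actual file list of this repo.
    files = ["wizardlm.q5_0.bin", "wizardlm.q4_0.bin", "wizardlm.q4_K_M.bin", "README.md"]
    print(pick_starter_model(files))
```

This only encodes the "start with Q4, K_M if available" suggestion. Whether a given file actually runs still depends on how much memory your machine has.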