Really nice little model
#2 · opened by cnmoro
I found the performance to be really good for a model of this size!
Would be awesome to see the same process applied to the 0.5B version of Qwen2.
Congrats to the Arcee team!
I have used this one for general chat and tool calling, and it has been great in real-world examples.
I run this on an AMD Ryzen 9 3950X 16-core CPU, as well as on GPU (an AMD 5700 XT 8GB via Vulkan and an Nvidia 3090 24GB via CUDA).
I use the llama.cpp GGUF Q4_K_M quantization.
Thanks for the feedback! It's very helpful to hear how people are using the models as we look towards newer, improved versions.