Hi, I've taken on a challenge to run this model on the Nvidia Jetson Nano 4GB developer kit. I used the Ollama container and downloaded the model, but it runs quite slowly (it is quantized). Does anyone have an idea how I can make the inference faster? Has anyone tried it?