Text Generation
Transformers
PyTorch
llama
text-generation-inference
Inference Endpoints

inference speed is considerably slow

#11
by sonald - opened

Compared to other 13B models, this model is quite slow. Any ideas why?
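
Before comparing models, it helps to quantify "slow" as generated tokens per second under identical settings (same prompt, same `max_new_tokens`, same dtype and device). A minimal timing helper sketch — the `generate_fn` callable is a stand-in for whatever generation call you benchmark (e.g. a Transformers `model.generate(...)` closure), not part of this thread:

```python
import time

def tokens_per_second(generate_fn, new_tokens):
    """Time one call to generate_fn and return throughput in tokens/sec.

    generate_fn: zero-argument callable wrapping the model's generation call
                 (hypothetical stand-in; substitute your actual call).
    new_tokens:  number of tokens the call produces.
    """
    start = time.perf_counter()
    generate_fn()
    elapsed = time.perf_counter() - start
    return new_tokens / elapsed
```

Running this against each 13B model with the same prompt and generation length gives directly comparable numbers, which makes it easier to see whether the gap comes from the model itself or from the serving setup (dtype, device placement, batch size).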
