Set `use_cache=true` for faster decoding

#27
by zxcvvxcz
Ready to merge
This branch is ready to be merged automatically.
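
Context for the change: with `use_cache` enabled, the model reuses its key/value cache during autoregressive generation instead of recomputing attention over the entire prefix at every decoding step, which makes `generate()` noticeably faster. The PR title suggests the flag is being set to `true` in the model's `config.json` so it defaults on. A minimal sketch of the effect, assuming a hypothetical Llama checkpoint id (substitute this repo's actual id):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical checkpoint id for illustration; use this repo's id in practice.
model_id = "huggyllama/llama-7b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Passing use_cache=True overrides the value in config.json,
# enabling the key/value cache during generation.
model = AutoModelForCausalLM.from_pretrained(model_id, use_cache=True)

inputs = tokenizer("The capital of France is", return_tensors="pt")
# With the cache on, each new token attends over cached keys/values
# rather than re-encoding the whole sequence at every step.
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

The same flag can also be passed per call as `model.generate(..., use_cache=True)`; setting it in `config.json`, as this PR does, simply makes that the default so callers get fast decoding without extra arguments.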
