Text Generation · Transformers · PyTorch · llama · text-generation-inference · Inference Endpoints

set use_cache=true for faster decoding

#27 opened about 1 year ago by zxcvvxcz
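The `use_cache` flag suggested in the thread above is the standard `transformers` generation option that reuses past key/value states during decoding. A minimal sketch of enabling it; the model id here is a placeholder, not necessarily this repo's checkpoint:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint for illustration; substitute this repo's model id.
model_id = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Hello, world", return_tensors="pt")
# use_cache=True keeps each layer's key/value states from earlier steps,
# so every new token attends over cached activations instead of
# recomputing attention for the whole prefix.
output_ids = model.generate(**inputs, max_new_tokens=16, use_cache=True)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

`use_cache=True` is already the default in current `transformers` releases; the flag matters mainly if a config or earlier code path has disabled it.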

Update README.md

#16 opened about 1 year ago by haipeng1

Request: Wizardlm-22b

#12 opened about 1 year ago by rombodawg

inference speed is considerably slow

#11 opened over 1 year ago by sonald

Missing model card & datasets info

#8 opened over 1 year ago by markding

Still non-commercial?

#6 opened over 1 year ago by kalijason

database connection ?

#4 opened over 1 year ago by nobitha

What is the prompt format?

#3 opened over 1 year ago by TheBloke

What is this model based off of?

#2 opened over 1 year ago by rombodawg

Dataset Availability?

#1 opened over 1 year ago by jonfairbanks