Details about GPTQ, GGUF, HF, GGML, AWQ, fp16, Uncensored of each

#1
by Rohith1016 - opened

Can you tell me the difference between all of these model types:
GPTQ, GGUF, HF, GGML, AWQ, fp16
Uncensored GPTQ, Uncensored GGUF, Uncensored HF, Uncensored fp16, Uncensored AWQ, Uncensored GGML
Which one is better and faster, the uncensored one or the regular one? And which model is fast and good on a Google Colab T4 GPU?

Uncensored just means that the model will try to answer questions and not refuse NSFW ones. It's not a format.

GGUF is a format for llama.cpp, best for CPU or Mac.
GPTQ with ExLlama will be the fastest format.
AWQ should be the highest quality.
fp16 is the original, unquantized model.
GGML is an older format similar to GGUF (GGUF replaced it).
HF is the format designed for Hugging Face Transformers, which is usually fp16.
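To make the quantized formats less abstract: GPTQ, AWQ, and most GGUF variants all store weights as small integers (often 4-bit) plus a per-group scale, which is why they fit on a T4's 16 GB where fp16 won't. This is only an illustrative sketch of that idea with simple round-to-nearest; the real GPTQ and AWQ algorithms use calibration data and much smarter rounding, and the function names here are made up for the example.

```python
import numpy as np

def quantize_4bit(weights, group_size=128):
    """Naive 4-bit quantization: per-group scales, round-to-nearest.

    Illustrative only -- real GPTQ/AWQ pick rounding using calibration
    data, but the storage idea (int4 values + fp16 scales) is the same.
    """
    w = weights.reshape(-1, group_size)
    scales = np.abs(w).max(axis=1, keepdims=True) / 7  # map into [-7, 7]
    q = np.clip(np.round(w / scales), -8, 7).astype(np.int8)
    return q, scales

def dequantize_4bit(q, scales):
    """Reconstruct approximate fp32 weights from int4 values and scales."""
    return (q.astype(np.float32) * scales).reshape(-1)

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32)
q, scales = quantize_4bit(w)
w_hat = dequantize_4bit(q, scales)

# 4 bits per weight is ~4x smaller than fp16 (ignoring scale overhead),
# at the cost of a small reconstruction error per weight.
print("max abs error:", np.abs(w - w_hat).max())
```

The quality differences between formats mostly come down to how cleverly they choose the rounding and scales, not the storage layout itself.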

Thank you for your response and explanation @johnwick123forevr

Rohith1016 changed discussion status to closed
