
NEO Class Ultra Quants for: TinyLlama-1.1B-Chat-v1.0-Ultra-NEO-V1-Imatrix-GGUF

The NEO Class tech was created after countless investigations and over 120 lab experiments, backed by real-world testing and qualitative results.

NEO Class results:

Better overall function, instruction following, and output quality, with stronger connections to ideas, concepts, and the world in general.

In addition, quants now operate above their "grade," so to speak:

E.g., Q4/IQ4 quants operate at Q5_K_M/Q6 levels.

Likewise, Q3/IQ3 quants operate at Q4_K_M/Q5 levels.

Perplexity drop of 591 points for the NEO Class Imatrix IQ4_XS quant vs. the regular IQ4_XS quant (lower is better).
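For context on the perplexity comparison above: perplexity is the exponential of the mean per-token negative log-likelihood, so lower values mean the model assigns higher probability to the reference text. A minimal sketch with made-up numbers (illustrative only, not the actual measurements from this card):

```python
import math

def perplexity(nlls):
    """Perplexity is exp of the mean per-token negative log-likelihood."""
    return math.exp(sum(nlls) / len(nlls))

# Hypothetical per-token NLLs for a regular quant vs. an imatrix quant
# of the same model over the same text (values are invented):
regular_quant = [2.9, 3.1, 3.0, 3.2]
imatrix_quant = [2.7, 2.9, 2.8, 3.0]

print(perplexity(regular_quant))
print(perplexity(imatrix_quant))  # lower perplexity = better fit to the text
```

Real measurements would come from running a perplexity evaluation (e.g. llama.cpp's perplexity tool) on a held-out corpus with each quant.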

For experimental "X" quants of this model please go here:

[ https://huggingface.co/DavidAU/TinyLlama-1.1B-Chat-v1.0-Ultra-NEO-V1-X-Imatrix-GGUF ]

Model Notes:

Maximum context is 2k. Please see the original model maker's page for details and usage information for this model.
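If your GGUF runtime does not apply the model's chat template automatically, the prompt must be formatted by hand. A minimal sketch, assuming the Zephyr-style template described on the upstream TinyLlama-1.1B-Chat-v1.0 model card (verify the exact tokens there before relying on this):

```python
def build_prompt(system: str, user: str) -> str:
    """Build a Zephyr-style chat prompt for TinyLlama-1.1B-Chat-v1.0.

    Assumption: the template tokens below match the upstream model card;
    check there if your runtime handles chat templating for you.
    """
    return (
        f"<|system|>\n{system}</s>\n"
        f"<|user|>\n{user}</s>\n"
        f"<|assistant|>\n"
    )

prompt = build_prompt(
    "You are a helpful assistant.",
    "Explain what an imatrix quant is in one sentence.",
)
print(prompt)
```

Keep the full formatted prompt plus the expected reply within the 2k-token context window noted above.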

Special thanks to the model creators at TinyLlama for making such a fantastic model:

[ https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0 ]
