brucethemoose committed da28be2 (parent: 3641f67): Update README.md

README.md CHANGED
@@ -18,7 +18,7 @@ https://github.com/cg123/mergekit/tree/dare-tokenizer
 
 It was quantized with exllamav2 on 200 rows (400K tokens) on a long Vicuna format chat, a single sci fi story and a single fantasy story. This should hopefully yield better chat performance than the default wikitext quantization.
 
-Quantized to 4bpw, enough for **~
+Quantized to 4bpw, enough for **~45K context on a 24GB GPU.**
 
 ***
 
 Merged with the following config, and the tokenizer from Yi Llamafied:
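The "~45K context on a 24GB GPU" figure added in this commit can be sanity-checked with a back-of-envelope estimate. A minimal sketch, assuming Yi-34B-style dimensions (60 layers, 8 GQA key/value heads, head dim 128) and an 8-bit KV cache; all of these numbers are assumptions for illustration, none are stated in the commit itself:

```python
# Rough VRAM estimate for a ~34B-parameter model at 4 bits per weight
# with a long context. Model dimensions below are assumed (Yi-34B-style
# GQA), not taken from the commit.

def kv_cache_bytes(tokens, layers=60, kv_heads=8, head_dim=128, bytes_per_elem=1):
    """Bytes for the K and V caches combined; bytes_per_elem=1 models an 8-bit cache."""
    return 2 * layers * kv_heads * head_dim * bytes_per_elem * tokens

weights_gb = 34e9 * 4 / 8 / 1e9       # ~17 GB of weights at 4 bpw
kv_gb = kv_cache_bytes(45_000) / 1e9  # ~5.5 GB of 8-bit KV cache at 45K tokens

print(round(weights_gb, 1), round(kv_gb, 1))  # 17.0 5.5
```

Under these assumptions, ~17 GB of weights plus ~5.5 GB of cache leaves a few GB of headroom for activations and overhead on a 24 GB card, which is consistent with the claim in the new line.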