TheBloke committed on
Commit
5fbe47e
1 Parent(s): a4d7e9c

Update README.md

Files changed (1)
  1. README.md +1 -3
README.md CHANGED
@@ -21,7 +21,7 @@ inference: false
 
 These files are GPTQ 4bit model files for [Sambanova Systems' BLOOMChat 1.0](https://huggingface.co/sambanovasystems/BLOOMChat-176B-v1).
 
-It is the result of quantising to 4bit using [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ).
+It is the result of quantising to 4-bit using [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ).
 
 **This is a BIG model! 2 x 80GB or 3 x 48GB GPUs are required**
 
@@ -210,8 +210,6 @@ It was created with group_size none (-1) to reduce VRAM usage, and with --act-or
 
 This will work with AutoGPTQ. It is untested with GPTQ-for-LLaMa. It will *not* work with ExLlama.
 
-It was created with both group_size 128g and --act-order (desc_act) for increased inference quality.
-
 It was created with both group_size 128g and --act-order (desc_act) for even higher inference accuracy, at the cost of increased VRAM usage. Because we already need 2 x 80GB or 3 x 48GB GPUs, I don't expect the increased VRAM usage to change the GPU requirements.
 
 * `gptq_model-4bit-128g.safetensors`
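For context on the parameters the diff describes (4-bit, group_size 128g, --act-order / desc_act), here is a minimal sketch, assuming AutoGPTQ's standard quantisation API, of how such a file could be produced. The base model id and the one-line calibration example are placeholders for illustration, not taken from this commit.

```python
# Hedged sketch: producing a 4-bit, 128g, act-order GPTQ file with AutoGPTQ.
# The base model id and calibration text below are assumptions for illustration.
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
from transformers import AutoTokenizer

base_model = "sambanovasystems/BLOOMChat-176B-v1"  # source model from the README
tokenizer = AutoTokenizer.from_pretrained(base_model)

quantize_config = BaseQuantizeConfig(
    bits=4,          # "4bit" in the filename
    group_size=128,  # "128g"; group_size=-1 would be the "none" variant
    desc_act=True,   # --act-order, for higher accuracy at some VRAM cost
)

model = AutoGPTQForCausalLM.from_pretrained(base_model, quantize_config)
# Calibration examples: a real quantisation run would use a proper dataset.
examples = [tokenizer("BLOOMChat is a 176B multilingual chat LLM.")]
model.quantize(examples)
model.save_quantized("BLOOMChat-176B-GPTQ", use_safetensors=True)
```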
 
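And a matching sketch of loading the resulting `gptq_model-4bit-128g.safetensors` with AutoGPTQ, which the README says is the supported path (untested with GPTQ-for-LLaMa, will not work with ExLlama). The repo id and the `<human>`/`<bot>` prompt format are assumptions; with a model this size, `device_map="auto"` is what spreads the weights over the required 2 x 80GB or 3 x 48GB GPUs.

```python
# Hedged sketch of inference with the quantised file named in the diff.
# The repo id and prompt format are assumptions for illustration.
from auto_gptq import AutoGPTQForCausalLM
from transformers import AutoTokenizer

repo_id = "TheBloke/BLOOMChat-176B-v1-GPTQ"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(repo_id)

model = AutoGPTQForCausalLM.from_quantized(
    repo_id,
    model_basename="gptq_model-4bit-128g",  # file named in the diff, no extension
    use_safetensors=True,
    device_map="auto",  # shard across 2 x 80GB or 3 x 48GB GPUs
)

prompt = "<human>: What is GPTQ quantisation?\n<bot>:"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
output = model.generate(input_ids=input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0]))
```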