TheBloke/samantha-33B-GPTQ

TheBloke committed on May 29, 2023

Commit 008ab86 • 1 Parent(s): 66fd6f3

Update README.md

Files changed (1)

README.md +6 -6

@@ -48,27 +48,27 @@ Open the text-generation-webui UI as normal.
 5. Click the **Refresh** icon next to **Model** in the top left.
 6. In the **Model drop-down**: choose the model you just downloaded, `Samantha-33B-GPTQ`.
 7. If you see an error in the bottom right, ignore it - it's temporary.
-8. Fill out the `GPTQ parameters` on the right: `Bits = 4`, `Groupsize = 128`, `model_type = Llama`
 9. Click **Save settings for this model** in the top right.
 10. Click **Reload the Model** in the top right.
 11. Once it says it's loaded, click the **Text Generation tab** and enter a prompt!
 ## Provided files
-**Samantha-33B-GPTQ-4bit-128g.no-act-order.safetensors**
 This will work with all versions of GPTQ-for-LLaMa. It has maximum compatibility.
-It was created with groupsize 128 to ensure higher quality inference, without `--act-order` parameter to maximise compatibility.
-* `Samantha-33B-GPTQ-4bit-128g.no-act-order.safetensors`
   * Works with all versions of GPTQ-for-LLaMa code, both Triton and CUDA branches
   * Works with AutoGPTQ
   * Works with text-generation-webui one-click-installers
-  * Parameters: Groupsize = 128. No act-order.
   * Command used to create the GPTQ:
     ```
-     python llama.py /workspace/process/samantha-33B/HF  wikitext2 --wbits 4 --true-sequential --groupsize 128 --save_safetensors /workspace/process/Samantha-33B-GPTQ-4bit-128g.no-act-order.safetensors
      ```
 ## Want to support my work?

 5. Click the **Refresh** icon next to **Model** in the top left.
 6. In the **Model drop-down**: choose the model you just downloaded, `Samantha-33B-GPTQ`.
 7. If you see an error in the bottom right, ignore it - it's temporary.
+8. Fill out the `GPTQ parameters` on the right: `Bits = 4`, `Groupsize = None`, `model_type = Llama`
 9. Click **Save settings for this model** in the top right.
 10. Click **Reload the Model** in the top right.
 11. Once it says it's loaded, click the **Text Generation tab** and enter a prompt!
 ## Provided files
+**Samantha-33B-GPTQ-4bit.act-order.safetensors**
 This will work with all versions of GPTQ-for-LLaMa. It has maximum compatibility.
+It was created with no groupsize to minimise VRAM requirements, and with act-order to ensure highest possible inference quality.
+* `Samantha-33B-GPTQ-4bit.act-order.safetensors`
   * Works with all versions of GPTQ-for-LLaMa code, both Triton and CUDA branches
   * Works with AutoGPTQ
   * Works with text-generation-webui one-click-installers
+  * Parameters: Groupsize = None. Act-order.
   * Command used to create the GPTQ:
     ```
+     python llama.py /workspace/process/samantha-33B/HF  wikitext2 --wbits 4 --true-sequential --act-order --save_safetensors /workspace/process/Samantha-33B-GPTQ-4bit.act-order.safetensors
      ```
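As a rough illustration of how the parameters changed in this commit map onto the file name and the `llama.py` invocation, here is a minimal Python sketch. The helper `gptq_command` and its argument names are hypothetical and not part of GPTQ-for-LLaMa; it only reassembles the flags and the naming pattern visible in the two commands in this diff, and it assumes that "Groupsize = None" simply means the `--groupsize` flag is omitted.

```python
import shlex


def gptq_command(model_dir, out_dir, name, wbits=4, groupsize=None, act_order=False):
    """Build the output file name and argv for a llama.py quantisation run.

    Hypothetical helper: it mirrors the flag and naming pattern of the two
    commands shown in this diff and makes no claims about other options.
    """
    # File name pattern seen in the diff: <name>-<bits>bit[-<groupsize>g].<act-order tag>.safetensors
    suffix = f"-{wbits}bit"
    if groupsize is not None:
        suffix += f"-{groupsize}g"
    suffix += ".act-order" if act_order else ".no-act-order"
    filename = f"{name}{suffix}.safetensors"

    argv = ["python", "llama.py", model_dir, "wikitext2",
            "--wbits", str(wbits), "--true-sequential"]
    if groupsize is not None:
        # Assumption: no groupsize means the flag is left out entirely.
        argv += ["--groupsize", str(groupsize)]
    if act_order:
        argv.append("--act-order")
    argv += ["--save_safetensors", f"{out_dir}/{filename}"]
    return filename, argv


if __name__ == "__main__":
    # Parameters after this commit: Groupsize = None, act-order enabled.
    _, new_argv = gptq_command(
        "/workspace/process/samantha-33B/HF",
        "/workspace/process",
        "Samantha-33B-GPTQ",
        act_order=True,
    )
    print(shlex.join(new_argv))
```

Calling the helper with `groupsize=128` and `act_order=False` reproduces the arguments of the command on the removed side of the diff, which makes the effect of the commit easy to compare.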
 ## Want to support my work?