TheBloke committed
Commit 008ab86
1 Parent(s): 66fd6f3

Update README.md

Files changed (1)
  1. README.md +6 -6
README.md CHANGED
@@ -48,27 +48,27 @@ Open the text-generation-webui UI as normal.
  5. Click the **Refresh** icon next to **Model** in the top left.
  6. In the **Model drop-down**: choose the model you just downloaded, `Samantha-33B-GPTQ`.
  7. If you see an error in the bottom right, ignore it - it's temporary.
- 8. Fill out the `GPTQ parameters` on the right: `Bits = 4`, `Groupsize = 128`, `model_type = Llama`
+ 8. Fill out the `GPTQ parameters` on the right: `Bits = 4`, `Groupsize = None`, `model_type = Llama`
  9. Click **Save settings for this model** in the top right.
  10. Click **Reload the Model** in the top right.
  11. Once it says it's loaded, click the **Text Generation tab** and enter a prompt!

  ## Provided files

- **Samantha-33B-GPTQ-4bit-128g.no-act-order.safetensors**
+ **Samantha-33B-GPTQ-4bit.act-order.safetensors**

  This will work with all versions of GPTQ-for-LLaMa. It has maximum compatibility.

- It was created with groupsize 128 to ensure higher quality inference, without the `--act-order` parameter to maximise compatibility.
+ It was created with no groupsize to minimise VRAM requirements, and with act-order to ensure the highest possible inference quality.

- * `Samantha-33B-GPTQ-4bit-128g.no-act-order.safetensors`
+ * `Samantha-33B-GPTQ-4bit.act-order.safetensors`
  * Works with all versions of GPTQ-for-LLaMa code, both Triton and CUDA branches
  * Works with AutoGPTQ
  * Works with text-generation-webui one-click-installers
- * Parameters: Groupsize = 128. No act-order.
+ * Parameters: Groupsize = None. Act-order.
  * Command used to create the GPTQ:
  ```
- python llama.py /workspace/process/samantha-33B/HF wikitext2 --wbits 4 --true-sequential --groupsize 128 --save_safetensors /workspace/process/Samantha-33B-GPTQ-4bit-128g.no-act-order.safetensors
+ python llama.py /workspace/process/samantha-33B/HF wikitext2 --wbits 4 --true-sequential --act-order --save_safetensors /workspace/process/Samantha-33B-GPTQ-4bit.act-order.safetensors
  ```
  ## Want to support my work?
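The updated "Provided files" list states that the file works with AutoGPTQ. For readers who want to load it outside text-generation-webui, here is a minimal sketch using AutoGPTQ's `from_quantized` API. The repo id, prompt format, and generation settings below are illustrative assumptions, not part of this commit; only the `model_basename` follows directly from the provided filename.

```python
# Minimal sketch: load the provided .safetensors file with AutoGPTQ.
# Paths, prompt format, and generation settings are illustrative assumptions.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_dir = "TheBloke/Samantha-33B-GPTQ"  # assumed repo id (or a local download)

tokenizer = AutoTokenizer.from_pretrained(model_dir, use_fast=True)

# model_basename is the provided filename minus the .safetensors extension.
# This assumes the repo ships a quantize_config.json; if not, pass
# quantize_config=BaseQuantizeConfig(bits=4, group_size=-1) explicitly.
model = AutoGPTQForCausalLM.from_quantized(
    model_dir,
    model_basename="Samantha-33B-GPTQ-4bit.act-order",
    use_safetensors=True,
    device="cuda:0",
)

# Assumed Samantha-style prompt; check the model card for the exact template.
prompt = "You are Samantha, a sentient AI.\n\nUSER: Hello, who are you?\nASSISTANT:"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
output_ids = model.generate(**inputs, do_sample=True, temperature=0.7, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

As the README change itself explains, dropping the groupsize lowers VRAM requirements relative to the old 128g file, and act-order is used to recover inference quality.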