Arki05 committed on
Commit
8172181
1 Parent(s): c9972fc

New README.md

Files changed (1)
  1. README.md +25 -4
README.md CHANGED
@@ -1,12 +1,33 @@
  ---
  license: apache-2.0
  ---

- Unofficial GGUF Quantizations of Grok-1. Works with llama.cpp as of [PR- Add grok-1 support #6204](https://github.com/ggerganov/llama.cpp/pull/6204)

- The splits now use [PR: llama_model_loader: support multiple split/shard GGUFs](https://github.com/ggerganov/llama.cpp/pull/6187).
- Therefore, no merging using `gguf-split` is needed any more.

- Q2_K, Q4_K and Q6_K are Uploaded. More will follow. All current Quants are made without any importance Matrix.

  ---
  license: apache-2.0
  ---
+ # Grok-1 GGUF Quantizations

+ > [!WARNING]
+ > As discovered by [@DgDev91](https://huggingface.co/Arki05/Grok-1-GGUF/discussions/8), there's a slight issue with file naming when using these quants with current llama.cpp.
+ >
+ > A fix is already provided by @phymbert in [#6192](https://github.com/ggerganov/llama.cpp/pull/6192).
+ >
+ > For ease of use, I've created a branch ([Quick-Fix Branch](https://github.com/arki05/llama.cpp-grok/tree/quick-fix-grok-split)) that incorporates these fixes.
 
+ This repository contains unofficial GGUF quantizations of Grok-1, compatible with `llama.cpp` as of [PR #6204: Add grok-1 support](https://github.com/ggerganov/llama.cpp/pull/6204).

+ ## Updates

+ - The splits have been updated to use the multi-split loading added in [PR #6187: llama_model_loader: support multiple split/shard GGUFs](https://github.com/ggerganov/llama.cpp/pull/6187). As a result, manual merging with `gguf-split` is no longer required.

+ Download all splits and run llama.cpp with the first split, as you would with a single file. It detects the other splits and loads them as well.
+
+
+ ## Available Quantizations
+
+ The following quantizations are currently available for download:
+
+ | Quant | Split Files |
+ |-------|-------------|
+ | Q2_K | [split-1-of-9](https://huggingface.co/Arki05/Grok-1-GGUF/resolve/main/grok-1-Q2_K-split-00001-of-00009.gguf), [split-2-of-9](https://huggingface.co/Arki05/Grok-1-GGUF/resolve/main/grok-1-Q2_K-split-00002-of-00009.gguf), [split-3-of-9](https://huggingface.co/Arki05/Grok-1-GGUF/resolve/main/grok-1-Q2_K-split-00003-of-00009.gguf), [split-4-of-9](https://huggingface.co/Arki05/Grok-1-GGUF/resolve/main/grok-1-Q2_K-split-00004-of-00009.gguf), [split-5-of-9](https://huggingface.co/Arki05/Grok-1-GGUF/resolve/main/grok-1-Q2_K-split-00005-of-00009.gguf), [split-6-of-9](https://huggingface.co/Arki05/Grok-1-GGUF/resolve/main/grok-1-Q2_K-split-00006-of-00009.gguf), [split-7-of-9](https://huggingface.co/Arki05/Grok-1-GGUF/resolve/main/grok-1-Q2_K-split-00007-of-00009.gguf), [split-8-of-9](https://huggingface.co/Arki05/Grok-1-GGUF/resolve/main/grok-1-Q2_K-split-00008-of-00009.gguf), [split-9-of-9](https://huggingface.co/Arki05/Grok-1-GGUF/resolve/main/grok-1-Q2_K-split-00009-of-00009.gguf) |
+ | Q4_K | [split-1-of-9](https://huggingface.co/Arki05/Grok-1-GGUF/resolve/main/grok-1-Q4_K-split-00001-of-00009.gguf), [split-2-of-9](https://huggingface.co/Arki05/Grok-1-GGUF/resolve/main/grok-1-Q4_K-split-00002-of-00009.gguf), [split-3-of-9](https://huggingface.co/Arki05/Grok-1-GGUF/resolve/main/grok-1-Q4_K-split-00003-of-00009.gguf), [split-4-of-9](https://huggingface.co/Arki05/Grok-1-GGUF/resolve/main/grok-1-Q4_K-split-00004-of-00009.gguf), [split-5-of-9](https://huggingface.co/Arki05/Grok-1-GGUF/resolve/main/grok-1-Q4_K-split-00005-of-00009.gguf), [split-6-of-9](https://huggingface.co/Arki05/Grok-1-GGUF/resolve/main/grok-1-Q4_K-split-00006-of-00009.gguf), [split-7-of-9](https://huggingface.co/Arki05/Grok-1-GGUF/resolve/main/grok-1-Q4_K-split-00007-of-00009.gguf), [split-8-of-9](https://huggingface.co/Arki05/Grok-1-GGUF/resolve/main/grok-1-Q4_K-split-00008-of-00009.gguf), [split-9-of-9](https://huggingface.co/Arki05/Grok-1-GGUF/resolve/main/grok-1-Q4_K-split-00009-of-00009.gguf) |
+ | Q6_K | [split-1-of-9](https://huggingface.co/Arki05/Grok-1-GGUF/resolve/main/grok-1-Q6_K-split-00001-of-00009.gguf), [split-2-of-9](https://huggingface.co/Arki05/Grok-1-GGUF/resolve/main/grok-1-Q6_K-split-00002-of-00009.gguf), [split-3-of-9](https://huggingface.co/Arki05/Grok-1-GGUF/resolve/main/grok-1-Q6_K-split-00003-of-00009.gguf), [split-4-of-9](https://huggingface.co/Arki05/Grok-1-GGUF/resolve/main/grok-1-Q6_K-split-00004-of-00009.gguf), [split-5-of-9](https://huggingface.co/Arki05/Grok-1-GGUF/resolve/main/grok-1-Q6_K-split-00005-of-00009.gguf), [split-6-of-9](https://huggingface.co/Arki05/Grok-1-GGUF/resolve/main/grok-1-Q6_K-split-00006-of-00009.gguf), [split-7-of-9](https://huggingface.co/Arki05/Grok-1-GGUF/resolve/main/grok-1-Q6_K-split-00007-of-00009.gguf), [split-8-of-9](https://huggingface.co/Arki05/Grok-1-GGUF/resolve/main/grok-1-Q6_K-split-00008-of-00009.gguf), [split-9-of-9](https://huggingface.co/Arki05/Grok-1-GGUF/resolve/main/grok-1-Q6_K-split-00009-of-00009.gguf) |
+
+
+ *More quantizations will be uploaded soon. All current quants were created without an importance matrix.*
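
The split links in the table follow one naming pattern, so the full list for a quantization can be generated rather than copied by hand. The following is a minimal sketch, assuming the pattern seen in the table URLs (`grok-1-<quant>-split-<index>-of-<total>.gguf`, zero-padded to five digits). The names `BASE_URL` and `split_urls` are hypothetical helpers for illustration and are not part of this repository or of llama.cpp.

```python
# Hypothetical helper: the URL pattern is copied from the table in this README.
# BASE_URL and split_urls are illustrative names, not part of the repository.
BASE_URL = "https://huggingface.co/Arki05/Grok-1-GGUF/resolve/main"


def split_urls(quant: str, total: int = 9) -> list[str]:
    """Return the download URLs of every split for one quantization."""
    return [
        f"{BASE_URL}/grok-1-{quant}-split-{index:05d}-of-{total:05d}.gguf"
        for index in range(1, total + 1)
    ]


if __name__ == "__main__":
    # Print the nine Q2_K split URLs, one per line.
    for url in split_urls("Q2_K"):
        print(url)
```

The printed list can be fed to any download tool. Once all files are in one directory, the first split (`...-split-00001-of-00009.gguf`) is the one to pass to llama.cpp, as described under Updates.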