removing ARM quants
- DeepSeek-V2.5-Q4_0_8_8/DeepSeek-V2.5-Q4_0_8_8-00001-of-00004.gguf +0 -3
- DeepSeek-V2.5-Q4_0_8_8/DeepSeek-V2.5-Q4_0_8_8-00002-of-00004.gguf +0 -3
- DeepSeek-V2.5-Q4_0_8_8/DeepSeek-V2.5-Q4_0_8_8-00003-of-00004.gguf +0 -3
- DeepSeek-V2.5-Q4_0_8_8/DeepSeek-V2.5-Q4_0_8_8-00004-of-00004.gguf +0 -3
- README.md +0 -7
DeepSeek-V2.5-Q4_0_8_8/DeepSeek-V2.5-Q4_0_8_8-00001-of-00004.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:5eebd6e6dfa8f5770fe170397854f090a6196f127a8568c0ae6848c743a39868
-size 39658925312
DeepSeek-V2.5-Q4_0_8_8/DeepSeek-V2.5-Q4_0_8_8-00002-of-00004.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:648bef998337f9868e04cd99414037370bd180d5f2f5beb840f80e691c4dd3cf
-size 39675124704
DeepSeek-V2.5-Q4_0_8_8/DeepSeek-V2.5-Q4_0_8_8-00003-of-00004.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:53ca9a75c289504ca24cb1af67bd1a4c5b16ba79d7ba09b0cea89f8ca94db18b
-size 39447549440
DeepSeek-V2.5-Q4_0_8_8/DeepSeek-V2.5-Q4_0_8_8-00004-of-00004.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:58cf99d2f5488a43247bc3ef5e1f543fabc304c9e15f61cc5bec616e7fce7971
-size 14130858304
README.md
CHANGED
@@ -32,7 +32,6 @@ Run them in [LM Studio](https://lmstudio.ai/)
 | [DeepSeek-V2.5-Q5_K_M.gguf](https://huggingface.co/bartowski/DeepSeek-V2.5-GGUF/tree/main/DeepSeek-V2.5-Q5_K_M) | Q5_K_M | 167.22GB | true | High quality, *recommended*. |
 | [DeepSeek-V2.5-Q4_K_M.gguf](https://huggingface.co/bartowski/DeepSeek-V2.5-GGUF/tree/main/DeepSeek-V2.5-Q4_K_M) | Q4_K_M | 142.45GB | true | Good quality, default size for must use cases, *recommended*. |
 | [DeepSeek-V2.5-Q4_0.gguf](https://huggingface.co/bartowski/DeepSeek-V2.5-GGUF/tree/main/DeepSeek-V2.5-Q4_0) | Q4_0 | 133.39GB | true | Legacy format, generally not worth using over similarly sized formats |
-| [DeepSeek-V2.5-Q4_0_8_8.gguf](https://huggingface.co/bartowski/DeepSeek-V2.5-GGUF/tree/main/DeepSeek-V2.5-Q4_0_8_8) | Q4_0_8_8 | 132.91GB | true | Optimized for ARM inference. Requires 'sve' support (see link below). |
 | [DeepSeek-V2.5-IQ4_XS.gguf](https://huggingface.co/bartowski/DeepSeek-V2.5-GGUF/tree/main/DeepSeek-V2.5-IQ4_XS) | IQ4_XS | 125.56GB | true | Decent quality, smaller than Q4_K_S with similar performance, *recommended*. |
 | [DeepSeek-V2.5-Q3_K_XL.gguf](https://huggingface.co/bartowski/DeepSeek-V2.5-GGUF/tree/main/DeepSeek-V2.5-Q3_K_XL) | Q3_K_XL | 122.83GB | true | Uses Q8_0 for embed and output weights. Lower quality but usable, good for low RAM availability. |
 | [DeepSeek-V2.5-Q3_K_L.gguf](https://huggingface.co/bartowski/DeepSeek-V2.5-GGUF/tree/main/DeepSeek-V2.5-Q3_K_L) | Q3_K_L | 122.37GB | true | Lower quality but usable, good for low RAM availability. |
@@ -76,12 +75,6 @@ huggingface-cli download bartowski/DeepSeek-V2.5-GGUF --include "DeepSeek-V2.5-Q
 
 You can either specify a new local-dir (DeepSeek-V2.5-Q8_0) or download them all in place (./)
 
-## Q4_0_X_X
-
-If you're using an ARM chip, the Q4_0_X_X quants will have a substantial speedup. Check out Q4_0_4_4 speed comparisons [on the original pull request](https://github.com/ggerganov/llama.cpp/pull/5780#pullrequestreview-21657544660)
-
-To check which one would work best for your ARM chip, you can check [AArch64 SoC features](https://gpages.juszkiewicz.com.pl/arm-socs-table/arm-socs.html) (thanks EloyOn!).
-
 ## Which file should I choose?
 
 A great write up with charts showing various performances is provided by Artefact2 [here](https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9)
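Reviewer note: each deleted Git LFS pointer in this commit records a `size` field, so the four shards can be cross-checked against the 132.91GB figure in the removed README row. The sketch below is only an illustration and is not part of the commit. It assumes the README's "GB" means 10^9 bytes, and the names `parse_lfs_pointer`, `total_size_bytes`, and `POINTERS` are hypothetical helpers introduced here.

```python
# Hypothetical helper: parse the "key value" lines of a Git LFS pointer file.
def parse_lfs_pointer(text: str) -> dict[str, str]:
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


# Pointer contents copied from the four deleted .gguf entries in this commit.
POINTERS = [
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:5eebd6e6dfa8f5770fe170397854f090a6196f127a8568c0ae6848c743a39868\n"
    "size 39658925312\n",
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:648bef998337f9868e04cd99414037370bd180d5f2f5beb840f80e691c4dd3cf\n"
    "size 39675124704\n",
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:53ca9a75c289504ca24cb1af67bd1a4c5b16ba79d7ba09b0cea89f8ca94db18b\n"
    "size 39447549440\n",
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:58cf99d2f5488a43247bc3ef5e1f543fabc304c9e15f61cc5bec616e7fce7971\n"
    "size 14130858304\n",
]


def total_size_bytes(pointers: list[str]) -> int:
    # Sum the byte sizes declared by each pointer.
    return sum(int(parse_lfs_pointer(p)["size"]) for p in pointers)


total = total_size_bytes(POINTERS)
# Assumption: decimal gigabytes (1 GB = 10**9 bytes).
print(f"{total} bytes = {total / 1e9:.2f} GB")  # 132912457760 bytes = 132.91 GB
```

Under that assumption the shard sizes sum to the README value, which suggests the removed table row and the four deleted pointers describe the same quant.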