bartowski committed on
Commit
c5fe438
1 Parent(s): b2a1d28

removing ARM quants

DeepSeek-V2.5-Q4_0_8_8/DeepSeek-V2.5-Q4_0_8_8-00001-of-00004.gguf DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:5eebd6e6dfa8f5770fe170397854f090a6196f127a8568c0ae6848c743a39868
- size 39658925312
 
 
 
 
DeepSeek-V2.5-Q4_0_8_8/DeepSeek-V2.5-Q4_0_8_8-00002-of-00004.gguf DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:648bef998337f9868e04cd99414037370bd180d5f2f5beb840f80e691c4dd3cf
- size 39675124704
 
 
 
 
DeepSeek-V2.5-Q4_0_8_8/DeepSeek-V2.5-Q4_0_8_8-00003-of-00004.gguf DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:53ca9a75c289504ca24cb1af67bd1a4c5b16ba79d7ba09b0cea89f8ca94db18b
- size 39447549440
 
 
 
 
DeepSeek-V2.5-Q4_0_8_8/DeepSeek-V2.5-Q4_0_8_8-00004-of-00004.gguf DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:58cf99d2f5488a43247bc3ef5e1f543fabc304c9e15f61cc5bec616e7fce7971
- size 14130858304
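
Each deleted file above is a Git LFS pointer: three `key value` lines giving the spec version, the SHA-256 object ID, and the size in bytes. As a rough sketch (the helper name `parse_lfs_pointer` is hypothetical and not part of any Git LFS tooling), the pointers can be parsed and the four shard sizes summed. This assumes the README reports sizes in decimal gigabytes, in which case the total should match the 132.91GB listed for Q4_0_8_8:

```python
def parse_lfs_pointer(text: str) -> dict[str, str]:
    """Split each 'key value' line of an LFS pointer into a dict entry."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


# Pointer content of the fourth shard, copied from the diff without the "- " prefix.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:58cf99d2f5488a43247bc3ef5e1f543fabc304c9e15f61cc5bec616e7fce7971
size 14130858304
"""
info = parse_lfs_pointer(pointer)

# Sizes (bytes) of the four deleted shards, taken from the pointers in this commit.
shard_sizes = [39658925312, 39675124704, 39447549440, 14130858304]
total_bytes = sum(shard_sizes)
total_gb = total_bytes / 1e9  # assumption: decimal GB, as the README table appears to use

print(info["oid"], int(info["size"]))
print(f"{total_gb:.2f} GB")
```

Under that assumption, the commit removes roughly 133 GB of LFS-tracked data from the repository.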
 
 
 
 
README.md CHANGED
@@ -32,7 +32,6 @@ Run them in [LM Studio](https://lmstudio.ai/)
  | [DeepSeek-V2.5-Q5_K_M.gguf](https://huggingface.co/bartowski/DeepSeek-V2.5-GGUF/tree/main/DeepSeek-V2.5-Q5_K_M) | Q5_K_M | 167.22GB | true | High quality, *recommended*. |
  | [DeepSeek-V2.5-Q4_K_M.gguf](https://huggingface.co/bartowski/DeepSeek-V2.5-GGUF/tree/main/DeepSeek-V2.5-Q4_K_M) | Q4_K_M | 142.45GB | true | Good quality, default size for must use cases, *recommended*. |
  | [DeepSeek-V2.5-Q4_0.gguf](https://huggingface.co/bartowski/DeepSeek-V2.5-GGUF/tree/main/DeepSeek-V2.5-Q4_0) | Q4_0 | 133.39GB | true | Legacy format, generally not worth using over similarly sized formats |
- | [DeepSeek-V2.5-Q4_0_8_8.gguf](https://huggingface.co/bartowski/DeepSeek-V2.5-GGUF/tree/main/DeepSeek-V2.5-Q4_0_8_8) | Q4_0_8_8 | 132.91GB | true | Optimized for ARM inference. Requires 'sve' support (see link below). |
  | [DeepSeek-V2.5-IQ4_XS.gguf](https://huggingface.co/bartowski/DeepSeek-V2.5-GGUF/tree/main/DeepSeek-V2.5-IQ4_XS) | IQ4_XS | 125.56GB | true | Decent quality, smaller than Q4_K_S with similar performance, *recommended*. |
  | [DeepSeek-V2.5-Q3_K_XL.gguf](https://huggingface.co/bartowski/DeepSeek-V2.5-GGUF/tree/main/DeepSeek-V2.5-Q3_K_XL) | Q3_K_XL | 122.83GB | true | Uses Q8_0 for embed and output weights. Lower quality but usable, good for low RAM availability. |
  | [DeepSeek-V2.5-Q3_K_L.gguf](https://huggingface.co/bartowski/DeepSeek-V2.5-GGUF/tree/main/DeepSeek-V2.5-Q3_K_L) | Q3_K_L | 122.37GB | true | Lower quality but usable, good for low RAM availability. |
@@ -76,12 +75,6 @@ huggingface-cli download bartowski/DeepSeek-V2.5-GGUF --include "DeepSeek-V2.5-Q
 
  You can either specify a new local-dir (DeepSeek-V2.5-Q8_0) or download them all in place (./)
 
- ## Q4_0_X_X
-
- If you're using an ARM chip, the Q4_0_X_X quants will have a substantial speedup. Check out Q4_0_4_4 speed comparisons [on the original pull request](https://github.com/ggerganov/llama.cpp/pull/5780#pullrequestreview-21657544660)
-
- To check which one would work best for your ARM chip, you can check [AArch64 SoC features](https://gpages.juszkiewicz.com.pl/arm-socs-table/arm-socs.html) (thanks EloyOn!).
-
  ## Which file should I choose?
 
  A great write up with charts showing various performances is provided by Artefact2 [here](https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9)
 