Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)


Llama-3-Yggdrasil-2.0-8B - GGUF
- Model creator: https://huggingface.co/Locutusque/
- Original model: https://huggingface.co/Locutusque/Llama-3-Yggdrasil-2.0-8B/


| Name | Quant method | Size |
| ---- | ---- | ---- |
| [Llama-3-Yggdrasil-2.0-8B.Q2_K.gguf](https://huggingface.co/RichardErkhov/Locutusque_-_Llama-3-Yggdrasil-2.0-8B-gguf/blob/main/Llama-3-Yggdrasil-2.0-8B.Q2_K.gguf) | Q2_K | 2.96GB |
| [Llama-3-Yggdrasil-2.0-8B.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/Locutusque_-_Llama-3-Yggdrasil-2.0-8B-gguf/blob/main/Llama-3-Yggdrasil-2.0-8B.IQ3_XS.gguf) | IQ3_XS | 3.28GB |
| [Llama-3-Yggdrasil-2.0-8B.IQ3_S.gguf](https://huggingface.co/RichardErkhov/Locutusque_-_Llama-3-Yggdrasil-2.0-8B-gguf/blob/main/Llama-3-Yggdrasil-2.0-8B.IQ3_S.gguf) | IQ3_S | 3.43GB |
| [Llama-3-Yggdrasil-2.0-8B.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/Locutusque_-_Llama-3-Yggdrasil-2.0-8B-gguf/blob/main/Llama-3-Yggdrasil-2.0-8B.Q3_K_S.gguf) | Q3_K_S | 3.41GB |
| [Llama-3-Yggdrasil-2.0-8B.IQ3_M.gguf](https://huggingface.co/RichardErkhov/Locutusque_-_Llama-3-Yggdrasil-2.0-8B-gguf/blob/main/Llama-3-Yggdrasil-2.0-8B.IQ3_M.gguf) | IQ3_M | 3.52GB |
| [Llama-3-Yggdrasil-2.0-8B.Q3_K.gguf](https://huggingface.co/RichardErkhov/Locutusque_-_Llama-3-Yggdrasil-2.0-8B-gguf/blob/main/Llama-3-Yggdrasil-2.0-8B.Q3_K.gguf) | Q3_K | 3.74GB |
| [Llama-3-Yggdrasil-2.0-8B.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/Locutusque_-_Llama-3-Yggdrasil-2.0-8B-gguf/blob/main/Llama-3-Yggdrasil-2.0-8B.Q3_K_M.gguf) | Q3_K_M | 3.74GB |
| [Llama-3-Yggdrasil-2.0-8B.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/Locutusque_-_Llama-3-Yggdrasil-2.0-8B-gguf/blob/main/Llama-3-Yggdrasil-2.0-8B.Q3_K_L.gguf) | Q3_K_L | 4.03GB |
| [Llama-3-Yggdrasil-2.0-8B.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/Locutusque_-_Llama-3-Yggdrasil-2.0-8B-gguf/blob/main/Llama-3-Yggdrasil-2.0-8B.IQ4_XS.gguf) | IQ4_XS | 4.18GB |
| [Llama-3-Yggdrasil-2.0-8B.Q4_0.gguf](https://huggingface.co/RichardErkhov/Locutusque_-_Llama-3-Yggdrasil-2.0-8B-gguf/blob/main/Llama-3-Yggdrasil-2.0-8B.Q4_0.gguf) | Q4_0 | 4.34GB |
| [Llama-3-Yggdrasil-2.0-8B.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/Locutusque_-_Llama-3-Yggdrasil-2.0-8B-gguf/blob/main/Llama-3-Yggdrasil-2.0-8B.IQ4_NL.gguf) | IQ4_NL | 4.38GB |
| [Llama-3-Yggdrasil-2.0-8B.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/Locutusque_-_Llama-3-Yggdrasil-2.0-8B-gguf/blob/main/Llama-3-Yggdrasil-2.0-8B.Q4_K_S.gguf) | Q4_K_S | 4.37GB |
| [Llama-3-Yggdrasil-2.0-8B.Q4_K.gguf](https://huggingface.co/RichardErkhov/Locutusque_-_Llama-3-Yggdrasil-2.0-8B-gguf/blob/main/Llama-3-Yggdrasil-2.0-8B.Q4_K.gguf) | Q4_K | 4.58GB |
| [Llama-3-Yggdrasil-2.0-8B.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/Locutusque_-_Llama-3-Yggdrasil-2.0-8B-gguf/blob/main/Llama-3-Yggdrasil-2.0-8B.Q4_K_M.gguf) | Q4_K_M | 4.58GB |
| [Llama-3-Yggdrasil-2.0-8B.Q4_1.gguf](https://huggingface.co/RichardErkhov/Locutusque_-_Llama-3-Yggdrasil-2.0-8B-gguf/blob/main/Llama-3-Yggdrasil-2.0-8B.Q4_1.gguf) | Q4_1 | 4.78GB |
| [Llama-3-Yggdrasil-2.0-8B.Q5_0.gguf](https://huggingface.co/RichardErkhov/Locutusque_-_Llama-3-Yggdrasil-2.0-8B-gguf/blob/main/Llama-3-Yggdrasil-2.0-8B.Q5_0.gguf) | Q5_0 | 5.21GB |
| [Llama-3-Yggdrasil-2.0-8B.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/Locutusque_-_Llama-3-Yggdrasil-2.0-8B-gguf/blob/main/Llama-3-Yggdrasil-2.0-8B.Q5_K_S.gguf) | Q5_K_S | 5.21GB |
| [Llama-3-Yggdrasil-2.0-8B.Q5_K.gguf](https://huggingface.co/RichardErkhov/Locutusque_-_Llama-3-Yggdrasil-2.0-8B-gguf/blob/main/Llama-3-Yggdrasil-2.0-8B.Q5_K.gguf) | Q5_K | 5.34GB |
| [Llama-3-Yggdrasil-2.0-8B.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/Locutusque_-_Llama-3-Yggdrasil-2.0-8B-gguf/blob/main/Llama-3-Yggdrasil-2.0-8B.Q5_K_M.gguf) | Q5_K_M | 5.34GB |
| [Llama-3-Yggdrasil-2.0-8B.Q5_1.gguf](https://huggingface.co/RichardErkhov/Locutusque_-_Llama-3-Yggdrasil-2.0-8B-gguf/blob/main/Llama-3-Yggdrasil-2.0-8B.Q5_1.gguf) | Q5_1 | 5.65GB |
| [Llama-3-Yggdrasil-2.0-8B.Q6_K.gguf](https://huggingface.co/RichardErkhov/Locutusque_-_Llama-3-Yggdrasil-2.0-8B-gguf/blob/main/Llama-3-Yggdrasil-2.0-8B.Q6_K.gguf) | Q6_K | 6.14GB |
| [Llama-3-Yggdrasil-2.0-8B.Q8_0.gguf](https://huggingface.co/RichardErkhov/Locutusque_-_Llama-3-Yggdrasil-2.0-8B-gguf/blob/main/Llama-3-Yggdrasil-2.0-8B.Q8_0.gguf) | Q8_0 | 7.95GB |
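
As a quick usage sketch (not part of the original upload): the snippet below downloads one of the files above with `huggingface_hub` and runs it through `llama-cpp-python`. The repo id is inferred from the table links, any filename from the Name column can be substituted, and Q4_K_M is chosen here as a common size/quality trade-off.

```python
# Usage sketch, assuming `pip install huggingface_hub llama-cpp-python`.
# The repo id is inferred from the download links above; it is not stated
# explicitly in the original card.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Fetch one quantized file; any filename from the Name column works.
model_path = hf_hub_download(
    repo_id="RichardErkhov/Locutusque_-_Llama-3-Yggdrasil-2.0-8B-gguf",
    filename="Llama-3-Yggdrasil-2.0-8B.Q4_K_M.gguf",
)

# n_ctx sets the context window; n_gpu_layers=-1 offloads every layer
# to the GPU when llama.cpp was built with GPU support.
llm = Llama(model_path=model_path, n_ctx=4096, n_gpu_layers=-1)

out = llm("Q: Name the planets in the solar system. A:", max_tokens=64)
print(out["choices"][0]["text"])
```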


Original model description:
---
library_name: transformers
tags:
- mergekit
- merge
base_model:
- Locutusque/Llama-3-NeuralHercules-5.0-8B
- NousResearch/Meta-Llama-3-8B
- NousResearch/Hermes-2-Theta-Llama-3-8B
- Locutusque/llama-3-neural-chat-v2.2-8b
model-index:
- name: Llama-3-Yggdrasil-2.0-8B
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 53.71
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Locutusque/Llama-3-Yggdrasil-2.0-8B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 26.92
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Locutusque/Llama-3-Yggdrasil-2.0-8B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 6.87
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Locutusque/Llama-3-Yggdrasil-2.0-8B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 1.68
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Locutusque/Llama-3-Yggdrasil-2.0-8B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 8.07
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Locutusque/Llama-3-Yggdrasil-2.0-8B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 24.07
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Locutusque/Llama-3-Yggdrasil-2.0-8B
      name: Open LLM Leaderboard
---
# merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged with the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, using [NousResearch/Meta-Llama-3-8B](https://huggingface.co/NousResearch/Meta-Llama-3-8B) as the base.

### Models Merged

The following models were included in the merge:
* [Locutusque/Llama-3-NeuralHercules-5.0-8B](https://huggingface.co/Locutusque/Llama-3-NeuralHercules-5.0-8B)
* [NousResearch/Hermes-2-Theta-Llama-3-8B](https://huggingface.co/NousResearch/Hermes-2-Theta-Llama-3-8B)
* [Locutusque/llama-3-neural-chat-v2.2-8b](https://huggingface.co/Locutusque/llama-3-neural-chat-v2.2-8b)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: NousResearch/Meta-Llama-3-8B
    # No parameters necessary for base model
  - model: NousResearch/Hermes-2-Theta-Llama-3-8B
    parameters:
      density: 0.6
      weight: 0.55
  - model: Locutusque/llama-3-neural-chat-v2.2-8b
    parameters:
      density: 0.55
      weight: 0.4
  - model: Locutusque/Llama-3-NeuralHercules-5.0-8B
    parameters:
      density: 0.65
      weight: 0.6

merge_method: dare_ties
base_model: NousResearch/Meta-Llama-3-8B
parameters:
  int8_mask: true
dtype: bfloat16
```
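
To reproduce a merge from a config like this, mergekit ships a `mergekit-yaml` entry point that takes the config path and an output directory. A minimal sketch, assuming mergekit is installed (`pip install mergekit`) and the YAML above is saved as `yggdrasil.yml` (a placeholder filename):

```python
# Sketch: invoke mergekit's CLI on the config above. `yggdrasil.yml` and
# the output directory are placeholder names, not from the original card.
import subprocess

subprocess.run(
    ["mergekit-yaml", "yggdrasil.yml", "./Llama-3-Yggdrasil-2.0-8B-merged"],
    check=True,  # raise CalledProcessError if the merge fails
)
```

Note that all four source checkpoints are downloaded during the merge, so allow for the corresponding disk space.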

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Locutusque__Llama-3-Yggdrasil-2.0-8B)

| Metric              | Value |
|---------------------|------:|
| Avg.                | 20.22 |
| IFEval (0-Shot)     | 53.71 |
| BBH (3-Shot)        | 26.92 |
| MATH Lvl 5 (4-Shot) |  6.87 |
| GPQA (0-shot)       |  1.68 |
| MuSR (0-shot)       |  8.07 |
| MMLU-PRO (5-shot)   | 24.07 |
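
As a sanity check, the Avg. row is simply the arithmetic mean of the six benchmark scores:

```python
# The Avg. row above is the arithmetic mean of the six benchmark scores.
scores = [53.71, 26.92, 6.87, 1.68, 8.07, 24.07]
print(round(sum(scores) / len(scores), 2))  # 20.22
```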