Update README.md
README.md
CHANGED
````diff
@@ -1,8 +1,7 @@
 ---
-base_model:
-- Undi95/Meta-Llama-3-8B-Instruct-hf
 language:
 - en
+- ko
 pipeline_tag: text-generation
 tags:
 - mergekit
@@ -17,7 +16,6 @@ license_name: llama3
 license_link: LICENSE
 extra_gated_prompt: >-
   ### META LLAMA 3 COMMUNITY LICENSE AGREEMENT
-
   Meta Llama 3 Version Release Date: April 18, 2024
 
   "Agreement" means the terms and conditions for use, reproduction, distribution and modification of the
@@ -190,15 +188,10 @@ extra_gated_fields:
 extra_gated_description: The information you provide will be collected, stored, processed and shared in accordance with the [Meta Privacy Policy](https://www.facebook.com/privacy/policy/).
 extra_gated_button_content: Submit
 ---
-
-# Meta-Llama-3-11.5B-Instruct
+# KoDolphin
 
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
 
-I had this idea at night that it would make sense to make a frankenmerge of Llama 3.. since we didn't get a 13B or 34B versions this time..
-
-Here's the same thing but for the base model: [mpasila/Meta-Llama-3-11.5B](https://huggingface.co/mpasila/Meta-Llama-3-11.5B/)
-
 ## Merge Details
 ### Merge Method
 
@@ -207,7 +200,8 @@ This model was merged using the passthrough merge method.
 ### Models Merged
 
 The following models were included in the merge:
-* [
+* [beomi/Llama-3-Open-Ko-8B-Instruct-preview](https://huggingface.co/beomi/Llama-3-Open-Ko-8B-Instruct-preview)
+* [cognitivecomputations/dolphin-2.9-llama3-8b](https://huggingface.co/cognitivecomputations/dolphin-2.9-llama3-8b)
 
 ### Configuration
 
@@ -216,12 +210,14 @@ The following YAML configuration was used to produce this model:
 ```yaml
 slices:
 - sources:
-
-
+  - model: beomi/Llama-3-Open-Ko-8B-Instruct-preview
+    layer_range: [0, 20] # Use foundational and intermediate language processing layers in Korean
 - sources:
-
-
-
-
+  - model: cognitivecomputations/dolphin-2.9-llama3-8b
+    layer_range: [15, 24] # Utilize advanced coding and domain-specific layers
+
+merge_method: passthrough # Direct combination of layers without transformation
+dtype: float16 # Efficient resource usage
+
 
-```
+```
````
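The passthrough method stacks the selected layer ranges back to back rather than averaging weights, so the merged model's depth is just the sum of the slice lengths. As a minimal sketch of how the two slices in the new config compose (illustrative Python, not mergekit's own code, and assuming mergekit's half-open `layer_range` convention where `[0, 20]` takes layers 0–19):

```python
# Sketch: how a passthrough merge assembles the final layer stack
# from the slice definitions in the YAML config above.
# layer_range is assumed half-open: [start, end) -> layers start..end-1.

slices = [
    ("beomi/Llama-3-Open-Ko-8B-Instruct-preview", (0, 20)),
    ("cognitivecomputations/dolphin-2.9-llama3-8b", (15, 24)),
]

# Concatenate the slices in order: (source model, source layer index) pairs.
merged_stack = [
    (model, layer)
    for model, (start, end) in slices
    for layer in range(start, end)
]

print(len(merged_stack))  # 29 transformer blocks (each 8B parent has 32)
print(merged_stack[19])   # last block taken from the Korean model
print(merged_stack[20])   # first block taken from dolphin, its layer 15
```

Note that layer indices 15–19 appear twice, once from each parent; that overlap is what makes a frankenmerge like this slightly shallower or deeper than its parents depending on the ranges chosen (here 20 + 9 = 29 blocks).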