---
base_model: v000000/MN-12B-Estrella-v1
library_name: transformers
tags:
- mergekit
- merge
- mistral
- llama-cpp
---
> [!WARNING]
> **Temperature:**<br>
> Mistral Nemo likes a low temperature, between 0.3 and 0.5.
<b>GGUF version.</b>
This model was converted to GGUF format from [`v000000/MN-12B-Estrella-v1`](https://huggingface.co/v000000/MN-12B-Estrella-v1) using llama.cpp.
Refer to the [original model card](https://huggingface.co/v000000/MN-12B-Estrella-v1) for more details on the model.
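The quantized file can be loaded with any llama.cpp-based runtime. Below is a minimal sketch using the `llama-cpp-python` bindings; the GGUF filename is a placeholder for whatever the file in this repo is actually named, and the temperature follows the 0.3&ndash;0.5 recommendation above.

```python
# Minimal sketch using llama-cpp-python (pip install llama-cpp-python).
# The model_path is a placeholder; substitute the actual GGUF filename
# downloaded from this repository.
from llama_cpp import Llama

llm = Llama(
    model_path="mistral-nemo-2407-12b-estrella-v1-q6_k.gguf",  # placeholder name
    n_ctx=8192,  # adjust context length to taste
)

output = llm(
    "[INST] What is your favourite condiment? [/INST]",
    max_tokens=256,
    temperature=0.4,  # low temperature, per the warning above
)
print(output["choices"][0]["text"])
```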
Mistral-Nemo-2407-12B-Estrella-v1-Q6_K-GGUF
---------------------------------------------------------------------
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64f74b6e6389380c77562762/MyveknmJhuj43YrukIDAU.png)
RP model. It seems coherent and concise while still being creative. A big merge built with the new DELLA technique.
<b>Prompt format: "Mistral Instruct" seems to work best, but ChatML might also work.</b>
```
[INST] System Message [/INST]
[INST] Name: Let's get started. Please respond based on the information and instructions provided above. [/INST]
<s>[INST] Name: What is your favourite condiment? [/INST]
AssistantName: Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!</s>
[INST] Name: Do you have mayonnaise recipes? [/INST]
```
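For programmatic use, a small helper can assemble this format. The function below is an illustrative, hypothetical helper (not part of any library) that mirrors the template above; BOS/EOS token handling may differ by runtime, so check what your backend inserts automatically.

```python
def mistral_instruct_prompt(system: str, history: list[tuple[str, str]], user: str) -> str:
    """Assemble a Mistral-Instruct-style prompt matching the template above.

    `history` is a list of (user_message, assistant_reply) pairs; completed
    assistant turns are closed with </s>, as in the example.
    """
    parts = [f"[INST] {system} [/INST]"]
    for user_turn, assistant_reply in history:
        parts.append(f"[INST] {user_turn} [/INST] {assistant_reply}</s>")
    parts.append(f"[INST] {user} [/INST]")
    return "\n".join(parts)
```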
----------------------------------------------------------------------
## Merge
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
## Merge Details
### Merge Method
This model was merged in multiple steps using the <b>DELLA</b>, <b>DELLA_LINEAR</b>, and <b>SLERP</b> merge algorithms.
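DELLA and DELLA_LINEAR prune and rescale each model's weight deltas before combining them, while SLERP interpolates along the great-circle arc between two models' weights rather than a straight line. As a rough illustration of the SLERP half only (not mergekit's actual implementation, which applies per-layer `t` schedules and filters), the core operation on a single pair of tensors looks like this:

```python
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors.

    Treats the flattened tensors as high-dimensional vectors and
    interpolates along the great circle between them; falls back to
    plain linear interpolation when the vectors are nearly colinear.
    """
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    a_norm = a_flat / (a_flat.norm() + eps)
    b_norm = b_flat / (b_flat.norm() + eps)
    dot = torch.clamp(a_norm @ b_norm, -1.0, 1.0)
    omega = torch.acos(dot)  # angle between the two weight vectors
    if omega.abs() < eps:    # nearly parallel: lerp is numerically safer
        return (1 - t) * a + t * b
    so = torch.sin(omega)
    res = (torch.sin((1 - t) * omega) / so) * a_flat + (torch.sin(t * omega) / so) * b_flat
    return res.reshape(a.shape).to(a.dtype)
```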
### Models Merged
The following models were included in the merge:
* [nothingiisreal/MN-12B-Celeste-V1.9](https://huggingface.co/nothingiisreal/MN-12B-Celeste-V1.9)
* [shuttleai/shuttle-2.5-mini](https://huggingface.co/shuttleai/shuttle-2.5-mini)
* [anthracite-org/magnum-12b-v2](https://huggingface.co/anthracite-org/magnum-12b-v2)
* [Sao10K/MN-12B-Lyra-v1](https://huggingface.co/Sao10K/MN-12B-Lyra-v1)
* [unsloth/Mistral-Nemo-Instruct-2407](https://huggingface.co/unsloth/Mistral-Nemo-Instruct-2407)
* [NeverSleep/Lumimaid-v0.2-12B](https://huggingface.co/NeverSleep/Lumimaid-v0.2-12B)
* [UsernameJustAnother/Nemo-12B-Marlin-v5](https://huggingface.co/UsernameJustAnother/Nemo-12B-Marlin-v5)
* [BeaverAI/mistral-doryV2-12b](https://huggingface.co/BeaverAI/mistral-doryV2-12b)
* [invisietch/Atlantis-v0.1-12B](https://huggingface.co/invisietch/Atlantis-v0.1-12B)
### Configuration
The following YAML configurations were used to produce this model, one document (separated by `---`) per merge step:
```yaml
# Step 1 (Part1)
models:
  - model: Sao10K/MN-12B-Lyra-v1
    parameters:
      weight: 0.15
      density: 0.77
  - model: shuttleai/shuttle-2.5-mini
    parameters:
      weight: 0.20
      density: 0.78
  - model: anthracite-org/magnum-12b-v2
    parameters:
      weight: 0.35
      density: 0.85
  - model: nothingiisreal/MN-12B-Celeste-V1.9
    parameters:
      weight: 0.55
      density: 0.90
merge_method: della
base_model: Sao10K/MN-12B-Lyra-v1
parameters:
  int8_mask: true
  epsilon: 0.05
  lambda: 1
dtype: bfloat16
---
# Step 2 (Part2)
models:
  - model: BeaverAI/mistral-doryV2-12b
    parameters:
      weight: 0.10
      density: 0.4
  - model: unsloth/Mistral-Nemo-Instruct-2407
    parameters:
      weight: 0.20
      density: 0.4
  - model: UsernameJustAnother/Nemo-12B-Marlin-v5
    parameters:
      weight: 0.25
      density: 0.5
  - model: invisietch/Atlantis-v0.1-12B
    parameters:
      weight: 0.3
      density: 0.5
  - model: NeverSleep/Lumimaid-v0.2-12B
    parameters:
      weight: 0.4
      density: 0.8
merge_method: della_linear
base_model: anthracite-org/magnum-12b-v2
parameters:
  int8_mask: true
  epsilon: 0.05
  lambda: 1
dtype: bfloat16
---
# Step 3 (Estrella)
slices:
  - sources:
      - model: v000000/MN-12B-Part2
        layer_range: [0, 40]
      - model: v000000/MN-12B-Part1
        layer_range: [0, 40]
merge_method: slerp
base_model: v000000/MN-12B-Part1
parameters: # smooth gradient, prioritizing Part1
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 0.6, 0.1, 0.6, 0.3, 0.8, 0.5]
    - filter: mlp
      value: [0, 0.5, 0.4, 0.3, 0, 0.3, 0.4, 0.7, 0.2, 0.5]
    - value: 0.5
dtype: bfloat16
``` |
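Each step above is a separate mergekit run: Step 1 produces the intermediate `MN-12B-Part1`, Step 2 produces `MN-12B-Part2`, and Step 3 SLERP-merges the two into Estrella. Assuming the documents are saved as separate files, each can be run with mergekit's CLI, e.g. `mergekit-yaml step1.yaml ./MN-12B-Part1` (the file and output names here are illustrative).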