---
base_model:
- v000000/MN-12B-Part1
- v000000/MN-12B-Part2
library_name: transformers
tags:
- mergekit
- merge
- mistral
---
<style>
h1 {
color: #327fa8; /* Fallback color (steel blue) */
font-size: 1.25em; /* Larger font size */
text-align: left; /* Left alignment */
text-shadow: 2px 2px 4px rgba(0, 0, 0, 0.5); /* Shadow effect */
background: linear-gradient(90deg, #327fa8, #fba8a8); /* Gradient background */
-webkit-background-clip: text; /* Clipping the background to text */
-webkit-text-fill-color: transparent; /* Making the text transparent */
}
</style>
> [!WARNING]
> **Temperature:**<br>
> Mistral Nemo likes a low temperature, between 0.3 and 0.5.
Mistral-Nemo-2407-12B-Estrella-v1
---------------------------------------------------------------------
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64f74b6e6389380c77562762/MyveknmJhuj43YrukIDAU.png)
RP model. Seems coherent and concise, but also creative. A big merge using the new DELLA technique.
<b>Prompt Format: Seems best with "Mistral Instruct", but ChatML might also work.</b>
```
<s>[INST] System Message [/INST]
[INST] Name: Let's get started. Please respond based on the information and instructions provided above. [/INST]
[INST] Name: What is your favourite condiment? [/INST]
AssistantName: Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!</s>
[INST] Name: Do you have mayonnaise recipes? [/INST]
```
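As a minimal sketch of using this format together with the low-temperature advice above via transformers (the repo id and generation settings here are assumptions, not tested recommendations):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id for the merged weights; adjust to the actual location.
repo = "v000000/MN-12B-Estrella-v1"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo, torch_dtype=torch.bfloat16, device_map="auto"
)

# apply_chat_template renders the [INST] ... [/INST] turns shown above.
messages = [{"role": "user", "content": "Name: What is your favourite condiment?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(
    inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.4,  # low temperature, per the warning above
)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```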
<h1>Quants</h1>
* [Q6_K GGUF](https://huggingface.co/v000000/MN-12B-Estrella-v1-Q6_K-GGUF)
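A rough sketch of running the Q6_K GGUF locally with llama-cpp-python (the local filename and context size are assumptions):

```python
from llama_cpp import Llama

# Hypothetical local filename; download the GGUF from the repo linked above.
llm = Llama(model_path="mn-12b-estrella-v1-q6_k.gguf", n_ctx=8192)

out = llm(
    "[INST] Name: What is your favourite condiment? [/INST]",
    max_tokens=128,
    temperature=0.4,  # low temperature, per the warning above
    stop=["</s>", "[INST]"],
)
print(out["choices"][0]["text"])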
----------------------------------------------------------------------
## Merge
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
## Merge Details
### Merge Method
This model was merged in multiple steps using the <b>DELLA</b>, <b>DELLA_LINEAR</b> and <b>SLERP</b> merge algorithms.
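For intuition, DELLA prunes each model's delta (fine-tuned minus base) weights by magnitude-based sampling before fusing: low-magnitude deltas are dropped with higher probability, and survivors are rescaled to compensate. Below is a loose illustrative sketch of that idea, not mergekit's actual implementation; the `density` parameter corresponds to the values in the configs further down.

```python
import numpy as np

rng = np.random.default_rng(0)

def della_style_drop(delta: np.ndarray, density: float) -> np.ndarray:
    """Loose sketch of DELLA-style magnitude-based sampling: keep roughly
    `density` of the delta weights, favouring high-magnitude entries, and
    rescale survivors by the inverse keep-probability to stay unbiased."""
    mag = np.abs(delta)
    # Keep-probabilities proportional to magnitude, averaging to `density`.
    p_keep = np.clip(mag / (mag.mean() + 1e-12) * density, 0.0, 1.0)
    mask = rng.random(delta.shape) < p_keep
    return np.where(mask, delta / np.maximum(p_keep, 1e-12), 0.0)
```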
### Models Merged
The following models were included in the merge:
* [nothingiisreal/MN-12B-Celeste-V1.9](https://huggingface.co/nothingiisreal/MN-12B-Celeste-V1.9)
* [shuttleai/shuttle-2.5-mini](https://huggingface.co/shuttleai/shuttle-2.5-mini)
* [anthracite-org/magnum-12b-v2](https://huggingface.co/anthracite-org/magnum-12b-v2)
* [Sao10K/MN-12B-Lyra-v1](https://huggingface.co/Sao10K/MN-12B-Lyra-v1)
* [unsloth/Mistral-Nemo-Instruct-2407](https://huggingface.co/unsloth/Mistral-Nemo-Instruct-2407)
* [NeverSleep/Lumimaid-v0.2-12B](https://huggingface.co/NeverSleep/Lumimaid-v0.2-12B)
* [UsernameJustAnother/Nemo-12B-Marlin-v5](https://huggingface.co/UsernameJustAnother/Nemo-12B-Marlin-v5)
* [BeaverAI/mistral-doryV2-12b](https://huggingface.co/BeaverAI/mistral-doryV2-12b)
* [invisietch/Atlantis-v0.1-12B](https://huggingface.co/invisietch/Atlantis-v0.1-12B)
### Configuration
The following YAML configurations were used to produce this model. Each step is a separate mergekit run; steps 1 and 2 produce the intermediate models v000000/MN-12B-Part1 and v000000/MN-12B-Part2, which step 3 then combines.

Step 1 (Part1):
```yaml
models:
  - model: Sao10K/MN-12B-Lyra-v1
    parameters:
      weight: 0.15
      density: 0.77
  - model: shuttleai/shuttle-2.5-mini
    parameters:
      weight: 0.20
      density: 0.78
  - model: anthracite-org/magnum-12b-v2
    parameters:
      weight: 0.35
      density: 0.85
  - model: nothingiisreal/MN-12B-Celeste-V1.9
    parameters:
      weight: 0.55
      density: 0.90
merge_method: della
base_model: Sao10K/MN-12B-Lyra-v1
parameters:
  int8_mask: true
  epsilon: 0.05
  lambda: 1
dtype: bfloat16
```
Step 2 (Part2):
```yaml
models:
  - model: BeaverAI/mistral-doryV2-12b
    parameters:
      weight: 0.10
      density: 0.4
  - model: unsloth/Mistral-Nemo-Instruct-2407
    parameters:
      weight: 0.20
      density: 0.4
  - model: UsernameJustAnother/Nemo-12B-Marlin-v5
    parameters:
      weight: 0.25
      density: 0.5
  - model: invisietch/Atlantis-v0.1-12B
    parameters:
      weight: 0.3
      density: 0.5
  - model: NeverSleep/Lumimaid-v0.2-12B
    parameters:
      weight: 0.4
      density: 0.8
merge_method: della_linear
base_model: anthracite-org/magnum-12b-v2
parameters:
  int8_mask: true
  epsilon: 0.05
  lambda: 1
dtype: bfloat16
```
Step 3 (Estrella):
```yaml
slices:
  - sources:
      - model: v000000/MN-12B-Part2
        layer_range: [0, 40]
      - model: v000000/MN-12B-Part1
        layer_range: [0, 40]
merge_method: slerp
base_model: v000000/MN-12B-Part1
parameters: # smooth gradient, prioritizing Part1
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 0.6, 0.1, 0.6, 0.3, 0.8, 0.5]
    - filter: mlp
      value: [0, 0.5, 0.4, 0.3, 0, 0.3, 0.4, 0.7, 0.2, 0.5]
    - value: 0.5
dtype: bfloat16
```
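For reference, step 3's SLERP follows the standard spherical interpolation formula, and, to my understanding (this is an assumption about mergekit's behaviour, not documented here), the ten-value `t` gradient lists above are stretched across the 40 layers by linear interpolation between the anchor values. A sketch of both:

```python
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two weight tensors."""
    a, b = v0.ravel(), v1.ravel()
    cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
    if theta < eps:  # nearly parallel: fall back to plain linear interpolation
        return (1.0 - t) * v0 + t * v1
    s = np.sin(theta)
    return (np.sin((1.0 - t) * theta) / s) * v0 + (np.sin(t * theta) / s) * v1

# Expand the ten self_attn anchors across 40 layers (assumed linear interpolation).
anchors = [0, 0.5, 0.3, 0.7, 0.6, 0.1, 0.6, 0.3, 0.8, 0.5]
per_layer_t = np.interp(
    np.linspace(0, 1, 40), np.linspace(0, 1, len(anchors)), anchors
)
print(per_layer_t.round(2))
```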