File size: 6,399 Bytes
3aaba32 025a422 3aaba32 470c09b a343915 3aaba32 ccb1bac 3aaba32 025a422 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 |
---
license: llama3
library_name: transformers
tags:
- nsfw
- not-for-all-audiences
- llama-3
- text-generation-inference
- moe
- mergekit
- merge
model-index:
- name: Llama-Salad-4x8B-V3
results:
- task:
type: text-generation
name: Text Generation
dataset:
name: IFEval (0-Shot)
type: HuggingFaceH4/ifeval
args:
num_few_shot: 0
metrics:
- type: inst_level_strict_acc and prompt_level_strict_acc
value: 66.54
name: strict accuracy
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HiroseKoichi/Llama-Salad-4x8B-V3
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: BBH (3-Shot)
type: BBH
args:
num_few_shot: 3
metrics:
- type: acc_norm
value: 31.93
name: normalized accuracy
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HiroseKoichi/Llama-Salad-4x8B-V3
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MATH Lvl 5 (4-Shot)
type: hendrycks/competition_math
args:
num_few_shot: 4
metrics:
- type: exact_match
value: 8.53
name: exact match
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HiroseKoichi/Llama-Salad-4x8B-V3
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: GPQA (0-shot)
type: Idavidrein/gpqa
args:
num_few_shot: 0
metrics:
- type: acc_norm
value: 7.05
name: acc_norm
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HiroseKoichi/Llama-Salad-4x8B-V3
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MuSR (0-shot)
type: TAUR-Lab/MuSR
args:
num_few_shot: 0
metrics:
- type: acc_norm
value: 6.45
name: acc_norm
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HiroseKoichi/Llama-Salad-4x8B-V3
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MMLU-PRO (5-shot)
type: TIGER-Lab/MMLU-Pro
config: main
split: test
args:
num_few_shot: 5
metrics:
- type: acc
value: 27.98
name: accuracy
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HiroseKoichi/Llama-Salad-4x8B-V3
name: Open LLM Leaderboard
---
# Llama-Salad-4x8B-V3
Changes in V3:
- Uses `L3-8B-Stheno-v3.2` as the base model instead of `Meta-Llama-3-8B-Instruct`
- Removed `opus-v1.2-llama-3-8b-instruct-run3.5-epoch2.5` and added `Einstein-v6.1-Llama3-8B`
- Swapped `Llama-3-Soliloquy-8B-v2` for `L3-8B-Stheno-v3.2`
I was clearly wrong when I said V2 would be difficult to improve on, because V3 is significantly better in just about every aspect. Stheno-v3.2 fixed all of the issues present in Stheno-v3.1, making it my favorite roleplay model and the best base model for llama-3 MoE merges.
The one thing I do want to improve on is finding a better conversational model than Meta-Llama-3-8B-Instruct; it's good for that use case, but I'm sure there's a better one out there. I tried using llama-3-cat-8b-instruct-v1, but it absolutely tanked the model's situational awareness and kept making blatantly contradictory statements.
# Quantization Formats
**GGUF**
- Static:
- https://huggingface.co/mradermacher/Llama-Salad-4x8B-V3-GGUF
- Imatrix:
- https://huggingface.co/mradermacher/Llama-Salad-4x8B-V3-i1-GGUF
# Details
- **License**: [llama3](https://llama.meta.com/llama3/license/)
- **Instruct Format**: [llama-3](https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3/)
- **Context Size**: 8K
## Models Used
- [L3-8B-Stheno-v3.2](https://huggingface.co/Sao10K/L3-8B-Stheno-v3.2)
- [Meta-Llama-3-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct)
- [Llama-3-8B-Synthia-v3.5](https://huggingface.co/migtissera/Llama-3-8B-Synthia-v3.5)
- [Einstein-v6.1-Llama3-8B](https://huggingface.co/Weyaxi/Einstein-v6.1-Llama3-8B)
## Merge Config
```yaml
base_model: Sao10K/L3-8B-Stheno-v3.2
gate_mode: hidden
dtype: bfloat16
experts_per_token: 2
experts:
- source_model: NousResearch/Meta-Llama-3-8B-Instruct
positive_prompts:
- "chat"
- "conversation"
- source_model: Weyaxi/Einstein-v6.1-Llama3-8B
positive_prompts:
- "science"
- "physics"
- "chemistry"
- "biology"
- "math"
- "step-by-step"
- "logical reasoning"
- "multilingual"
- "translation"
- "language translation"
- "foreign language"
negative_prompts:
- "programming language"
- source_model: migtissera/Llama-3-8B-Synthia-v3.5
positive_prompts:
- "summarize"
- "paraphrase"
- "list"
- "explain"
- "define"
- "analyze"
- "rephrase"
- "elaborate"
- "programming language"
- "JavaScript"
- "Python programming language"
- "Rust programming language"
- "C++ programming language"
- "GO programming language"
- "Ruby programming language"
- "Haskell programming language"
- "SQL query language"
- "CSS markup styling language"
- "code"
- source_model: Sao10K/L3-8B-Stheno-v3.2
positive_prompts:
- "characters"
- "scene"
- "roleplay"
- "erotic roleplay"
- "sexual fetish"
- "NSFW"
- "creative writing"
- "storytelling"
- "narration"
- "narrative setting"
- "narrative plot"
- "narrative exposition"
- "narrative theme"
- "narrative climax"
```
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_HiroseKoichi__Llama-Salad-4x8B-V3)
| Metric |Value|
|-------------------|----:|
|Avg. |24.75|
|IFEval (0-Shot) |66.54|
|BBH (3-Shot) |31.93|
|MATH Lvl 5 (4-Shot)| 8.53|
|GPQA (0-shot) | 7.05|
|MuSR (0-shot) | 6.45|
|MMLU-PRO (5-shot) |27.98|
|