nsfw

Not-For-All-Audiences

llama-3

text-generation-inference

Mixture of Experts

Model card Files Files and versions Community

Llama-Salad-4x8B-V3

File size: 6,399 Bytes

---
license: llama3
library_name: transformers
tags:
- nsfw
- not-for-all-audiences
- llama-3
- text-generation-inference
- moe
- mergekit
- merge
model-index:
- name: Llama-Salad-4x8B-V3
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 66.54
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HiroseKoichi/Llama-Salad-4x8B-V3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 31.93
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HiroseKoichi/Llama-Salad-4x8B-V3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 8.53
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HiroseKoichi/Llama-Salad-4x8B-V3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 7.05
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HiroseKoichi/Llama-Salad-4x8B-V3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 6.45
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HiroseKoichi/Llama-Salad-4x8B-V3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 27.98
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HiroseKoichi/Llama-Salad-4x8B-V3
      name: Open LLM Leaderboard
---

# Llama-Salad-4x8B-V3
Changes in V3:
- Uses `L3-8B-Stheno-v3.2` as the base model instead of `Meta-Llama-3-8B-Instruct`
- Removed `opus-v1.2-llama-3-8b-instruct-run3.5-epoch2.5` and added `Einstein-v6.1-Llama3-8B`
- Swapped `Llama-3-Soliloquy-8B-v2` for `L3-8B-Stheno-v3.2`

I was clearly wrong when I said V2 would be difficult to improve on, because V3 is significantly better in just about every aspect. Stheno-v3.2 fixed all of the issues present in Stheno-v3.1, making it my favorite roleplay model and the best base model for llama-3 MoE merges.

The one thing I do want to improve on is finding a better conversational model than Meta-Llama-3-8B-Instruct; it's good for that use case, but I'm sure there's a better one out there. I tried using llama-3-cat-8b-instruct-v1, but it absolutely tanked the model's situational awareness and kept making blatantly contradictory statements.

# Quantization Formats
**GGUF**
- Static:
    - https://huggingface.co/mradermacher/Llama-Salad-4x8B-V3-GGUF
- Imatrix:
    - https://huggingface.co/mradermacher/Llama-Salad-4x8B-V3-i1-GGUF

# Details
- **License**: [llama3](https://llama.meta.com/llama3/license/)
- **Instruct Format**: [llama-3](https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3/)
- **Context Size**: 8K

## Models Used
- [L3-8B-Stheno-v3.2](https://huggingface.co/Sao10K/L3-8B-Stheno-v3.2)
- [Meta-Llama-3-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct)
- [Llama-3-8B-Synthia-v3.5](https://huggingface.co/migtissera/Llama-3-8B-Synthia-v3.5)
- [Einstein-v6.1-Llama3-8B](https://huggingface.co/Weyaxi/Einstein-v6.1-Llama3-8B)

## Merge Config
```yaml
base_model: Sao10K/L3-8B-Stheno-v3.2
gate_mode: hidden
dtype: bfloat16
experts_per_token: 2
experts:
  - source_model: NousResearch/Meta-Llama-3-8B-Instruct
    positive_prompts:
    - "chat"
    - "conversation"
  - source_model: Weyaxi/Einstein-v6.1-Llama3-8B
    positive_prompts:
    - "science"
    - "physics"
    - "chemistry"
    - "biology"
    - "math"
    - "step-by-step"
    - "logical reasoning"
    - "multilingual"
    - "translation"
    - "language translation"
    - "foreign language"
    negative_prompts:
    - "programming language"
  - source_model: migtissera/Llama-3-8B-Synthia-v3.5
    positive_prompts:
    - "summarize"
    - "paraphrase"
    - "list"
    - "explain"
    - "define"
    - "analyze"
    - "rephrase"
    - "elaborate"
    - "programming language"
    - "JavaScript"
    - "Python programming language"
    - "Rust programming language"
    - "C++ programming language"
    - "GO programming language"
    - "Ruby programming language"
    - "Haskell programming language"
    - "SQL query language"
    - "CSS markup styling language"
    - "code"
  - source_model: Sao10K/L3-8B-Stheno-v3.2
    positive_prompts:
    - "characters"
    - "scene"
    - "roleplay"
    - "erotic roleplay"
    - "sexual fetish"
    - "NSFW"
    - "creative writing"
    - "storytelling"
    - "narration"
    - "narrative setting"
    - "narrative plot"
    - "narrative exposition"
    - "narrative theme"
    - "narrative climax"
```
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_HiroseKoichi__Llama-Salad-4x8B-V3)

|      Metric       |Value|
|-------------------|----:|
|Avg.               |24.75|
|IFEval (0-Shot)    |66.54|
|BBH (3-Shot)       |31.93|
|MATH Lvl 5 (4-Shot)| 8.53|
|GPQA (0-shot)      | 7.05|
|MuSR (0-shot)      | 6.45|
|MMLU-PRO (5-shot)  |27.98|