HiroseKoichi/Llama-Salad-4x8B-V3

HiroseKoichi committed on Jun 18

Commit 470c09b, 1 Parent(s): 6aac349

Update README.md

Files changed (1)

README.md +4 -0

@@ -17,6 +17,10 @@ Changes in V3:
 - Removed `opus-v1.2-llama-3-8b-instruct-run3.5-epoch2.5` and added `Einstein-v6.1-Llama3-8B`
 - Swapped `Llama-3-Soliloquy-8B-v2` for `L3-8B-Stheno-v3.2`
+I was clearly wrong when I said V2 would be difficult to improve on, because V3 is significantly better in just about every aspect. Stheno-v3.2 fixed all of the issues present in Stheno-v3.1, making it my favorite roleplay model and the best base model for llama-3 MoE merges.
+The one thing I do want to improve on is finding a better conversational model than Meta-Llama-3-8B-Instruct; it's good for that use case, but I'm sure there's a better one out there. I tried using llama-3-cat-8b-instruct-v1, but it absolutely tanked the model's situational awareness and kept making blatantly contradictory statements.
 # Details
 - **License**: [llama3](https://llama.meta.com/llama3/license/)
 - **Instruct Format**: [llama-3](https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3/)