---
license: apache-2.0
---

# Mixtral-8x7B-v0.1: Model 7

## Model Description

This model is the seventh standalone model extracted from [mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1), using the [Mixtral Model Expert Extractor tool](https://github.com/MeNicefellow/Mixtral-Model-Expert-Extractor) I made. It is constructed by taking the seventh expert from each Mixture of Experts (MoE) layer and assembling those experts into a standalone dense model. The extraction is experimental, and the resulting model is expected to perform worse than Mistral-7B.

## Model Architecture

The architecture of this model includes:
- Multi-head attention layers carried over from the base Mixtral model.
- The seventh expert from each MoE layer, used as that layer's dense feed-forward network (see the sketch below).
- The remaining shared components (embeddings, normalization layers, and the language-model head), with the MoE routers dropped so the model runs outside the MoE framework.
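
For intuition, here is a minimal sketch of what this kind of expert extraction looks like, assuming the Hugging Face `transformers` implementations of Mixtral and Mistral. It illustrates the idea rather than reproducing the extractor tool's actual code; the expert index, the derived config fields, and the output path are assumptions for this example.

```python
import torch
from transformers import AutoModelForCausalLM, MistralConfig, MistralForCausalLM

EXPERT_INDEX = 6  # assumption: zero-based index of the seventh expert

# Load the MoE source model; holding it in memory requires a lot of RAM.
moe = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x7B-v0.1", torch_dtype=torch.bfloat16
)

# Build an empty dense Mistral skeleton with matching dimensions.
cfg = moe.config
dense = MistralForCausalLM(MistralConfig(
    vocab_size=cfg.vocab_size,
    hidden_size=cfg.hidden_size,
    intermediate_size=cfg.intermediate_size,
    num_hidden_layers=cfg.num_hidden_layers,
    num_attention_heads=cfg.num_attention_heads,
    num_key_value_heads=cfg.num_key_value_heads,
    max_position_embeddings=cfg.max_position_embeddings,
    rms_norm_eps=cfg.rms_norm_eps,
    rope_theta=cfg.rope_theta,
))

with torch.no_grad():
    for moe_layer, dense_layer in zip(moe.model.layers, dense.model.layers):
        expert = moe_layer.block_sparse_moe.experts[EXPERT_INDEX]
        # One expert's feed-forward weights become the dense MLP:
        # Mixtral's w1/w3/w2 play the roles of gate/up/down projections.
        dense_layer.mlp.gate_proj.weight.copy_(expert.w1.weight)
        dense_layer.mlp.up_proj.weight.copy_(expert.w3.weight)
        dense_layer.mlp.down_proj.weight.copy_(expert.w2.weight)
        # Attention and layer norms are shared across experts, so they
        # carry over unchanged (the MoE router is simply discarded).
        dense_layer.self_attn.load_state_dict(moe_layer.self_attn.state_dict())
        dense_layer.input_layernorm.load_state_dict(
            moe_layer.input_layernorm.state_dict()
        )
        dense_layer.post_attention_layernorm.load_state_dict(
            moe_layer.post_attention_layernorm.state_dict()
        )
    # Embeddings, the final norm, and the LM head are shared as well.
    dense.model.embed_tokens.load_state_dict(moe.model.embed_tokens.state_dict())
    dense.model.norm.load_state_dict(moe.model.norm.state_dict())
    dense.lm_head.load_state_dict(moe.lm_head.state_dict())

dense.save_pretrained("Mistral-7-from-Mixtral-8x7B-v0.1")  # assumed output path
```

In `transformers`, each Mixtral expert's `w1`/`w3`/`w2` projections correspond to Mistral's `gate_proj`/`up_proj`/`down_proj`, which is what makes this one-to-one weight copy possible.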

### Example

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "DrNicefellow/Mistral-7-from-Mixtral-8x7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

text = "Today is a pleasant"
input_ids = tokenizer.encode(text, return_tensors='pt')
# Greedy decoding by default; cap the number of generated tokens.
output = model.generate(input_ids, max_new_tokens=50)

print(tokenizer.decode(output[0], skip_special_tokens=True))
```

## License

This model is open-sourced under the Apache 2.0 License. See the LICENSE file for more details.

## Discord Server

Join our Discord server [here](https://discord.gg/xhcBDEM3).