WHATEVER420 committed on
Commit e0b0a27 (1 parent: 21f45da)

🧿 new generation

README.md CHANGED
@@ -5,11 +5,11 @@ tags:
 - trl
 - sft
 - generated_from_trainer
-base_model: cognitivecomputations/dolphin-2.2.1-mistral-7b
 datasets:
 - generator
+base_model: cognitivecomputations/dolphin-2.2.1-mistral-7b
 model-index:
-- name: mistral_instruct_generation
+- name: chain-texts-0.1-dolphin-mixtral-8x7b
   results: []
 ---

@@ -19,42 +19,20 @@ should probably proofread and complete it, then remove this comment. -->
 # chain-texts-0.1-dolphin-mixtral-8x7b

 This model is a fine-tuned version of [cognitivecomputations/dolphin-2.2.1-mistral-7b](https://huggingface.co/cognitivecomputations/dolphin-2.2.1-mistral-7b) on the generator dataset.
+It achieves the following results on the evaluation set:
+- Loss: 1.6571

 ## Model description

-- **Developed by:** Matt Owen
-- **Funded by [optional]:** Matt Owen
-- **Shared by [optional]:** Matt Owen
-- **Model type:** Sparse Mixture-of-Experts (SMoE)
-- **Language(s) (NLP):** English
-- **License:** The Unlicense
-- **Finetuned from model [optional]:** Dolphin 2.6 Mixtral 8x7b
-
-## Intended Uses & Limitations
-
-Easy your day-to-day workload by:
-* Generate horny chain text message threads for any holiday
-* Other things
-
-### Direct Use
-
-This can be used directly to make semi-spot-on, humorous, risqué chain text messages.
+More information needed

-### Out-of-Scope Use
+## Intended uses & limitations

-Do not use this model to send unsolicited, creepy messages.
-
-## Bias, Risks, and Limitations
-
-Source data was compiled from message boards, and - as a result - carries all the biases of anonymous internet users.
-
-### Recommendations
-
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+More information needed

 ## Training and evaluation data

-Chain texts scraped from the world wide web.
+More information needed

 ## Training procedure

@@ -70,10 +48,31 @@ The following hyperparameters were used during training:
 - lr_scheduler_warmup_steps: 0.03
 - num_epochs: 3

+### Training results
+
+| Training Loss | Epoch  | Step | Validation Loss |
+|:-------------:|:------:|:----:|:---------------:|
+| 1.8452        | 0.1887 | 20   | 1.8520          |
+| 1.6519        | 0.3774 | 40   | 1.7660          |
+| 1.6726        | 0.5660 | 60   | 1.7475          |
+| 1.6545        | 0.7547 | 80   | 1.7325          |
+| 1.7688        | 0.9434 | 100  | 1.7146          |
+| 1.7037        | 1.1321 | 120  | 1.7112          |
+| 1.5269        | 1.3208 | 140  | 1.6965          |
+| 1.4638        | 1.5094 | 160  | 1.6875          |
+| 1.647         | 1.6981 | 180  | 1.6847          |
+| 1.5333        | 1.8868 | 200  | 1.6772          |
+| 1.5194        | 2.0755 | 220  | 1.6854          |
+| 1.5149        | 2.2642 | 240  | 1.6847          |
+| 1.3981        | 2.4528 | 260  | 1.6653          |
+| 1.4842        | 2.6415 | 280  | 1.6612          |
+| 1.4262        | 2.8302 | 300  | 1.6571          |
+
+
 ### Framework versions

 - PEFT 0.10.0
 - Transformers 4.40.1
 - Pytorch 2.3.0+cu121
 - Datasets 2.19.0
-- Tokenizers 0.19.1
+- Tokenizers 0.19.1
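The card's tags (trl, sft, generated_from_trainer) and the PEFT/TRL-era framework versions point to a LoRA-based supervised fine-tuning run. The sketch below is a hedged reconstruction of such a setup, not the author's script: the data file, batch size, learning rate, LoRA rank, and sequence length are assumptions; only the base model, num_epochs, the 0.03 warmup value (treated here as a ratio), and the q_proj/v_proj target modules come from this commit.

```python
# Hedged sketch of a LoRA SFT run consistent with this card (TRL ~0.8, Transformers 4.40).
# Values marked "assumed" are illustrative and are not recorded in the commit.
from datasets import load_dataset
from peft import LoraConfig
from transformers import TrainingArguments
from trl import SFTTrainer

train_ds = load_dataset("json", data_files="chain_texts.jsonl", split="train")  # hypothetical data file

peft_config = LoraConfig(
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj"],     # matches adapter_config.json in this commit
    r=16, lora_alpha=32, lora_dropout=0.05,  # assumed
)

args = TrainingArguments(
    output_dir="chain-texts-0.1-dolphin-mixtral-8x7b",
    num_train_epochs=3,              # from the card
    warmup_ratio=0.03,               # card lists "lr_scheduler_warmup_steps: 0.03"
    per_device_train_batch_size=4,   # assumed
    learning_rate=2e-4,              # assumed
    logging_steps=20,                # the results table reports every 20 steps
    report_to="tensorboard",         # consistent with the runs/ event file added below
)

trainer = SFTTrainer(
    model="cognitivecomputations/dolphin-2.2.1-mistral-7b",  # base model from the card
    args=args,
    train_dataset=train_ds,
    peft_config=peft_config,
    dataset_text_field="text",  # assumed column name
    max_seq_length=2048,        # assumed
)
trainer.train()
```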
adapter_config.json CHANGED
@@ -20,8 +20,8 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "q_proj",
-    "v_proj"
+    "v_proj",
+    "q_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_dora": false,
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3e93218612a35c41f3f154c7607a8f27c50bfcf507bc1c115991ba8d602604af
-size 109069176
+oid sha256:e44ce263e6fd885f50d82ca515b9325375b43ee36ededb75acf161ce88bc2e41
+size 48
runs/Apr25_20-35-42_2646f83db18e/events.out.tfevents.1714077351.2646f83db18e.1703.0 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:28569b772aeee4b13b731b7cca5b9999add2c6eb3fa0debdfe4a5668e6c38085
+size 16119
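The added file under runs/ is a TensorBoard event log stored as a Git LFS pointer. Once downloaded, its scalars can be inspected with a sketch like this (the tag names, e.g. "eval/loss", are assumptions about what Trainer logged):

```python
# Hedged sketch: read the scalar metrics from the Trainer's TensorBoard event file.
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

path = "runs/Apr25_20-35-42_2646f83db18e/events.out.tfevents.1714077351.2646f83db18e.1703.0"
acc = EventAccumulator(path)
acc.Reload()

print(acc.Tags()["scalars"])            # list the scalar tags actually logged
for event in acc.Scalars("eval/loss"):  # "eval/loss" is an assumed tag name
    print(event.step, event.value)
```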
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b4de7e0bc9eb6478f2441b556f02b1149fd84c0ae240ec8ac1e66d6966fdb56f
-size 4984
+oid sha256:ac11adeed40f57e5d169dcfd5f36f3008d186b906d481ae7e38aa9fc40cbe857
+size 5048
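training_args.bin is the pickled TrainingArguments object that Trainer saves next to the run; the new hash and size simply reflect the updated configuration. A small sketch for inspecting it locally (it is a pickle, so only load it from a repository you trust):

```python
# Hedged sketch: inspect the pickled TrainingArguments saved by Trainer.
import torch

args = torch.load("training_args.bin", weights_only=False)  # pickle; trust the source first
print(type(args).__name__)  # typically TrainingArguments
print(args.num_train_epochs, args.learning_rate, args.warmup_steps)
```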