Commit e0b0a27 by WHATEVER420 (parent: 21f45da)

🧿 new generation
README.md CHANGED

@@ -5,11 +5,11 @@ tags:
 - trl
 - sft
 - generated_from_trainer
-base_model: cognitivecomputations/dolphin-2.2.1-mistral-7b
 datasets:
 - generator
+base_model: cognitivecomputations/dolphin-2.2.1-mistral-7b
 model-index:
-- name:
+- name: chain-texts-0.1-dolphin-mixtral-8x7b
   results: []
 ---
 
@@ -19,42 +19,20 @@ should probably proofread and complete it, then remove this comment. -->
 # chain-texts-0.1-dolphin-mixtral-8x7b
 
 This model is a fine-tuned version of [cognitivecomputations/dolphin-2.2.1-mistral-7b](https://huggingface.co/cognitivecomputations/dolphin-2.2.1-mistral-7b) on the generator dataset.
+It achieves the following results on the evaluation set:
+- Loss: 1.6571
 
 ## Model description
 
-- **Funded by [optional]:** Matt Owen
-- **Shared by [optional]:** Matt Owen
-- **Model type:** Sparse Mixture-of-Experts (SMoE)
-- **Language(s) (NLP):** English
-- **License:** The Unlicense
-- **Finetuned from model [optional]:** Dolphin 2.6 Mixtral 8x7b
-
-## Intended Uses & Limitations
-
-Easy your day-to-day workload by:
-* Generate horny chain text message threads for any holiday
-* Other things
-
-### Direct Use
-
-This can be used directly to make semi-spot-on, humorous, risqué chain text messages.
-
-## Bias, Risks, and Limitations
-
-Source data was compiled from message boards, and - as a result - carries all the biases of anonymous internet users.
-
-### Recommendations
-
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+More information needed
+
+## Intended uses & limitations
+
+More information needed
 
 ## Training and evaluation data
 
+More information needed
 
 ## Training procedure
 
@@ -70,10 +48,31 @@ The following hyperparameters were used during training:
 - lr_scheduler_warmup_steps: 0.03
 - num_epochs: 3
 
+### Training results
+
+| Training Loss | Epoch  | Step | Validation Loss |
+|:-------------:|:------:|:----:|:---------------:|
+| 1.8452        | 0.1887 | 20   | 1.8520          |
+| 1.6519        | 0.3774 | 40   | 1.7660          |
+| 1.6726        | 0.5660 | 60   | 1.7475          |
+| 1.6545        | 0.7547 | 80   | 1.7325          |
+| 1.7688        | 0.9434 | 100  | 1.7146          |
+| 1.7037        | 1.1321 | 120  | 1.7112          |
+| 1.5269        | 1.3208 | 140  | 1.6965          |
+| 1.4638        | 1.5094 | 160  | 1.6875          |
+| 1.647         | 1.6981 | 180  | 1.6847          |
+| 1.5333        | 1.8868 | 200  | 1.6772          |
+| 1.5194        | 2.0755 | 220  | 1.6854          |
+| 1.5149        | 2.2642 | 240  | 1.6847          |
+| 1.3981        | 2.4528 | 260  | 1.6653          |
+| 1.4842        | 2.6415 | 280  | 1.6612          |
+| 1.4262        | 2.8302 | 300  | 1.6571          |
+
 ### Framework versions
 
 - PEFT 0.10.0
 - Transformers 4.40.1
 - Pytorch 2.3.0+cu121
 - Datasets 2.19.0
 - Tokenizers 0.19.1
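The trl/sft tags and the hyperparameters above describe a standard TRL supervised fine-tuning run. Below is a minimal sketch of such a run, not the author's actual script: the dataset file and text column are assumptions, and the card's `lr_scheduler_warmup_steps: 0.03` is read here as a warmup ratio, since 0.03 is not a plausible step count. Only the base model, `num_epochs`, and the q_proj/v_proj targets (from adapter_config.json below) come from this commit.

```python
# Sketch of an SFT run matching this card (TRL ~0.8 era, per Transformers 4.40.1).
from datasets import load_dataset
from peft import LoraConfig
from transformers import TrainingArguments
from trl import SFTTrainer

# Assumption: the "generator" dataset is a local JSONL file with a "text" column.
dataset = load_dataset("json", data_files="chain_texts.jsonl", split="train")

args = TrainingArguments(
    output_dir="chain-texts-0.1-dolphin-mixtral-8x7b",
    num_train_epochs=3,   # from the card
    warmup_ratio=0.03,    # the card's "lr_scheduler_warmup_steps: 0.03" read as a ratio
)

trainer = SFTTrainer(
    model="cognitivecomputations/dolphin-2.2.1-mistral-7b",  # base model from the card
    args=args,
    train_dataset=dataset,
    dataset_text_field="text",  # assumption about the column name
    peft_config=LoraConfig(target_modules=["v_proj", "q_proj"], task_type="CAUSAL_LM"),
)
trainer.train()
```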
adapter_config.json CHANGED

@@ -20,8 +20,8 @@
     "rank_pattern": {},
     "revision": null,
     "target_modules": [
-        "
-        "
+        "v_proj",
+        "q_proj"
     ],
     "task_type": "CAUSAL_LM",
     "use_dora": false,
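For reference, a PEFT `LoraConfig` that would serialize to the fields visible in this hunk. Only `target_modules`, `task_type`, and `use_dora` are confirmed by the diff; `r` and `lora_alpha` are illustrative placeholders, since the hunk does not show them.

```python
from peft import LoraConfig

config = LoraConfig(
    r=8,                                  # assumption: rank not visible in the hunk
    lora_alpha=16,                        # assumption
    target_modules=["v_proj", "q_proj"],  # as set in this commit
    task_type="CAUSAL_LM",
    use_dora=False,                       # requires PEFT >= 0.10.0, matching the card
)
config.save_pretrained("adapter_out")  # writes an adapter_config.json like the one above
```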
adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size 
+oid sha256:e44ce263e6fd885f50d82ca515b9325375b43ee36ededb75acf161ce88bc2e41
+size 48
runs/Apr25_20-35-42_2646f83db18e/events.out.tfevents.1714077351.2646f83db18e.1703.0 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:28569b772aeee4b13b731b7cca5b9999add2c6eb3fa0debdfe4a5668e6c38085
+size 16119
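The added tfevents file holds the TensorBoard scalars behind the training-results table above. One way to read them back, assuming the real file has been fetched with `git lfs pull`; the exact scalar tag names are a guess:

```python
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

# Point the accumulator at the run directory containing the tfevents file.
acc = EventAccumulator("runs/Apr25_20-35-42_2646f83db18e")
acc.Reload()
print(acc.Tags()["scalars"])         # lists the logged scalar tags
for ev in acc.Scalars("eval/loss"):  # "eval/loss" is an assumed tag name
    print(ev.step, ev.value)
```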
training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size 
+oid sha256:ac11adeed40f57e5d169dcfd5f36f3008d186b906d481ae7e38aa9fc40cbe857
+size 5048
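To use the artifacts from this commit, the adapter can be attached to the base model with PEFT. A minimal sketch; the adapter repo id is inferred from this page and may differ.

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "cognitivecomputations/dolphin-2.2.1-mistral-7b"
adapter_id = "WHATEVER420/chain-texts-0.1-dolphin-mixtral-8x7b"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)  # applies the q_proj/v_proj LoRA weights

prompt = "Write a holiday chain text."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```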