vuiseng9 committed on
Commit
68eebd9
1 Parent(s): 933d737

update model card README.md

Files changed (1):
1. README.md (+90 −0)
README.md ADDED
@@ -0,0 +1,90 @@
---
tags:
- generated_from_trainer
datasets:
- glue
metrics:
- accuracy
- f1
model-index:
- name: baseline-ft-mrpc-IRoberta-b-8bit
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: glue
      type: glue
      config: mrpc
      split: validation
      args: mrpc
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.8970588235294118
    - name: F1
      type: f1
      value: 0.9257950530035336
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# baseline-ft-mrpc-IRoberta-b-8bit

This model is a fine-tuned version of [vuiseng9/baseline-ft-mrpc-IRoberta-b-unquantized](https://huggingface.co/vuiseng9/baseline-ft-mrpc-IRoberta-b-unquantized) on the GLUE MRPC task.
It achieves the following results on the evaluation set:
- Loss: 0.3871
- Accuracy: 0.8971
- F1: 0.9258
- Combined Score: 0.9114 (the mean of accuracy and F1)
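
As a quick way to exercise the checkpoint, here is a minimal inference sketch. It assumes the checkpoint loads through the standard 🤗 Transformers auto classes like any hub model; the repo id is taken from this card's title, and the sentence pair is illustrative only:

```python
# Minimal sketch: load the checkpoint and score one MRPC-style sentence pair.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "vuiseng9/baseline-ft-mrpc-IRoberta-b-8bit"  # repo id from this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# MRPC is a sentence-pair paraphrase task, so both sentences are encoded together.
inputs = tokenizer(
    "The company said quarterly profits rose 10 percent.",
    "Quarterly profits increased by ten percent, the company said.",
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # 1 = paraphrase, 0 = not, per GLUE/MRPC labels
```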

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the `TrainingArguments` sketch after the list):
- learning_rate: 5e-07
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 12.0
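
A sketch of how these settings map onto `transformers.TrainingArguments` (4.30.x API). The output directory and per-epoch evaluation cadence are assumptions, since the card only lists the values above:

```python
# Sketch: the listed hyperparameters expressed as TrainingArguments.
# output_dir and evaluation_strategy are assumed, not stated on the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="baseline-ft-mrpc-IRoberta-b-8bit",  # hypothetical path
    learning_rate=5e-07,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=12.0,
    evaluation_strategy="epoch",  # assumed: the results table reports per-epoch eval
)
```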

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     | Combined Score |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|:--------------:|
| 0.0021        | 1.0   | 230  | 0.4017          | 0.8848   | 0.9147 | 0.8998         |
| 0.0026        | 2.0   | 460  | 0.4105          | 0.8873   | 0.9173 | 0.9023         |
| 0.0026        | 3.0   | 690  | 0.3707          | 0.8946   | 0.9236 | 0.9091         |
| 0.0037        | 4.0   | 920  | 0.3893          | 0.8946   | 0.9228 | 0.9087         |
| 1.324         | 5.0   | 1150 | 0.3871          | 0.8897   | 0.9204 | 0.9050         |
| 0.0227        | 6.0   | 1380 | 0.3951          | 0.8897   | 0.9201 | 0.9049         |
| 0.0081        | 7.0   | 1610 | 0.3818          | 0.8824   | 0.9155 | 0.8989         |
| 0.0054        | 8.0   | 1840 | 0.3902          | 0.8873   | 0.9181 | 0.9027         |
| 0.0383        | 9.0   | 2070 | 0.3659          | 0.8922   | 0.9225 | 0.9073         |
| 0.3861        | 10.0  | 2300 | 0.4260          | 0.8652   | 0.9030 | 0.8841         |
| 0.0028        | 11.0  | 2530 | 0.3619          | 0.8946   | 0.9234 | 0.9090         |
| 0.0957        | 12.0  | 2760 | 0.3871          | 0.8971   | 0.9258 | 0.9114         |
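
The metrics follow the standard GLUE/MRPC definitions, and every row is consistent with the combined score being the mean of accuracy and F1 (e.g. (0.8971 + 0.9258) / 2 ≈ 0.9114 for epoch 12). A sketch of the metric computation with the `evaluate` library, using placeholder predictions:

```python
# Sketch of the metric computation; predictions/references are placeholders.
import evaluate

metric = evaluate.load("glue", "mrpc")
scores = metric.compute(predictions=[1, 0, 1, 1], references=[1, 0, 0, 1])
# scores has "accuracy" and "f1"; the card's combined score is their mean.
print(scores, (scores["accuracy"] + scores["f1"]) / 2)
```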

### Framework versions

- Transformers 4.30.2
- Pytorch 2.0.1+cu118
- Datasets 2.11.0
- Tokenizers 0.13.3
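
To approximate this environment, a requirements pin along these lines should match the versions above (the extra index URL for the CUDA 11.8 PyTorch wheel is an assumption about how that build is obtained):

```text
--extra-index-url https://download.pytorch.org/whl/cu118
torch==2.0.1+cu118
transformers==4.30.2
datasets==2.11.0
tokenizers==0.13.3
```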