
tf-tpu/roberta-base-epochs-100

This model is a fine-tuned version of roberta-base on an unknown dataset. It achieves the following results at the end of training:

  • Train Loss: 1.0414
  • Train Accuracy: 0.1136
  • Validation Loss: 1.0103
  • Validation Accuracy: 0.1144
  • Epoch: 99

Model description

More information needed

Intended uses & limitations

More information needed
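
Pending more details from the authors, here is a minimal usage sketch. It assumes the checkpoint carries a TensorFlow masked-language-modeling head (the roberta-base lineage suggests this, but the card does not confirm it), so verify against the repository files before relying on it:

```python
# Minimal usage sketch. Assumption: the checkpoint is a TF
# masked-language model (not confirmed by this card).
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForMaskedLM

repo = "tf-tpu/roberta-base-epochs-100"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = TFAutoModelForMaskedLM.from_pretrained(repo)

inputs = tokenizer("The capital of France is <mask>.", return_tensors="tf")
logits = model(**inputs).logits

# Decode the highest-scoring token at the masked position.
mask_pos = tf.where(inputs["input_ids"][0] == tokenizer.mask_token_id)[0, 0]
predicted_id = int(tf.argmax(logits[0, mask_pos]))
print(tokenizer.decode([predicted_id]))
```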

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an equivalent setup is sketched in code after this list):

  • optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 0.0001, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 0.0001, 'decay_steps': 55765, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'passive_serialization': True}, 'warmup_steps': 2935, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.001}
  • training_precision: mixed_bfloat16
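
The serialized optimizer config above has the shape produced by the standard TF optimizer helper in transformers. The following is a hedged reconstruction: the step counts, rates, and precision policy are copied verbatim from the config, while the `create_optimizer` call itself is an assumption about how the optimizer was built.

```python
import tensorflow as tf
from transformers import create_optimizer

# Matches training_precision above.
tf.keras.mixed_precision.set_global_policy("mixed_bfloat16")

# AdamWeightDecay with 2935 linear warmup steps into a linear
# (power=1.0) decay from 1e-4 to 0.0 over 55765 total steps,
# weight decay rate 0.001 -- all values from the config above.
optimizer, lr_schedule = create_optimizer(
    init_lr=1e-4,
    num_train_steps=55765,
    num_warmup_steps=2935,
    weight_decay_rate=0.001,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    power=1.0,
)
```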

Training results

| Train Loss | Train Accuracy | Validation Loss | Validation Accuracy | Epoch |
|:----------:|:--------------:|:---------------:|:-------------------:|:-----:|
| 7.2121 | 0.0274 | 5.7188 | 0.0346 | 0 |
| 5.4335 | 0.0414 | 5.2266 | 0.0439 | 1 |
| 5.1579 | 0.0445 | 5.0625 | 0.0441 | 2 |
| 5.0231 | 0.0447 | 4.9453 | 0.0446 | 3 |
| 4.9323 | 0.0448 | 4.8633 | 0.0443 | 4 |
| 4.8672 | 0.0449 | 4.8789 | 0.0440 | 5 |
| 4.8200 | 0.0449 | 4.8164 | 0.0441 | 6 |
| 4.7841 | 0.0449 | 4.7734 | 0.0450 | 7 |
| 4.7546 | 0.0449 | 4.7539 | 0.0441 | 8 |
| 4.7288 | 0.0449 | 4.7305 | 0.0447 | 9 |
| 4.7084 | 0.0449 | 4.7422 | 0.0443 | 10 |
| 4.6884 | 0.0450 | 4.7148 | 0.0437 | 11 |
| 4.6764 | 0.0449 | 4.7070 | 0.0441 | 12 |
| 4.6637 | 0.0449 | 4.7227 | 0.0435 | 13 |
| 4.5963 | 0.0449 | 4.5195 | 0.0444 | 14 |
| 4.3462 | 0.0468 | 4.0742 | 0.0515 | 15 |
| 3.4139 | 0.0650 | 2.6348 | 0.0797 | 16 |
| 2.5336 | 0.0817 | 2.1816 | 0.0888 | 17 |
| 2.1859 | 0.0888 | 1.9648 | 0.0930 | 18 |
| 2.0043 | 0.0925 | 1.8154 | 0.0961 | 19 |
| 1.8887 | 0.0948 | 1.7129 | 0.0993 | 20 |
| 1.8058 | 0.0965 | 1.6729 | 0.0996 | 21 |
| 1.7402 | 0.0979 | 1.6191 | 0.1010 | 22 |
| 1.6861 | 0.0990 | 1.5693 | 0.1024 | 23 |
| 1.6327 | 0.1001 | 1.5273 | 0.1035 | 24 |
| 1.5906 | 0.1010 | 1.4766 | 0.1042 | 25 |
| 1.5545 | 0.1018 | 1.4561 | 0.1031 | 26 |
| 1.5231 | 0.1024 | 1.4365 | 0.1054 | 27 |
| 1.4957 | 0.1030 | 1.3975 | 0.1046 | 28 |
| 1.4700 | 0.1036 | 1.3789 | 0.1061 | 29 |
| 1.4466 | 0.1041 | 1.3262 | 0.1070 | 30 |
| 1.4253 | 0.1046 | 1.3223 | 0.1072 | 31 |
| 1.4059 | 0.1050 | 1.3096 | 0.1070 | 32 |
| 1.3873 | 0.1054 | 1.3164 | 0.1072 | 33 |
| 1.3703 | 0.1058 | 1.2861 | 0.1072 | 34 |
| 1.3550 | 0.1062 | 1.2705 | 0.1082 | 35 |
| 1.3398 | 0.1065 | 1.2578 | 0.1082 | 36 |
| 1.3260 | 0.1068 | 1.2500 | 0.1096 | 37 |
| 1.3127 | 0.1071 | 1.2266 | 0.1102 | 38 |
| 1.2996 | 0.1074 | 1.2305 | 0.1098 | 39 |
| 1.2891 | 0.1077 | 1.2139 | 0.1088 | 40 |
| 1.2783 | 0.1079 | 1.2158 | 0.1093 | 41 |
| 1.2674 | 0.1081 | 1.1787 | 0.1114 | 42 |
| 1.2570 | 0.1084 | 1.1709 | 0.1107 | 43 |
| 1.2478 | 0.1086 | 1.1709 | 0.1104 | 44 |
| 1.2390 | 0.1088 | 1.1777 | 0.1101 | 45 |
| 1.2305 | 0.1090 | 1.1738 | 0.1111 | 46 |
| 1.2215 | 0.1092 | 1.1533 | 0.1112 | 47 |
| 1.2140 | 0.1094 | 1.1514 | 0.1117 | 48 |
| 1.2068 | 0.1096 | 1.1621 | 0.1119 | 49 |
| 1.1991 | 0.1097 | 1.1416 | 0.1108 | 50 |
| 1.1927 | 0.1099 | 1.1279 | 0.1113 | 51 |
| 1.1854 | 0.1101 | 1.1147 | 0.1123 | 52 |
| 1.1800 | 0.1102 | 1.1250 | 0.1116 | 53 |
| 1.1727 | 0.1104 | 1.1167 | 0.1116 | 54 |
| 1.1679 | 0.1105 | 1.0884 | 0.1122 | 55 |
| 1.1613 | 0.1106 | 1.1084 | 0.1120 | 56 |
| 1.1563 | 0.1107 | 1.1035 | 0.1119 | 57 |
| 1.1517 | 0.1109 | 1.1035 | 0.1124 | 58 |
| 1.1454 | 0.1111 | 1.0718 | 0.1128 | 59 |
| 1.1403 | 0.1111 | 1.0874 | 0.1123 | 60 |
| 1.1360 | 0.1112 | 1.0742 | 0.1145 | 61 |
| 1.1318 | 0.1114 | 1.0811 | 0.1131 | 62 |
| 1.1277 | 0.1114 | 1.0723 | 0.1129 | 63 |
| 1.1226 | 0.1116 | 1.0640 | 0.1124 | 64 |
| 1.1186 | 0.1117 | 1.0840 | 0.1117 | 65 |
| 1.1144 | 0.1118 | 1.0522 | 0.1139 | 66 |
| 1.1111 | 0.1119 | 1.0557 | 0.1132 | 67 |
| 1.1069 | 0.1119 | 1.0718 | 0.1124 | 68 |
| 1.1038 | 0.1120 | 1.0376 | 0.1135 | 69 |
| 1.1007 | 0.1121 | 1.0537 | 0.1138 | 70 |
| 1.0975 | 0.1121 | 1.0503 | 0.1134 | 71 |
| 1.0941 | 0.1122 | 1.0317 | 0.1140 | 72 |
| 1.0902 | 0.1124 | 1.0439 | 0.1145 | 73 |
| 1.0881 | 0.1124 | 1.0352 | 0.1145 | 74 |
| 1.0839 | 0.1125 | 1.0449 | 0.1144 | 75 |
| 1.0821 | 0.1125 | 1.0229 | 0.1148 | 76 |
| 1.0791 | 0.1126 | 1.0244 | 0.1148 | 77 |
| 1.0764 | 0.1127 | 1.0366 | 0.1141 | 78 |
| 1.0741 | 0.1128 | 1.0308 | 0.1134 | 79 |
| 1.0716 | 0.1128 | 1.0400 | 0.1137 | 80 |
| 1.0688 | 0.1129 | 1.0225 | 0.1140 | 81 |
| 1.0664 | 0.1129 | 1.0269 | 0.1139 | 82 |
| 1.0643 | 0.1129 | 1.0156 | 0.1146 | 83 |
| 1.0629 | 0.1131 | 1.0127 | 0.1149 | 84 |
| 1.0602 | 0.1131 | 1.0420 | 0.1132 | 85 |
| 1.0580 | 0.1132 | 1.0205 | 0.1149 | 86 |
| 1.0568 | 0.1132 | 1.0024 | 0.1159 | 87 |
| 1.0547 | 0.1132 | 1.0210 | 0.1144 | 88 |
| 1.0536 | 0.1133 | 1.0176 | 0.1143 | 89 |
| 1.0522 | 0.1133 | 0.9951 | 0.1134 | 90 |
| 1.0505 | 0.1134 | 1.0283 | 0.1136 | 91 |
| 1.0484 | 0.1134 | 1.0063 | 0.1141 | 92 |
| 1.0482 | 0.1134 | 0.9917 | 0.1141 | 93 |
| 1.0463 | 0.1135 | 1.0244 | 0.1145 | 94 |
| 1.0458 | 0.1134 | 1.0220 | 0.1143 | 95 |
| 1.0448 | 0.1135 | 0.9785 | 0.1147 | 96 |
| 1.0435 | 0.1135 | 0.9771 | 0.1155 | 97 |
| 1.0433 | 0.1135 | 0.9946 | 0.1137 | 98 |
| 1.0414 | 0.1136 | 1.0103 | 0.1144 | 99 |

Framework versions

  • Transformers 4.27.0.dev0
  • TensorFlow 2.9.1
  • Tokenizers 0.13.2