|
2023-10-27 20:07:01,765 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 20:07:01,767 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): XLMRobertaModel( |
|
(embeddings): XLMRobertaEmbeddings( |
|
(word_embeddings): Embedding(250003, 1024) |
|
(position_embeddings): Embedding(514, 1024, padding_idx=1) |
|
(token_type_embeddings): Embedding(1, 1024) |
|
(LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): XLMRobertaEncoder( |
|
(layer): ModuleList( |
|
(0-23): 24 x XLMRobertaLayer( |
|
(attention): XLMRobertaAttention( |
|
(self): XLMRobertaSelfAttention( |
|
(query): Linear(in_features=1024, out_features=1024, bias=True) |
|
(key): Linear(in_features=1024, out_features=1024, bias=True) |
|
(value): Linear(in_features=1024, out_features=1024, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): XLMRobertaSelfOutput( |
|
(dense): Linear(in_features=1024, out_features=1024, bias=True) |
|
(LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): XLMRobertaIntermediate( |
|
(dense): Linear(in_features=1024, out_features=4096, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): XLMRobertaOutput( |
|
(dense): Linear(in_features=4096, out_features=1024, bias=True) |
|
(LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): XLMRobertaPooler( |
|
(dense): Linear(in_features=1024, out_features=1024, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=1024, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-27 20:07:01,767 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 20:07:01,767 Corpus: 14903 train + 3449 dev + 3658 test sentences |
|
2023-10-27 20:07:01,767 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 20:07:01,767 Train: 14903 sentences |
|
2023-10-27 20:07:01,767 (train_with_dev=False, train_with_test=False) |
|
2023-10-27 20:07:01,767 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 20:07:01,767 Training Params: |
|
2023-10-27 20:07:01,767 - learning_rate: "5e-06" |
|
2023-10-27 20:07:01,767 - mini_batch_size: "4" |
|
2023-10-27 20:07:01,767 - max_epochs: "10" |
|
2023-10-27 20:07:01,767 - shuffle: "True" |
|
2023-10-27 20:07:01,767 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 20:07:01,768 Plugins: |
|
2023-10-27 20:07:01,768 - TensorboardLogger |
|
2023-10-27 20:07:01,768 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-27 20:07:01,768 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 20:07:01,768 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-27 20:07:01,768 - metric: "('micro avg', 'f1-score')" |
|
2023-10-27 20:07:01,768 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 20:07:01,768 Computation: |
|
2023-10-27 20:07:01,768 - compute on device: cuda:0 |
|
2023-10-27 20:07:01,768 - embedding storage: none |
|
2023-10-27 20:07:01,768 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 20:07:01,768 Model training base path: "flair-clean-conll-lr5e-06-bs4-5" |
|
2023-10-27 20:07:01,768 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 20:07:01,768 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 20:07:01,768 Logging anything other than scalars to TensorBoard is currently not supported. |
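Note on the `lr` column: the values logged in the iteration lines are consistent with a linear warmup over the first 10% of all steps (`warmup_fraction: '0.1'`) up to the configured `learning_rate` of 5e-06, followed by a linear decay to zero. The sketch below is a minimal reconstruction under that assumption. The function name `linear_warmup_decay_lr` and its parameters are hypothetical, and the code is not taken from Flair's `LinearScheduler` implementation.

```python
# Assumed schedule shape: linear warmup to the peak, then linear decay to zero.
# Names are illustrative only; this is not Flair's own scheduler code.

ITERS_PER_EPOCH = 3726   # from "iter .../3726" in the log
MAX_EPOCHS = 10          # from "max_epochs"
PEAK_LR = 5e-06          # from "learning_rate"
WARMUP_FRACTION = 0.1    # from "LinearScheduler | warmup_fraction"


def linear_warmup_decay_lr(step, total_steps, peak_lr, warmup_fraction):
    """Return the learning rate at a given global step (0-based count of updates)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: scale linearly from peak_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


def lr_at(epoch, iteration):
    """Learning rate at a logged (epoch, iter) position, formatted like the log."""
    total_steps = ITERS_PER_EPOCH * MAX_EPOCHS
    step = (epoch - 1) * ITERS_PER_EPOCH + iteration
    lr = linear_warmup_decay_lr(step, total_steps, PEAK_LR, WARMUP_FRACTION)
    return f"{lr:.6f}"


if __name__ == "__main__":
    for epoch, iteration in [(1, 372), (1, 3720), (2, 3720), (10, 3720)]:
        print(f"epoch {epoch} - iter {iteration} - lr: {lr_at(epoch, iteration)}")
```

Because the log prints `lr` with six decimal places, most of the schedule is hidden by rounding; the sketch can be used to recover the unrounded value at any logged step, under the stated assumption.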
|
2023-10-27 20:07:47,370 epoch 1 - iter 372/3726 - loss 2.80593129 - time (sec): 45.60 - samples/sec: 438.05 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-27 20:08:32,644 epoch 1 - iter 744/3726 - loss 1.85094677 - time (sec): 90.87 - samples/sec: 444.69 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-27 20:09:17,954 epoch 1 - iter 1116/3726 - loss 1.42058830 - time (sec): 136.18 - samples/sec: 444.93 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-27 20:10:03,236 epoch 1 - iter 1488/3726 - loss 1.16393809 - time (sec): 181.47 - samples/sec: 444.38 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-27 20:10:48,524 epoch 1 - iter 1860/3726 - loss 0.98551923 - time (sec): 226.75 - samples/sec: 446.04 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-27 20:11:33,918 epoch 1 - iter 2232/3726 - loss 0.84790959 - time (sec): 272.15 - samples/sec: 450.27 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-27 20:12:19,596 epoch 1 - iter 2604/3726 - loss 0.74598188 - time (sec): 317.83 - samples/sec: 451.02 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-27 20:13:05,071 epoch 1 - iter 2976/3726 - loss 0.66540477 - time (sec): 363.30 - samples/sec: 451.42 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-27 20:13:50,701 epoch 1 - iter 3348/3726 - loss 0.60748783 - time (sec): 408.93 - samples/sec: 449.21 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-27 20:14:35,944 epoch 1 - iter 3720/3726 - loss 0.55800133 - time (sec): 454.17 - samples/sec: 449.55 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-27 20:14:36,662 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 20:14:36,663 EPOCH 1 done: loss 0.5571 - lr: 0.000005 |
|
2023-10-27 20:15:00,976 DEV : loss 0.08272701501846313 - f1-score (micro avg) 0.9305 |
|
2023-10-27 20:15:01,029 saving best model |
|
2023-10-27 20:15:02,837 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 20:15:49,496 epoch 2 - iter 372/3726 - loss 0.09075236 - time (sec): 46.66 - samples/sec: 446.77 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-27 20:16:35,366 epoch 2 - iter 744/3726 - loss 0.09661752 - time (sec): 92.53 - samples/sec: 440.88 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-27 20:17:20,750 epoch 2 - iter 1116/3726 - loss 0.09592189 - time (sec): 137.91 - samples/sec: 442.98 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-27 20:18:05,959 epoch 2 - iter 1488/3726 - loss 0.09322980 - time (sec): 183.12 - samples/sec: 443.25 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-27 20:18:51,330 epoch 2 - iter 1860/3726 - loss 0.08941354 - time (sec): 228.49 - samples/sec: 443.63 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-27 20:19:36,865 epoch 2 - iter 2232/3726 - loss 0.08782526 - time (sec): 274.03 - samples/sec: 444.12 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-27 20:20:21,760 epoch 2 - iter 2604/3726 - loss 0.08763755 - time (sec): 318.92 - samples/sec: 447.07 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-27 20:21:07,307 epoch 2 - iter 2976/3726 - loss 0.08459677 - time (sec): 364.47 - samples/sec: 449.80 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-27 20:21:52,612 epoch 2 - iter 3348/3726 - loss 0.08200724 - time (sec): 409.77 - samples/sec: 449.39 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-27 20:22:37,887 epoch 2 - iter 3720/3726 - loss 0.08090953 - time (sec): 455.05 - samples/sec: 448.96 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-27 20:22:38,602 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 20:22:38,603 EPOCH 2 done: loss 0.0808 - lr: 0.000004 |
|
2023-10-27 20:23:01,816 DEV : loss 0.0558977946639061 - f1-score (micro avg) 0.9643 |
|
2023-10-27 20:23:01,871 saving best model |
|
2023-10-27 20:23:04,562 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 20:23:50,075 epoch 3 - iter 372/3726 - loss 0.04925132 - time (sec): 45.51 - samples/sec: 436.29 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-27 20:24:35,505 epoch 3 - iter 744/3726 - loss 0.05096433 - time (sec): 90.94 - samples/sec: 441.79 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-27 20:25:21,376 epoch 3 - iter 1116/3726 - loss 0.05345821 - time (sec): 136.81 - samples/sec: 444.14 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-27 20:26:06,796 epoch 3 - iter 1488/3726 - loss 0.05364040 - time (sec): 182.23 - samples/sec: 444.47 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-27 20:26:52,221 epoch 3 - iter 1860/3726 - loss 0.05380637 - time (sec): 227.66 - samples/sec: 447.62 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-27 20:27:37,903 epoch 3 - iter 2232/3726 - loss 0.05332169 - time (sec): 273.34 - samples/sec: 448.66 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-27 20:28:24,443 epoch 3 - iter 2604/3726 - loss 0.05365144 - time (sec): 319.88 - samples/sec: 446.15 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-27 20:29:09,620 epoch 3 - iter 2976/3726 - loss 0.05262140 - time (sec): 365.06 - samples/sec: 447.36 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-27 20:29:55,368 epoch 3 - iter 3348/3726 - loss 0.05205947 - time (sec): 410.80 - samples/sec: 448.23 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-27 20:30:40,872 epoch 3 - iter 3720/3726 - loss 0.05122877 - time (sec): 456.31 - samples/sec: 447.79 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-27 20:30:41,552 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 20:30:41,553 EPOCH 3 done: loss 0.0512 - lr: 0.000004 |
|
2023-10-27 20:31:04,451 DEV : loss 0.04910625144839287 - f1-score (micro avg) 0.969 |
|
2023-10-27 20:31:04,505 saving best model |
|
2023-10-27 20:31:07,070 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 20:31:52,530 epoch 4 - iter 372/3726 - loss 0.03342383 - time (sec): 45.46 - samples/sec: 455.54 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-27 20:32:37,688 epoch 4 - iter 744/3726 - loss 0.03205166 - time (sec): 90.62 - samples/sec: 456.58 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-27 20:33:24,378 epoch 4 - iter 1116/3726 - loss 0.03357460 - time (sec): 137.31 - samples/sec: 451.87 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-27 20:34:10,145 epoch 4 - iter 1488/3726 - loss 0.03617981 - time (sec): 183.07 - samples/sec: 454.34 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-27 20:34:56,203 epoch 4 - iter 1860/3726 - loss 0.03561724 - time (sec): 229.13 - samples/sec: 450.05 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-27 20:35:42,349 epoch 4 - iter 2232/3726 - loss 0.03513961 - time (sec): 275.28 - samples/sec: 447.95 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-27 20:36:27,872 epoch 4 - iter 2604/3726 - loss 0.03560568 - time (sec): 320.80 - samples/sec: 446.14 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-27 20:37:13,373 epoch 4 - iter 2976/3726 - loss 0.03572121 - time (sec): 366.30 - samples/sec: 445.85 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-27 20:37:59,045 epoch 4 - iter 3348/3726 - loss 0.03483182 - time (sec): 411.97 - samples/sec: 446.11 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-27 20:38:44,300 epoch 4 - iter 3720/3726 - loss 0.03483957 - time (sec): 457.23 - samples/sec: 447.06 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-27 20:38:44,988 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 20:38:44,988 EPOCH 4 done: loss 0.0348 - lr: 0.000003 |
|
2023-10-27 20:39:07,891 DEV : loss 0.04652674123644829 - f1-score (micro avg) 0.9705 |
|
2023-10-27 20:39:07,943 saving best model |
|
2023-10-27 20:39:10,583 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 20:39:56,189 epoch 5 - iter 372/3726 - loss 0.03173966 - time (sec): 45.60 - samples/sec: 442.22 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-27 20:40:42,588 epoch 5 - iter 744/3726 - loss 0.03324355 - time (sec): 92.00 - samples/sec: 441.81 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-27 20:41:28,491 epoch 5 - iter 1116/3726 - loss 0.03176114 - time (sec): 137.91 - samples/sec: 446.35 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-27 20:42:14,724 epoch 5 - iter 1488/3726 - loss 0.02967819 - time (sec): 184.14 - samples/sec: 446.19 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-27 20:42:59,527 epoch 5 - iter 1860/3726 - loss 0.03079490 - time (sec): 228.94 - samples/sec: 446.65 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-27 20:43:44,886 epoch 5 - iter 2232/3726 - loss 0.02966149 - time (sec): 274.30 - samples/sec: 445.62 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-27 20:44:30,379 epoch 5 - iter 2604/3726 - loss 0.03065570 - time (sec): 319.79 - samples/sec: 446.65 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-27 20:45:15,785 epoch 5 - iter 2976/3726 - loss 0.03042225 - time (sec): 365.20 - samples/sec: 446.89 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-27 20:46:01,755 epoch 5 - iter 3348/3726 - loss 0.03020845 - time (sec): 411.17 - samples/sec: 446.33 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-27 20:46:47,177 epoch 5 - iter 3720/3726 - loss 0.02983678 - time (sec): 456.59 - samples/sec: 447.55 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-27 20:46:47,923 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 20:46:47,924 EPOCH 5 done: loss 0.0299 - lr: 0.000003 |
|
2023-10-27 20:47:10,884 DEV : loss 0.050089359283447266 - f1-score (micro avg) 0.9712 |
|
2023-10-27 20:47:10,938 saving best model |
|
2023-10-27 20:47:13,597 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 20:47:59,020 epoch 6 - iter 372/3726 - loss 0.02248010 - time (sec): 45.42 - samples/sec: 453.45 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-27 20:48:44,494 epoch 6 - iter 744/3726 - loss 0.01889729 - time (sec): 90.89 - samples/sec: 449.45 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-27 20:49:30,409 epoch 6 - iter 1116/3726 - loss 0.01902198 - time (sec): 136.81 - samples/sec: 446.61 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-27 20:50:16,297 epoch 6 - iter 1488/3726 - loss 0.02004643 - time (sec): 182.70 - samples/sec: 443.97 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-27 20:51:01,691 epoch 6 - iter 1860/3726 - loss 0.01983142 - time (sec): 228.09 - samples/sec: 444.93 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-27 20:51:47,098 epoch 6 - iter 2232/3726 - loss 0.02028738 - time (sec): 273.50 - samples/sec: 446.24 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-27 20:52:33,330 epoch 6 - iter 2604/3726 - loss 0.02002001 - time (sec): 319.73 - samples/sec: 446.18 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-27 20:53:19,605 epoch 6 - iter 2976/3726 - loss 0.02081204 - time (sec): 366.01 - samples/sec: 445.36 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-27 20:54:05,958 epoch 6 - iter 3348/3726 - loss 0.02032741 - time (sec): 412.36 - samples/sec: 445.83 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-27 20:54:52,761 epoch 6 - iter 3720/3726 - loss 0.02044719 - time (sec): 459.16 - samples/sec: 444.72 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-27 20:54:53,517 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 20:54:53,517 EPOCH 6 done: loss 0.0205 - lr: 0.000002 |
|
2023-10-27 20:55:17,099 DEV : loss 0.04764683172106743 - f1-score (micro avg) 0.9742 |
|
2023-10-27 20:55:17,154 saving best model |
|
2023-10-27 20:55:19,755 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 20:56:05,378 epoch 7 - iter 372/3726 - loss 0.02366929 - time (sec): 45.62 - samples/sec: 447.27 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-27 20:56:51,118 epoch 7 - iter 744/3726 - loss 0.02311710 - time (sec): 91.36 - samples/sec: 439.82 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-27 20:57:36,527 epoch 7 - iter 1116/3726 - loss 0.02129467 - time (sec): 136.77 - samples/sec: 445.39 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-27 20:58:21,642 epoch 7 - iter 1488/3726 - loss 0.02001426 - time (sec): 181.88 - samples/sec: 447.89 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-27 20:59:08,061 epoch 7 - iter 1860/3726 - loss 0.01894813 - time (sec): 228.30 - samples/sec: 445.16 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-27 20:59:53,051 epoch 7 - iter 2232/3726 - loss 0.01829151 - time (sec): 273.29 - samples/sec: 443.22 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-27 21:00:38,919 epoch 7 - iter 2604/3726 - loss 0.01783981 - time (sec): 319.16 - samples/sec: 442.58 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-27 21:01:26,075 epoch 7 - iter 2976/3726 - loss 0.01776618 - time (sec): 366.32 - samples/sec: 442.68 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-27 21:02:13,955 epoch 7 - iter 3348/3726 - loss 0.01772398 - time (sec): 414.20 - samples/sec: 442.24 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-27 21:03:01,418 epoch 7 - iter 3720/3726 - loss 0.01723102 - time (sec): 461.66 - samples/sec: 442.49 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-27 21:03:02,183 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 21:03:02,184 EPOCH 7 done: loss 0.0177 - lr: 0.000002 |
|
2023-10-27 21:03:26,361 DEV : loss 0.05419960245490074 - f1-score (micro avg) 0.9746 |
|
2023-10-27 21:03:26,416 saving best model |
|
2023-10-27 21:03:29,497 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 21:04:16,746 epoch 8 - iter 372/3726 - loss 0.01736122 - time (sec): 47.25 - samples/sec: 425.05 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-27 21:05:04,978 epoch 8 - iter 744/3726 - loss 0.01398385 - time (sec): 95.48 - samples/sec: 422.25 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-27 21:05:52,318 epoch 8 - iter 1116/3726 - loss 0.01274088 - time (sec): 142.82 - samples/sec: 424.69 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-27 21:06:39,260 epoch 8 - iter 1488/3726 - loss 0.01328050 - time (sec): 189.76 - samples/sec: 424.24 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-27 21:07:26,410 epoch 8 - iter 1860/3726 - loss 0.01227844 - time (sec): 236.91 - samples/sec: 427.47 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-27 21:08:13,172 epoch 8 - iter 2232/3726 - loss 0.01171643 - time (sec): 283.67 - samples/sec: 428.79 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-27 21:09:00,982 epoch 8 - iter 2604/3726 - loss 0.01235731 - time (sec): 331.48 - samples/sec: 428.53 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-27 21:09:51,061 epoch 8 - iter 2976/3726 - loss 0.01221098 - time (sec): 381.56 - samples/sec: 426.25 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-27 21:10:41,073 epoch 8 - iter 3348/3726 - loss 0.01227301 - time (sec): 431.57 - samples/sec: 426.32 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-27 21:11:30,225 epoch 8 - iter 3720/3726 - loss 0.01200497 - time (sec): 480.72 - samples/sec: 424.99 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-27 21:11:31,013 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 21:11:31,013 EPOCH 8 done: loss 0.0120 - lr: 0.000001 |
|
2023-10-27 21:11:56,731 DEV : loss 0.05550903454422951 - f1-score (micro avg) 0.9746 |
|
2023-10-27 21:11:56,806 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 21:12:46,264 epoch 9 - iter 372/3726 - loss 0.00563766 - time (sec): 49.46 - samples/sec: 405.90 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-27 21:13:36,129 epoch 9 - iter 744/3726 - loss 0.00454582 - time (sec): 99.32 - samples/sec: 411.65 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-27 21:14:26,227 epoch 9 - iter 1116/3726 - loss 0.00553718 - time (sec): 149.42 - samples/sec: 408.79 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-27 21:15:15,608 epoch 9 - iter 1488/3726 - loss 0.00675128 - time (sec): 198.80 - samples/sec: 409.19 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-27 21:16:05,355 epoch 9 - iter 1860/3726 - loss 0.00722006 - time (sec): 248.55 - samples/sec: 412.39 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-27 21:16:54,643 epoch 9 - iter 2232/3726 - loss 0.00736249 - time (sec): 297.83 - samples/sec: 411.62 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-27 21:17:45,323 epoch 9 - iter 2604/3726 - loss 0.00786494 - time (sec): 348.51 - samples/sec: 410.57 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-27 21:18:35,586 epoch 9 - iter 2976/3726 - loss 0.00784383 - time (sec): 398.78 - samples/sec: 410.83 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-27 21:19:25,103 epoch 9 - iter 3348/3726 - loss 0.00763218 - time (sec): 448.29 - samples/sec: 409.68 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-27 21:20:15,011 epoch 9 - iter 3720/3726 - loss 0.00729659 - time (sec): 498.20 - samples/sec: 409.86 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-27 21:20:15,793 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 21:20:15,793 EPOCH 9 done: loss 0.0073 - lr: 0.000001 |
|
2023-10-27 21:20:41,491 DEV : loss 0.056521423161029816 - f1-score (micro avg) 0.9737 |
|
2023-10-27 21:20:41,559 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 21:21:31,209 epoch 10 - iter 372/3726 - loss 0.00974983 - time (sec): 49.65 - samples/sec: 405.18 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-27 21:22:20,871 epoch 10 - iter 744/3726 - loss 0.00598632 - time (sec): 99.31 - samples/sec: 409.05 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-27 21:23:10,531 epoch 10 - iter 1116/3726 - loss 0.00650431 - time (sec): 148.97 - samples/sec: 415.14 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-27 21:24:00,035 epoch 10 - iter 1488/3726 - loss 0.00622288 - time (sec): 198.47 - samples/sec: 415.10 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-27 21:24:50,574 epoch 10 - iter 1860/3726 - loss 0.00650968 - time (sec): 249.01 - samples/sec: 412.96 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-27 21:25:40,130 epoch 10 - iter 2232/3726 - loss 0.00707901 - time (sec): 298.57 - samples/sec: 412.19 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-27 21:26:30,069 epoch 10 - iter 2604/3726 - loss 0.00707633 - time (sec): 348.51 - samples/sec: 408.38 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-27 21:27:20,314 epoch 10 - iter 2976/3726 - loss 0.00672126 - time (sec): 398.75 - samples/sec: 409.67 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-27 21:28:09,594 epoch 10 - iter 3348/3726 - loss 0.00659107 - time (sec): 448.03 - samples/sec: 408.65 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-27 21:28:58,954 epoch 10 - iter 3720/3726 - loss 0.00650167 - time (sec): 497.39 - samples/sec: 410.70 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-27 21:28:59,752 ---------------------------------------------------------------------------------------------------- |
|
2023-10-27 21:28:59,752 EPOCH 10 done: loss 0.0065 - lr: 0.000000 |
|
2023-10-27 21:29:25,442 DEV : loss 0.05730742961168289 - f1-score (micro avg) 0.9744 |
|
2023-10-27 21:29:28,531 ---------------------------------------------------------------------------------------------------- |
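Note on best-model selection: `saving best model` appears after epochs 1 through 7 but not after epoch 8, where the dev F1 ties the epoch 7 value of 0.9746. That pattern is consistent with saving only on a strict improvement of the dev micro F1. The sketch below illustrates that rule using the dev scores copied from this log; the helper name `epochs_that_save` is hypothetical and the strict comparison is an assumption, not a statement about Flair's internals.

```python
# Dev micro-avg F1 per epoch, copied from the "DEV :" lines of this log.
dev_f1 = [0.9305, 0.9643, 0.969, 0.9705, 0.9712,
          0.9742, 0.9746, 0.9746, 0.9737, 0.9744]


def epochs_that_save(scores):
    """Return the 1-based epochs at which a checkpoint would be saved,
    assuming a save happens only when the score strictly improves."""
    best = float("-inf")
    saved = []
    for epoch, score in enumerate(scores, start=1):
        if score > best:
            best = score
            saved.append(epoch)
    return saved


saved_epochs = epochs_that_save(dev_f1)
best_epoch = saved_epochs[-1]  # the checkpoint that the final evaluation would load

if __name__ == "__main__":
    print("saved after epochs:", saved_epochs)
    print("best epoch:", best_epoch)
```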
|
2023-10-27 21:29:28,534 Loading model from best epoch ... |
|
2023-10-27 21:29:38,713 SequenceTagger predicts: Dictionary with 17 tags: O, S-ORG, B-ORG, E-ORG, I-ORG, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-MISC, B-MISC, E-MISC, I-MISC |
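Note on the tag dictionary: the 17 tags match the `out_features=17` of the final `linear` layer in the model summary. They correspond to a BIOES-style scheme (S = single-token, B = begin, E = end, I = inside) over four entity types, plus `O`. The sketch below rebuilds the list in the order printed above; the function name `build_bioes_tags` is hypothetical and this is only an illustration of the counting, not how Flair constructs its dictionary.

```python
def build_bioes_tags(entity_types, prefixes=("S", "B", "E", "I")):
    """Build a tag list: 'O' followed by every prefix-type combination."""
    tags = ["O"]
    for entity_type in entity_types:
        for prefix in prefixes:
            tags.append(f"{prefix}-{entity_type}")
    return tags


tags = build_bioes_tags(["ORG", "PER", "LOC", "MISC"])

if __name__ == "__main__":
    # 4 entity types x 4 prefixes + 1 outside tag = 17 tags
    print(len(tags))
    print(", ".join(tags))
```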
|
2023-10-27 21:30:03,801 |
|
Results: |
|
- F-score (micro) 0.9699 |
|
- F-score (macro) 0.9647 |
|
- Accuracy 0.9567 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         ORG     0.9662    0.9738    0.9700      1909 |
|
         PER     0.9956    0.9937    0.9947      1591 |
|
         LOC     0.9701    0.9632    0.9666      1413 |
|
        MISC     0.9170    0.9384    0.9276       812 |
|
|
|
   micro avg     0.9682    0.9717    0.9699      5725 |
|
   macro avg     0.9622    0.9673    0.9647      5725 |
|
weighted avg     0.9683    0.9717    0.9700      5725 |
|
|
|
2023-10-27 21:30:03,801 ---------------------------------------------------------------------------------------------------- |
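Note on the averaged scores: the macro and weighted F1 rows can be checked against the per-class rows, and the micro F1 against the micro precision and recall. The sketch below does that arithmetic with the numbers copied from the table above. It assumes the usual definitions (macro = unweighted mean of per-class F1, weighted = support-weighted mean, F1 = harmonic mean of precision and recall); the variable names are illustrative.

```python
# Per-class test results copied from the "By class" table:
# (precision, recall, f1, support)
per_class = {
    "ORG":  (0.9662, 0.9738, 0.9700, 1909),
    "PER":  (0.9956, 0.9937, 0.9947, 1591),
    "LOC":  (0.9701, 0.9632, 0.9666, 1413),
    "MISC": (0.9170, 0.9384, 0.9276, 812),
}

total_support = sum(support for _, _, _, support in per_class.values())

# Macro average: every class counts equally.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted average: each class counts in proportion to its support.
weighted_f1 = sum(f1 * support for _, _, f1, support in per_class.values()) / total_support

# Micro F1: harmonic mean of the micro-averaged precision and recall from the table.
micro_precision, micro_recall = 0.9682, 0.9717
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)

if __name__ == "__main__":
    print(f"support total: {total_support}")
    print(f"macro F1:      {macro_f1:.4f}")
    print(f"weighted F1:   {weighted_f1:.4f}")
    print(f"micro F1:      {micro_f1:.4f}")
```

Since the inputs are already rounded to four decimals, small discrepancies in the last digit would be expected in general; for this table the recomputed values agree with the reported ones at four decimals.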
|
|