2023-10-18 14:42:01,028 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:01,028 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 128)
(position_embeddings): Embedding(512, 128)
(token_type_embeddings): Embedding(2, 128)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-1): 2 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=128, out_features=128, bias=True)
(key): Linear(in_features=128, out_features=128, bias=True)
(value): Linear(in_features=128, out_features=128, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=128, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=128, out_features=512, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=512, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=128, out_features=128, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=128, out_features=25, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-18 14:42:01,028 ----------------------------------------------------------------------------------------------------
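For reference, the module shapes printed above fully determine the size of this bert-tiny encoder. The following back-of-the-envelope count is a sketch derived only from the shapes shown (biases included; `Linear(i, o)` contributes `i*o + o`, `LayerNorm(d)` contributes `2*d`):

```python
# Parameter count reconstructed from the module shapes in the log above.

def linear(i, o):
    """Parameters of Linear(in_features=i, out_features=o, bias=True)."""
    return i * o + o

def layer_norm(d):
    """Parameters of LayerNorm((d,)): weight + bias."""
    return 2 * d

# BertEmbeddings: word, position, token-type embeddings + LayerNorm
embeddings = 32001 * 128 + 512 * 128 + 2 * 128 + layer_norm(128)

# One BertLayer, as printed: attention, intermediate, output
per_layer = (
    3 * linear(128, 128)   # query, key, value
    + linear(128, 128)     # attention output dense
    + layer_norm(128)      # attention output LayerNorm
    + linear(128, 512)     # intermediate
    + linear(512, 128)     # output dense
    + layer_norm(128)      # output LayerNorm
)

pooler = linear(128, 128)
tag_head = linear(128, 25)  # the SequenceTagger's final linear layer

total = embeddings + 2 * per_layer + pooler + tag_head
print(total)  # 4578457
```

So the whole tagger has roughly 4.6M parameters, dominated by the 32001 x 128 vocabulary embedding.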
2023-10-18 14:42:01,028 MultiCorpus: 1100 train + 206 dev + 240 test sentences
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-18 14:42:01,028 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:01,028 Train: 1100 sentences
2023-10-18 14:42:01,028 (train_with_dev=False, train_with_test=False)
2023-10-18 14:42:01,028 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:01,029 Training Params:
2023-10-18 14:42:01,029 - learning_rate: "5e-05"
2023-10-18 14:42:01,029 - mini_batch_size: "8"
2023-10-18 14:42:01,029 - max_epochs: "10"
2023-10-18 14:42:01,029 - shuffle: "True"
2023-10-18 14:42:01,029 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:01,029 Plugins:
2023-10-18 14:42:01,029 - TensorboardLogger
2023-10-18 14:42:01,029 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 14:42:01,029 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:01,029 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 14:42:01,029 - metric: "('micro avg', 'f1-score')"
2023-10-18 14:42:01,029 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:01,029 Computation:
2023-10-18 14:42:01,029 - compute on device: cuda:0
2023-10-18 14:42:01,029 - embedding storage: none
2023-10-18 14:42:01,029 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:01,029 Model training base path: "hmbench-ajmc/de-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-18 14:42:01,029 ----------------------------------------------------------------------------------------------------
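The `LinearScheduler` plugin with `warmup_fraction: 0.1` explains the learning-rate column in the per-iteration lines below: the rate climbs linearly toward the peak of 5e-05 over the first 10% of all batches (138 iterations x 10 epochs = 1380 steps, so 138 warmup steps, roughly epoch 1), then decays linearly to zero by the final iteration. A minimal sketch of that schedule (not Flair's actual implementation):

```python
def linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# 138 iterations/epoch * 10 epochs, peak 5e-05, 10% warmup as in this run
total = 138 * 10
for step in (13, 138, 690, 1380):
    print(step, linear_schedule_lr(step, total, 5e-5))
```

This matches the log's trajectory: tiny rates at the start of epoch 1, the peak around the start of epoch 2, and lr 0.000000 at the last iteration of epoch 10.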
2023-10-18 14:42:01,029 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:01,029 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 14:42:01,239 epoch 1 - iter 13/138 - loss 3.73569496 - time (sec): 0.21 - samples/sec: 10361.94 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:42:01,465 epoch 1 - iter 26/138 - loss 3.73195651 - time (sec): 0.44 - samples/sec: 9826.32 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:42:01,734 epoch 1 - iter 39/138 - loss 3.69492891 - time (sec): 0.70 - samples/sec: 9302.97 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:42:02,006 epoch 1 - iter 52/138 - loss 3.60879931 - time (sec): 0.98 - samples/sec: 8910.98 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:42:02,289 epoch 1 - iter 65/138 - loss 3.47507464 - time (sec): 1.26 - samples/sec: 8423.86 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:42:02,571 epoch 1 - iter 78/138 - loss 3.29658606 - time (sec): 1.54 - samples/sec: 8245.63 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:42:02,843 epoch 1 - iter 91/138 - loss 3.11792119 - time (sec): 1.81 - samples/sec: 8135.44 - lr: 0.000033 - momentum: 0.000000
2023-10-18 14:42:03,122 epoch 1 - iter 104/138 - loss 2.90670860 - time (sec): 2.09 - samples/sec: 8015.58 - lr: 0.000037 - momentum: 0.000000
2023-10-18 14:42:03,422 epoch 1 - iter 117/138 - loss 2.69736692 - time (sec): 2.39 - samples/sec: 8018.68 - lr: 0.000042 - momentum: 0.000000
2023-10-18 14:42:03,721 epoch 1 - iter 130/138 - loss 2.51556201 - time (sec): 2.69 - samples/sec: 8002.22 - lr: 0.000047 - momentum: 0.000000
2023-10-18 14:42:03,897 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:03,897 EPOCH 1 done: loss 2.4383 - lr: 0.000047
2023-10-18 14:42:04,148 DEV : loss 0.9075518846511841 - f1-score (micro avg) 0.0
2023-10-18 14:42:04,152 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:04,460 epoch 2 - iter 13/138 - loss 1.01133132 - time (sec): 0.31 - samples/sec: 7736.84 - lr: 0.000050 - momentum: 0.000000
2023-10-18 14:42:04,763 epoch 2 - iter 26/138 - loss 0.95797444 - time (sec): 0.61 - samples/sec: 7529.08 - lr: 0.000049 - momentum: 0.000000
2023-10-18 14:42:05,055 epoch 2 - iter 39/138 - loss 0.91931902 - time (sec): 0.90 - samples/sec: 7505.98 - lr: 0.000048 - momentum: 0.000000
2023-10-18 14:42:05,341 epoch 2 - iter 52/138 - loss 0.90824686 - time (sec): 1.19 - samples/sec: 7560.72 - lr: 0.000048 - momentum: 0.000000
2023-10-18 14:42:05,636 epoch 2 - iter 65/138 - loss 0.90479756 - time (sec): 1.48 - samples/sec: 7513.75 - lr: 0.000047 - momentum: 0.000000
2023-10-18 14:42:05,927 epoch 2 - iter 78/138 - loss 0.90606543 - time (sec): 1.77 - samples/sec: 7425.93 - lr: 0.000047 - momentum: 0.000000
2023-10-18 14:42:06,227 epoch 2 - iter 91/138 - loss 0.89109830 - time (sec): 2.07 - samples/sec: 7315.63 - lr: 0.000046 - momentum: 0.000000
2023-10-18 14:42:06,516 epoch 2 - iter 104/138 - loss 0.88671218 - time (sec): 2.36 - samples/sec: 7292.12 - lr: 0.000046 - momentum: 0.000000
2023-10-18 14:42:06,821 epoch 2 - iter 117/138 - loss 0.87207619 - time (sec): 2.67 - samples/sec: 7238.99 - lr: 0.000045 - momentum: 0.000000
2023-10-18 14:42:07,129 epoch 2 - iter 130/138 - loss 0.87102374 - time (sec): 2.98 - samples/sec: 7289.56 - lr: 0.000045 - momentum: 0.000000
2023-10-18 14:42:07,307 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:07,307 EPOCH 2 done: loss 0.8591 - lr: 0.000045
2023-10-18 14:42:07,668 DEV : loss 0.6102067232131958 - f1-score (micro avg) 0.0719
2023-10-18 14:42:07,674 saving best model
2023-10-18 14:42:07,709 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:07,993 epoch 3 - iter 13/138 - loss 0.67856660 - time (sec): 0.28 - samples/sec: 7328.62 - lr: 0.000044 - momentum: 0.000000
2023-10-18 14:42:08,275 epoch 3 - iter 26/138 - loss 0.68905023 - time (sec): 0.57 - samples/sec: 7364.80 - lr: 0.000043 - momentum: 0.000000
2023-10-18 14:42:08,551 epoch 3 - iter 39/138 - loss 0.64678780 - time (sec): 0.84 - samples/sec: 7454.57 - lr: 0.000043 - momentum: 0.000000
2023-10-18 14:42:08,836 epoch 3 - iter 52/138 - loss 0.66206237 - time (sec): 1.13 - samples/sec: 7612.32 - lr: 0.000042 - momentum: 0.000000
2023-10-18 14:42:09,134 epoch 3 - iter 65/138 - loss 0.66470944 - time (sec): 1.42 - samples/sec: 7714.43 - lr: 0.000042 - momentum: 0.000000
2023-10-18 14:42:09,436 epoch 3 - iter 78/138 - loss 0.66818738 - time (sec): 1.73 - samples/sec: 7604.34 - lr: 0.000041 - momentum: 0.000000
2023-10-18 14:42:09,735 epoch 3 - iter 91/138 - loss 0.65668679 - time (sec): 2.02 - samples/sec: 7520.16 - lr: 0.000041 - momentum: 0.000000
2023-10-18 14:42:10,029 epoch 3 - iter 104/138 - loss 0.64849849 - time (sec): 2.32 - samples/sec: 7503.79 - lr: 0.000040 - momentum: 0.000000
2023-10-18 14:42:10,329 epoch 3 - iter 117/138 - loss 0.64409328 - time (sec): 2.62 - samples/sec: 7428.91 - lr: 0.000040 - momentum: 0.000000
2023-10-18 14:42:10,632 epoch 3 - iter 130/138 - loss 0.63758900 - time (sec): 2.92 - samples/sec: 7340.63 - lr: 0.000039 - momentum: 0.000000
2023-10-18 14:42:10,801 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:10,801 EPOCH 3 done: loss 0.6399 - lr: 0.000039
2023-10-18 14:42:11,293 DEV : loss 0.4903678894042969 - f1-score (micro avg) 0.2797
2023-10-18 14:42:11,298 saving best model
2023-10-18 14:42:11,332 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:11,620 epoch 4 - iter 13/138 - loss 0.53403819 - time (sec): 0.29 - samples/sec: 6971.07 - lr: 0.000038 - momentum: 0.000000
2023-10-18 14:42:11,915 epoch 4 - iter 26/138 - loss 0.59399038 - time (sec): 0.58 - samples/sec: 7011.50 - lr: 0.000038 - momentum: 0.000000
2023-10-18 14:42:12,213 epoch 4 - iter 39/138 - loss 0.56104521 - time (sec): 0.88 - samples/sec: 7181.09 - lr: 0.000037 - momentum: 0.000000
2023-10-18 14:42:12,494 epoch 4 - iter 52/138 - loss 0.55720001 - time (sec): 1.16 - samples/sec: 7170.40 - lr: 0.000037 - momentum: 0.000000
2023-10-18 14:42:12,782 epoch 4 - iter 65/138 - loss 0.55837560 - time (sec): 1.45 - samples/sec: 7173.13 - lr: 0.000036 - momentum: 0.000000
2023-10-18 14:42:13,073 epoch 4 - iter 78/138 - loss 0.57073749 - time (sec): 1.74 - samples/sec: 7308.80 - lr: 0.000036 - momentum: 0.000000
2023-10-18 14:42:13,361 epoch 4 - iter 91/138 - loss 0.55235315 - time (sec): 2.03 - samples/sec: 7340.93 - lr: 0.000035 - momentum: 0.000000
2023-10-18 14:42:13,652 epoch 4 - iter 104/138 - loss 0.54225433 - time (sec): 2.32 - samples/sec: 7371.82 - lr: 0.000035 - momentum: 0.000000
2023-10-18 14:42:13,943 epoch 4 - iter 117/138 - loss 0.55380140 - time (sec): 2.61 - samples/sec: 7419.91 - lr: 0.000034 - momentum: 0.000000
2023-10-18 14:42:14,254 epoch 4 - iter 130/138 - loss 0.55706438 - time (sec): 2.92 - samples/sec: 7440.75 - lr: 0.000034 - momentum: 0.000000
2023-10-18 14:42:14,418 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:14,418 EPOCH 4 done: loss 0.5489 - lr: 0.000034
2023-10-18 14:42:14,778 DEV : loss 0.41623571515083313 - f1-score (micro avg) 0.4254
2023-10-18 14:42:14,782 saving best model
2023-10-18 14:42:14,815 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:15,098 epoch 5 - iter 13/138 - loss 0.51600331 - time (sec): 0.28 - samples/sec: 7829.73 - lr: 0.000033 - momentum: 0.000000
2023-10-18 14:42:15,372 epoch 5 - iter 26/138 - loss 0.47856675 - time (sec): 0.56 - samples/sec: 7961.51 - lr: 0.000032 - momentum: 0.000000
2023-10-18 14:42:15,647 epoch 5 - iter 39/138 - loss 0.46722086 - time (sec): 0.83 - samples/sec: 7932.42 - lr: 0.000032 - momentum: 0.000000
2023-10-18 14:42:15,937 epoch 5 - iter 52/138 - loss 0.44101786 - time (sec): 1.12 - samples/sec: 7904.22 - lr: 0.000031 - momentum: 0.000000
2023-10-18 14:42:16,214 epoch 5 - iter 65/138 - loss 0.44921045 - time (sec): 1.40 - samples/sec: 7844.68 - lr: 0.000031 - momentum: 0.000000
2023-10-18 14:42:16,497 epoch 5 - iter 78/138 - loss 0.45145765 - time (sec): 1.68 - samples/sec: 7744.16 - lr: 0.000030 - momentum: 0.000000
2023-10-18 14:42:16,776 epoch 5 - iter 91/138 - loss 0.46361234 - time (sec): 1.96 - samples/sec: 7753.23 - lr: 0.000030 - momentum: 0.000000
2023-10-18 14:42:17,055 epoch 5 - iter 104/138 - loss 0.46610178 - time (sec): 2.24 - samples/sec: 7749.77 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:42:17,343 epoch 5 - iter 117/138 - loss 0.46796148 - time (sec): 2.53 - samples/sec: 7743.02 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:42:17,618 epoch 5 - iter 130/138 - loss 0.47257987 - time (sec): 2.80 - samples/sec: 7692.00 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:42:17,782 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:17,782 EPOCH 5 done: loss 0.4772 - lr: 0.000028
2023-10-18 14:42:18,146 DEV : loss 0.3648514449596405 - f1-score (micro avg) 0.5395
2023-10-18 14:42:18,150 saving best model
2023-10-18 14:42:18,183 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:18,471 epoch 6 - iter 13/138 - loss 0.51780980 - time (sec): 0.29 - samples/sec: 7259.23 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:42:18,753 epoch 6 - iter 26/138 - loss 0.49552102 - time (sec): 0.57 - samples/sec: 7316.09 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:42:19,038 epoch 6 - iter 39/138 - loss 0.45788993 - time (sec): 0.85 - samples/sec: 7657.67 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:42:19,334 epoch 6 - iter 52/138 - loss 0.44623002 - time (sec): 1.15 - samples/sec: 7795.73 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:42:19,597 epoch 6 - iter 65/138 - loss 0.45260018 - time (sec): 1.41 - samples/sec: 7670.40 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:42:19,882 epoch 6 - iter 78/138 - loss 0.45410773 - time (sec): 1.70 - samples/sec: 7727.93 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:42:20,169 epoch 6 - iter 91/138 - loss 0.46300618 - time (sec): 1.99 - samples/sec: 7637.85 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:42:20,457 epoch 6 - iter 104/138 - loss 0.44890161 - time (sec): 2.27 - samples/sec: 7750.57 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:42:20,730 epoch 6 - iter 117/138 - loss 0.44195316 - time (sec): 2.55 - samples/sec: 7704.86 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:42:21,002 epoch 6 - iter 130/138 - loss 0.44137833 - time (sec): 2.82 - samples/sec: 7716.56 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:42:21,166 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:21,166 EPOCH 6 done: loss 0.4430 - lr: 0.000023
2023-10-18 14:42:21,529 DEV : loss 0.3351803719997406 - f1-score (micro avg) 0.5734
2023-10-18 14:42:21,533 saving best model
2023-10-18 14:42:21,567 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:21,838 epoch 7 - iter 13/138 - loss 0.45504657 - time (sec): 0.27 - samples/sec: 8599.42 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:42:22,114 epoch 7 - iter 26/138 - loss 0.42379898 - time (sec): 0.55 - samples/sec: 8178.70 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:42:22,401 epoch 7 - iter 39/138 - loss 0.42184546 - time (sec): 0.83 - samples/sec: 8355.39 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:42:22,671 epoch 7 - iter 52/138 - loss 0.41310421 - time (sec): 1.10 - samples/sec: 8173.43 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:42:22,942 epoch 7 - iter 65/138 - loss 0.40306329 - time (sec): 1.37 - samples/sec: 8145.06 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:42:23,211 epoch 7 - iter 78/138 - loss 0.41050909 - time (sec): 1.64 - samples/sec: 8053.58 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:42:23,477 epoch 7 - iter 91/138 - loss 0.40607664 - time (sec): 1.91 - samples/sec: 7981.55 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:42:23,742 epoch 7 - iter 104/138 - loss 0.41363354 - time (sec): 2.18 - samples/sec: 7908.29 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:42:24,010 epoch 7 - iter 117/138 - loss 0.40978678 - time (sec): 2.44 - samples/sec: 7879.57 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:42:24,278 epoch 7 - iter 130/138 - loss 0.40647853 - time (sec): 2.71 - samples/sec: 7952.24 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:42:24,440 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:24,440 EPOCH 7 done: loss 0.4060 - lr: 0.000017
2023-10-18 14:42:24,805 DEV : loss 0.3176613450050354 - f1-score (micro avg) 0.5963
2023-10-18 14:42:24,809 saving best model
2023-10-18 14:42:24,844 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:25,126 epoch 8 - iter 13/138 - loss 0.37061849 - time (sec): 0.28 - samples/sec: 8751.34 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:42:25,412 epoch 8 - iter 26/138 - loss 0.38254673 - time (sec): 0.57 - samples/sec: 8278.59 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:42:25,678 epoch 8 - iter 39/138 - loss 0.40051285 - time (sec): 0.83 - samples/sec: 8179.66 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:42:25,963 epoch 8 - iter 52/138 - loss 0.38377615 - time (sec): 1.12 - samples/sec: 8086.52 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:42:26,235 epoch 8 - iter 65/138 - loss 0.40052864 - time (sec): 1.39 - samples/sec: 8033.92 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:42:26,504 epoch 8 - iter 78/138 - loss 0.38868870 - time (sec): 1.66 - samples/sec: 7934.58 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:42:26,772 epoch 8 - iter 91/138 - loss 0.38663293 - time (sec): 1.93 - samples/sec: 7905.67 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:42:27,069 epoch 8 - iter 104/138 - loss 0.38164637 - time (sec): 2.22 - samples/sec: 7887.18 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:42:27,339 epoch 8 - iter 117/138 - loss 0.38354103 - time (sec): 2.49 - samples/sec: 7842.09 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:42:27,621 epoch 8 - iter 130/138 - loss 0.39513402 - time (sec): 2.78 - samples/sec: 7839.40 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:42:27,794 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:27,795 EPOCH 8 done: loss 0.3910 - lr: 0.000012
2023-10-18 14:42:28,167 DEV : loss 0.31104806065559387 - f1-score (micro avg) 0.6037
2023-10-18 14:42:28,170 saving best model
2023-10-18 14:42:28,205 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:28,482 epoch 9 - iter 13/138 - loss 0.37846411 - time (sec): 0.28 - samples/sec: 7982.85 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:42:28,769 epoch 9 - iter 26/138 - loss 0.38086453 - time (sec): 0.56 - samples/sec: 7892.30 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:42:29,043 epoch 9 - iter 39/138 - loss 0.38465865 - time (sec): 0.84 - samples/sec: 7663.38 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:42:29,325 epoch 9 - iter 52/138 - loss 0.39216186 - time (sec): 1.12 - samples/sec: 7834.07 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:42:29,617 epoch 9 - iter 65/138 - loss 0.37224861 - time (sec): 1.41 - samples/sec: 7853.01 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:42:29,890 epoch 9 - iter 78/138 - loss 0.37099734 - time (sec): 1.68 - samples/sec: 7753.02 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:42:30,173 epoch 9 - iter 91/138 - loss 0.37458075 - time (sec): 1.97 - samples/sec: 7652.24 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:42:30,471 epoch 9 - iter 104/138 - loss 0.37220923 - time (sec): 2.27 - samples/sec: 7589.93 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:42:30,770 epoch 9 - iter 117/138 - loss 0.37475106 - time (sec): 2.57 - samples/sec: 7525.10 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:42:31,078 epoch 9 - iter 130/138 - loss 0.37416561 - time (sec): 2.87 - samples/sec: 7465.28 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:42:31,263 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:31,263 EPOCH 9 done: loss 0.3717 - lr: 0.000006
2023-10-18 14:42:31,636 DEV : loss 0.30265966057777405 - f1-score (micro avg) 0.6075
2023-10-18 14:42:31,640 saving best model
2023-10-18 14:42:31,674 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:31,962 epoch 10 - iter 13/138 - loss 0.37059266 - time (sec): 0.29 - samples/sec: 7765.81 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:42:32,246 epoch 10 - iter 26/138 - loss 0.35812253 - time (sec): 0.57 - samples/sec: 7516.30 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:42:32,517 epoch 10 - iter 39/138 - loss 0.35347996 - time (sec): 0.84 - samples/sec: 7648.29 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:42:32,794 epoch 10 - iter 52/138 - loss 0.36939625 - time (sec): 1.12 - samples/sec: 7761.30 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:42:33,074 epoch 10 - iter 65/138 - loss 0.35597407 - time (sec): 1.40 - samples/sec: 7838.04 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:42:33,358 epoch 10 - iter 78/138 - loss 0.36026633 - time (sec): 1.68 - samples/sec: 7807.03 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:42:33,638 epoch 10 - iter 91/138 - loss 0.36485483 - time (sec): 1.96 - samples/sec: 7695.86 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:42:33,917 epoch 10 - iter 104/138 - loss 0.36302472 - time (sec): 2.24 - samples/sec: 7674.94 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:42:34,187 epoch 10 - iter 117/138 - loss 0.36402170 - time (sec): 2.51 - samples/sec: 7667.48 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:42:34,465 epoch 10 - iter 130/138 - loss 0.37068586 - time (sec): 2.79 - samples/sec: 7681.11 - lr: 0.000000 - momentum: 0.000000
2023-10-18 14:42:34,629 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:34,629 EPOCH 10 done: loss 0.3692 - lr: 0.000000
2023-10-18 14:42:34,999 DEV : loss 0.29817959666252136 - f1-score (micro avg) 0.6122
2023-10-18 14:42:35,003 saving best model
2023-10-18 14:42:35,072 ----------------------------------------------------------------------------------------------------
2023-10-18 14:42:35,072 Loading model from best epoch ...
2023-10-18 14:42:35,159 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
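The 25 tags above follow the BIOES scheme (Single, Begin, Inside, End, plus O) over the six entity types. A minimal decoder turning such a tag sequence into labeled spans might look like this (a sketch of the scheme, not Flair's own decoding code):

```python
def bioes_to_spans(tags):
    """Convert a BIOES tag sequence into (label, start, end_exclusive) spans."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        prefix, _, label = tag.partition("-")
        if prefix == "S":                  # single-token entity
            spans.append((label, i, i + 1))
            start = None
        elif prefix == "B":                # open a multi-token entity
            start = i
        elif prefix == "E" and start is not None:
            spans.append((label, start, i + 1))
            start = None
        elif prefix != "I":                # "O" (or malformed) resets any open span
            start = None
    return spans

print(bioes_to_spans(["O", "S-pers", "B-work", "I-work", "E-work", "O"]))
# [('pers', 1, 2), ('work', 2, 5)]
```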
2023-10-18 14:42:35,449
Results:
- F-score (micro) 0.6242
- F-score (macro) 0.3696
- Accuracy 0.459
By class:
              precision    recall  f1-score   support

       scope     0.6149    0.6080    0.6114       176
        pers     0.8148    0.6875    0.7458       128
        work     0.4494    0.5405    0.4908        74
      object     0.0000    0.0000    0.0000         2
         loc     0.0000    0.0000    0.0000         2

   micro avg     0.6334    0.6152    0.6242       382
   macro avg     0.3758    0.3672    0.3696       382
weighted avg     0.6434    0.6152    0.6267       382
2023-10-18 14:42:35,449 ----------------------------------------------------------------------------------------------------
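The gap between the micro F-score (0.6242) and the macro F-score (0.3696) above comes from the averaging order: macro averaging takes the unweighted mean of the per-class F1 values, so the two classes with only 2 support examples each (object, loc, both at 0.0) pull it down hard, while micro averaging pools all 382 entities before computing F1. The averages can be recomputed from the per-class rows of the table (a sketch of the standard definitions, using the numbers as printed):

```python
# (precision, recall, f1, support) per class, copied from the table above
by_class = {
    "scope":  (0.6149, 0.6080, 0.6114, 176),
    "pers":   (0.8148, 0.6875, 0.7458, 128),
    "work":   (0.4494, 0.5405, 0.4908,  74),
    "object": (0.0000, 0.0000, 0.0000,   2),
    "loc":    (0.0000, 0.0000, 0.0000,   2),
}

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in by_class.values()) / len(by_class)

# Weighted F1: per-class F1 weighted by support.
total_support = sum(s for _, _, _, s in by_class.values())
weighted_f1 = sum(f1 * s for _, _, f1, s in by_class.values()) / total_support

# Micro F1: harmonic mean of the pooled precision and recall
# (taken from the "micro avg" row, since raw counts are not in the log).
micro_p, micro_r = 0.6334, 0.6152
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))
# 0.3696 0.6267 0.6242
```

All three recomputed values match the averages reported in the table.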