bert-small-uncased-tajik-ner

This model is a fine-tuned version of google/bert_uncased_L-4_H-512_A-8 on the wikiann dataset. On the evaluation set it achieves the following results, which match the last logged checkpoint (epoch 102, step 2550) in the training results:

Loss: 1.1663
Precision: 0.4388
Recall: 0.5865
F1: 0.5021
Accuracy: 0.8270
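
The reported F1 can be cross-checked from the precision and recall above. This is a minimal sketch that assumes F1 is the usual harmonic mean of the two; the card does not say how the metrics were computed. Because the inputs are already rounded to four decimals, the result agrees with the reported value only to about three decimals.

```python
# Cross-check of the reported F1, assuming F1 = 2PR / (P + R).
precision = 0.4388  # from the evaluation results above
recall = 0.5865     # from the evaluation results above


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)


f1 = f1_score(precision, recall)
print(f"recomputed F1: {f1:.4f} (reported: 0.5021)")
```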

Model description

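A compact uncased BERT model (4 layers, hidden size 512, 8 attention heads, as encoded in the base checkpoint name) fine-tuned for named entity recognition on Tajik text. More information needed.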

Intended uses & limitations

More information needed
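
Until this section is filled in, the following sketch shows one plausible way to run the model for inference. It assumes the checkpoint is published under the Hub ID muhtasham/bert-small-uncased-tajik-ner and loads with the standard `transformers` token-classification pipeline. The helper names `build_ner_pipeline` and `tag` are hypothetical, and the sample sentence in the note below is illustrative, not taken from the dataset.

```python
# Usage sketch: needs the `transformers` package and network access on first call.
MODEL_ID = "muhtasham/bert-small-uncased-tajik-ner"


def build_ner_pipeline(model_id: str = MODEL_ID):
    """Return a token-classification pipeline that merges word pieces into entities."""
    from transformers import pipeline  # imported lazily: heavy dependency

    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",  # group sub-word tokens into whole entities
    )


def tag(text: str, ner=None):
    """Return (entity_group, word, score) tuples for one input string."""
    ner = ner or build_ner_pipeline()
    return [(e["entity_group"], e["word"], float(e["score"])) for e in ner(text)]
```

Calling `tag("Душанбе пойтахти Тоҷикистон аст.")` downloads the model and returns whatever entities it predicts. With an evaluation F1 of about 0.50, the predictions should be treated as rough.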

Training and evaluation data

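The wikiann dataset, as stated above; the model name suggests its Tajik subset. Split sizes and preprocessing are not documented. More information needed.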
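
A sketch of how the data could be loaded with the `datasets` library. The card names only the dataset (wikiann), so the language config `"tg"` is an assumption inferred from the model name. `load_wikiann` and `describe` are hypothetical helper names.

```python
# Sketch: load WikiANN with the `datasets` library (needs network access).
DATASET_NAME = "wikiann"  # named in the card
LANGUAGE_CONFIG = "tg"    # assumption: Tajik config, inferred from the model name


def load_wikiann(config: str = LANGUAGE_CONFIG):
    """Download and return all splits of the chosen WikiANN config."""
    from datasets import load_dataset  # imported lazily: heavy dependency

    return load_dataset(DATASET_NAME, config)


def describe(dataset) -> dict:
    """Map each split name to its number of rows."""
    return {split: len(rows) for split, rows in dataset.items()}
```

`describe(load_wikiann())` reports the split sizes, which can be compared with the 25 optimizer steps per epoch visible in the training log.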

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 200
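
The list above maps onto `transformers.TrainingArguments` roughly as sketched below. Several points are assumptions: that the batch sizes are per-device values on a single device, that no warmup was used (none is listed), and the output path `"out"`, which is hypothetical. The step counts come from the results table, which logs step 50 at epoch 2.0, that is 25 steps per epoch. `linear_lr` illustrates what a linear schedule does under those assumptions; it is not taken from the training script.

```python
# Hyperparameters from the list above, keyed by the matching
# `transformers.TrainingArguments` parameter names.
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 8,  # assumption: single device
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 200,
}

STEPS_PER_EPOCH = 25  # the results table logs step 50 at epoch 2.0
TOTAL_STEPS = STEPS_PER_EPOCH * TRAINING_KWARGS["num_train_epochs"]


def linear_lr(step: int, base_lr: float = 2e-5,
              total_steps: int = TOTAL_STEPS, warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def build_training_arguments(output_dir: str = "out"):  # hypothetical path
    """Build TrainingArguments from the values above (needs `transformers`)."""
    from transformers import TrainingArguments  # imported lazily

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)


print(TOTAL_STEPS, linear_lr(2550))
```

Under these assumptions the schedule spans 5000 steps, so at the last logged step (2550) the learning rate would still be close to half of its initial value.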

Training results

Evaluation ran every 50 steps (2 epochs). The training-loss column updates every 500 steps, so rows before step 500 show "No log". The log ends at epoch 102 (step 2550), although num_epochs was set to 200.

Training Loss	Epoch	Step	Validation Loss	Precision	Recall	F1	Accuracy
No log	2.0	50	1.1113	0.0238	0.0481	0.0318	0.5984
No log	4.0	100	0.9179	0.0976	0.1538	0.1194	0.6547
No log	6.0	150	0.9254	0.08	0.1538	0.1053	0.6634
No log	8.0	200	0.6607	0.1299	0.2212	0.1637	0.7707
No log	10.0	250	0.6514	0.2583	0.375	0.3059	0.7896
No log	12.0	300	0.6213	0.2836	0.3654	0.3193	0.8058
No log	14.0	350	0.6696	0.3611	0.5	0.4194	0.8100
No log	16.0	400	0.7094	0.3893	0.4904	0.4340	0.8187
No log	18.0	450	0.7557	0.38	0.5481	0.4488	0.8243
0.5061	20.0	500	0.7409	0.4222	0.5481	0.4770	0.8342
0.5061	22.0	550	0.8003	0.4196	0.5769	0.4858	0.8349
0.5061	24.0	600	0.8173	0.4275	0.5673	0.4876	0.8342
0.5061	26.0	650	0.7942	0.4225	0.5769	0.4878	0.8323
0.5061	28.0	700	0.8565	0.4067	0.5865	0.4803	0.8281
0.5061	30.0	750	0.8040	0.4388	0.5865	0.5021	0.8406
0.5061	32.0	800	0.9251	0.4286	0.5769	0.4918	0.8368
0.5061	34.0	850	0.8421	0.4196	0.5769	0.4858	0.8394
0.5061	36.0	900	0.8608	0.4207	0.5865	0.4900	0.8330
0.5061	38.0	950	0.8622	0.5333	0.6154	0.5714	0.8489
0.0304	40.0	1000	0.9901	0.4306	0.5962	0.5000	0.8240
0.0304	42.0	1050	0.9677	0.4286	0.6058	0.5020	0.8345
0.0304	44.0	1100	0.9203	0.4429	0.5962	0.5082	0.8440
0.0304	46.0	1150	0.9368	0.4559	0.5962	0.5167	0.8428
0.0304	48.0	1200	0.9747	0.4420	0.5865	0.5041	0.8342
0.0304	50.0	1250	0.9033	0.4266	0.5865	0.4939	0.8360
0.0304	52.0	1300	0.9242	0.4806	0.5962	0.5322	0.8519
0.0304	54.0	1350	0.9496	0.4150	0.5865	0.4861	0.8406
0.0304	56.0	1400	1.0157	0.4388	0.5865	0.5021	0.8274
0.0304	58.0	1450	1.0069	0.3789	0.5865	0.4604	0.8357
0.0041	60.0	1500	1.0159	0.4593	0.5962	0.5188	0.8413
0.0041	62.0	1550	1.0138	0.488	0.5865	0.5328	0.8428
0.0041	64.0	1600	1.0406	0.4526	0.5962	0.5145	0.8398
0.0041	66.0	1650	1.0672	0.504	0.6058	0.5502	0.8413
0.0041	68.0	1700	1.0713	0.4257	0.6058	0.5	0.8334
0.0041	70.0	1750	1.0001	0.5079	0.6154	0.5565	0.8515
0.0041	72.0	1800	0.9986	0.4632	0.6058	0.525	0.8451
0.0041	74.0	1850	1.0523	0.4643	0.625	0.5328	0.8357
0.0041	76.0	1900	1.1331	0.4437	0.6058	0.5122	0.8281
0.0041	78.0	1950	1.0217	0.4667	0.6058	0.5272	0.8406
0.0023	80.0	2000	1.0296	0.4519	0.5865	0.5105	0.8372
0.0023	82.0	2050	1.0603	0.5207	0.6058	0.56	0.8512
0.0023	84.0	2100	1.1181	0.4733	0.5962	0.5277	0.8319
0.0023	86.0	2150	1.0858	0.4701	0.6058	0.5294	0.8383
0.0023	88.0	2200	1.0947	0.4779	0.625	0.5417	0.8394
0.0023	90.0	2250	1.0671	0.4539	0.6154	0.5224	0.8391
0.0023	92.0	2300	1.0958	0.4444	0.6154	0.5161	0.8372
0.0023	94.0	2350	1.1221	0.4397	0.5962	0.5061	0.8319
0.0023	96.0	2400	1.0861	0.5	0.6058	0.5478	0.8508
0.0023	98.0	2450	1.1522	0.4545	0.5769	0.5085	0.8258
0.0015	100.0	2500	1.1426	0.4688	0.5769	0.5172	0.8304
0.0015	102.0	2550	1.1663	0.4388	0.5865	0.5021	0.8270
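
The headline metrics come from the last row, not the best one. The sketch below picks out the best checkpoints from a subset of rows copied verbatim from the table; the same rows win when the full table is used. The `Row` type and variable names are illustrative only.

```python
from typing import NamedTuple


class Row(NamedTuple):
    epoch: float
    step: int
    val_loss: float
    precision: float
    recall: float
    f1: float
    accuracy: float


# Subset of rows copied from the training results table above.
ROWS = [
    Row(2.0, 50, 1.1113, 0.0238, 0.0481, 0.0318, 0.5984),
    Row(12.0, 300, 0.6213, 0.2836, 0.3654, 0.3193, 0.8058),
    Row(30.0, 750, 0.8040, 0.4388, 0.5865, 0.5021, 0.8406),
    Row(38.0, 950, 0.8622, 0.5333, 0.6154, 0.5714, 0.8489),
    Row(70.0, 1750, 1.0001, 0.5079, 0.6154, 0.5565, 0.8515),
    Row(82.0, 2050, 1.0603, 0.5207, 0.6058, 0.56, 0.8512),
    Row(102.0, 2550, 1.1663, 0.4388, 0.5865, 0.5021, 0.8270),
]

best_f1 = max(ROWS, key=lambda r: r.f1)
lowest_loss = min(ROWS, key=lambda r: r.val_loss)
final = ROWS[-1]

print(f"best F1:         epoch {best_f1.epoch:.0f}, F1 {best_f1.f1:.4f}")
print(f"lowest val loss: epoch {lowest_loss.epoch:.0f}, loss {lowest_loss.val_loss:.4f}")
print(f"final row:       epoch {final.epoch:.0f}, F1 {final.f1:.4f}")
```

Validation loss bottoms out at epoch 12 (0.6213) and rises afterwards while training loss keeps falling toward zero, a pattern consistent with overfitting. F1 peaks at epoch 38 (0.5714), so selecting a checkpoint by validation F1 would likely beat the final one.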

Framework versions

Transformers 4.21.2
Pytorch 1.12.1+cu113
Datasets 2.4.0
Tokenizers 0.12.1
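
To reproduce the run, it may help to compare a local environment against these versions. A minimal sketch using only the standard library; it assumes the PyPI distribution names `transformers`, `torch`, `datasets`, and `tokenizers`, and the helper names are hypothetical.

```python
from importlib import metadata

# Versions listed in the card, keyed by PyPI distribution name.
CARD_VERSIONS = {
    "transformers": "4.21.2",
    "torch": "1.12.1+cu113",
    "datasets": "2.4.0",
    "tokenizers": "0.12.1",
}


def installed_versions(packages):
    """Return {name: installed version, or None if the package is missing}."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def mismatches(expected, installed):
    """Return {name: (expected, installed)} for every version that differs."""
    return {
        name: (want, installed.get(name))
        for name, want in expected.items()
        if installed.get(name) != want
    }


if __name__ == "__main__":
    for name, (want, have) in mismatches(
        CARD_VERSIONS, installed_versions(CARD_VERSIONS)
    ).items():
        print(f"{name}: card lists {want}, installed {have}")
```

An exact match is not necessarily required; the comparison only flags where the local setup differs from the one recorded in the card.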

muhtasham/bert-small-uncased-tajik-ner
