bge_large_ja_llama3_70

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set (these match the last logged evaluation, step 63000, in the training results):

Loss: 1.0015
Precision: 0.4921
Recall: 0.3553
F1 Macro: 0.3419
Accuracy: 0.4442
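The card does not say how Precision, Recall, and F1 Macro were computed. The sketch below assumes all three are macro-averaged: each is computed per class and then averaged with equal weight per class, and a class with no predictions scores 0. The function name `macro_metrics` and the toy labels are hypothetical and are not taken from the training code.

```python
from collections import Counter

def macro_metrics(y_true, y_pred):
    """Return (precision, recall, f1, accuracy), macro-averaged over all labels seen."""
    labels = sorted(set(y_true) | set(y_pred))
    true_count = Counter(y_true)   # support of each class
    pred_count = Counter(y_pred)   # how often each class was predicted
    tp = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct hits per class

    precisions, recalls, f1s = [], [], []
    for label in labels:
        # Undefined ratios (no predictions or no support) are scored as 0.
        p = tp[label] / pred_count[label] if pred_count[label] else 0.0
        r = tp[label] / true_count[label] if true_count[label] else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    accuracy = sum(tp.values()) / len(y_true)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n, accuracy

# Toy example with three classes (hypothetical data).
precision, recall, f1, accuracy = macro_metrics([0, 0, 1, 1, 2, 2], [0, 0, 1, 2, 2, 2])
```

Under this assumption, F1 Macro is the mean of the per-class F1 scores, not the harmonic mean of the averaged Precision and Recall. That would be consistent with the reported F1 Macro (0.3419) being lower than the harmonic mean of 0.4921 and 0.3553, which is about 0.413.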

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 256
eval_batch_size: 128
seed: 0
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 10
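As a rough guide to reproducing this setup, the values above can be written as keyword arguments for `transformers.TrainingArguments`. This is a sketch and not the original training script. It assumes a single device (so the listed batch sizes are per-device sizes) and no warmup, since the card lists neither. `linear_lr` is a hypothetical helper that mirrors a linear schedule decaying to zero.

```python
# Hyperparameters copied from the list above, renamed to TrainingArguments fields.
training_kwargs = {
    "learning_rate": 3e-4,
    "per_device_train_batch_size": 256,  # assumption: one device, no gradient accumulation
    "per_device_eval_batch_size": 128,   # assumption: one device
    "seed": 0,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 10,
}

def linear_lr(step, total_steps, base_lr=3e-4, warmup_steps=0):
    """Learning rate at `step`: optional linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Halfway through a hypothetical 1000-step run the rate has dropped to half of 3e-4.
halfway_lr = linear_lr(500, 1000)
```

With the `transformers` library installed, the dictionary could be passed as `TrainingArguments(output_dir="out", **training_kwargs)`, where `"out"` is a placeholder path.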

Training results

Training Loss	Epoch	Step	Validation Loss	Precision	Recall	F1 Macro	Accuracy
No log	0	0	11.2074	0.0112	0.1667	0.0210	0.0673
1.1011	0.1575	1000	1.1103	0.4472	0.3212	0.3124	0.4230
1.0928	0.3150	2000	1.0853	0.4367	0.3324	0.3258	0.4117
1.0651	0.4724	3000	1.0764	0.4678	0.3383	0.3363	0.4380
1.0585	0.6299	4000	1.0546	0.4620	0.3380	0.3232	0.4267
1.0513	0.7874	5000	1.0633	0.4558	0.3414	0.3283	0.4138
1.0411	0.9449	6000	1.0415	0.4700	0.3430	0.3296	0.4334
1.0391	1.1024	7000	1.0445	0.4852	0.3415	0.3220	0.4227
1.0373	1.2598	8000	1.0378	0.4815	0.3483	0.3342	0.4330
1.039	1.4173	9000	1.0394	0.4762	0.3432	0.3273	0.4265
1.0408	1.5748	10000	1.0313	0.4992	0.3416	0.3253	0.4375
1.0274	1.7323	11000	1.0287	0.4959	0.3429	0.3265	0.4350
1.0296	1.8898	12000	1.0346	0.4822	0.3450	0.3278	0.4257
1.0404	2.0472	13000	1.0310	0.4844	0.3456	0.3349	0.4433
1.0194	2.2047	14000	1.0234	0.4828	0.3487	0.3325	0.4331
1.0088	2.3622	15000	1.0236	0.4813	0.3464	0.3315	0.4393
1.0136	2.5197	16000	1.0215	0.4986	0.3432	0.3247	0.4400
1.046	2.6772	17000	1.0194	0.4953	0.3455	0.3306	0.4412
1.0133	2.8346	18000	1.0202	0.4843	0.3488	0.3342	0.4428
1.0096	2.9921	19000	1.0189	0.4915	0.3478	0.3319	0.4338
0.9948	3.1496	20000	1.0146	0.4964	0.3487	0.3292	0.4358
1.0041	3.3071	21000	1.0174	0.4640	0.3510	0.3390	0.4335
1.0039	3.4646	22000	1.0211	0.4621	0.3493	0.3347	0.4304
1.0286	3.6220	23000	1.0127	0.5012	0.3484	0.3312	0.4398
1.0068	3.7795	24000	1.0183	0.5036	0.3451	0.3298	0.4475
1.0082	3.9370	25000	1.0128	0.4801	0.3513	0.3361	0.4377
1.0013	4.0945	26000	1.0219	0.4976	0.3470	0.3375	0.4465
1.0045	4.2520	27000	1.0123	0.5015	0.3493	0.3360	0.4447
1.0051	4.4094	28000	1.0128	0.5018	0.3488	0.3346	0.4453
1.0176	4.5669	29000	1.0135	0.4759	0.3520	0.3349	0.4357
1.0002	4.7244	30000	1.0109	0.4927	0.3484	0.3329	0.4439
0.9972	4.8819	31000	1.0143	0.4823	0.3517	0.3382	0.4337
0.9907	5.0394	32000	1.0096	0.4955	0.3507	0.3371	0.4433
0.9546	5.1969	33000	1.0099	0.4847	0.3586	0.3497	0.4420
0.9973	5.3543	34000	1.0100	0.4911	0.3494	0.3327	0.4426
0.9939	5.5118	35000	1.0267	0.4651	0.3511	0.3322	0.4220
0.9915	5.6693	36000	1.0078	0.4861	0.3553	0.3464	0.4452
1.0101	5.8268	37000	1.0070	0.4952	0.3552	0.3441	0.4441
0.9869	5.9843	38000	1.0076	0.4970	0.3547	0.3434	0.4447
0.9797	6.1417	39000	1.0063	0.4946	0.3520	0.3351	0.4430
0.9783	6.2992	40000	1.0114	0.4984	0.3542	0.3445	0.4484
1.0314	6.4567	41000	1.0059	0.4927	0.3521	0.3369	0.4414
0.9764	6.6142	42000	1.0049	0.4976	0.3520	0.3364	0.4438
0.9762	6.7717	43000	1.0056	0.4935	0.3539	0.3425	0.4456
1.0073	6.9291	44000	1.0053	0.4774	0.3546	0.3419	0.4395
0.9764	7.0866	45000	1.0054	0.4871	0.3547	0.3401	0.4398
0.9795	7.2441	46000	1.0066	0.4928	0.3562	0.3458	0.4456
0.9707	7.4016	47000	1.0039	0.4905	0.3544	0.3418	0.4434
0.9681	7.5591	48000	1.0042	0.4873	0.3532	0.3381	0.4441
0.982	7.7165	49000	1.0035	0.4870	0.3539	0.3374	0.4412
0.9967	7.8740	50000	1.0040	0.4762	0.3565	0.3464	0.4437
0.9871	8.0315	51000	1.0075	0.5050	0.3523	0.3406	0.4480
0.9654	8.1890	52000	1.0038	0.4956	0.3518	0.3358	0.4439
0.9897	8.3465	53000	1.0035	0.4970	0.3512	0.3366	0.4440
0.9958	8.5039	54000	1.0069	0.4961	0.3536	0.3415	0.4478
0.9969	8.6614	55000	1.0033	0.4897	0.3542	0.3413	0.4456
0.9899	8.8189	56000	1.0023	0.4847	0.3542	0.3409	0.4426
0.9766	8.9764	57000	1.0051	0.4963	0.3546	0.3438	0.4481
0.9827	9.1339	58000	1.0031	0.4867	0.3557	0.3425	0.4404
0.9878	9.2913	59000	1.0029	0.4958	0.3536	0.3411	0.4460
0.966	9.4488	60000	1.0020	0.4943	0.3547	0.3409	0.4456
0.9769	9.6063	61000	1.0022	0.4913	0.3555	0.3435	0.4449
0.9808	9.7638	62000	1.0019	0.4926	0.3553	0.3406	0.4424
0.9934	9.9213	63000	1.0015	0.4921	0.3553	0.3419	0.4442
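Two quantities the card does not state can be estimated from the table above. The sketch below is arithmetic on the logged values only. It assumes one optimizer step per batch of 256 examples (single device, no gradient accumulation). The six-class reading of the step-0 row is an inference, not a documented fact.

```python
# Values read from the last row of the table and the hyperparameter list.
steps_logged, epoch_logged = 63000, 9.9213
train_batch_size = 256

steps_per_epoch = steps_logged / epoch_logged             # roughly 6350 steps
examples_per_epoch = steps_per_epoch * train_batch_size   # roughly 1.6 million examples

# Step-0 row: Precision 0.0112, Recall 0.1667, F1 Macro 0.0210, Accuracy 0.0673.
# Hypothesis: 6 classes, and the untrained model predicts one class for every example.
# Then that class has precision = accuracy and recall = 1; every other class scores 0.
num_classes = 6          # assumption inferred from Recall = 0.1667, about 1/6
accuracy_step0 = 0.0673

macro_recall = 1.0 / num_classes
macro_precision = accuracy_step0 / num_classes
macro_f1 = (2 * accuracy_step0 / (1 + accuracy_step0)) / num_classes

print(round(steps_per_epoch), round(macro_precision, 4),
      round(macro_recall, 4), round(macro_f1, 4))
```

The three macro values round to 0.0112, 0.1667, and 0.0210, matching the step-0 row. This is consistent with, though not proof of, a six-class classification task evaluated with macro-averaged metrics.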

Framework versions

Transformers 4.43.3
PyTorch 2.4.0+cu121
Datasets 2.20.0
Tokenizers 0.19.1
