w

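This model is a fine-tuned version of facebook/w2v-bert-2.0 on the Grain dataset. At the last logged epoch (43, step 55,728) it reaches a validation loss of 0.0084, a word error rate (WER) of 0.0055, and a character error rate (CER) of 0.0011 on the evaluation set.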

Model description

More information needed
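
A minimal usage sketch, assuming the checkpoint was saved with a CTC head and its processor so that the `transformers` automatic-speech-recognition pipeline can load it. The repository ID is a placeholder rather than this model's real Hub ID, and the `transcribe` helper is a hypothetical name introduced for illustration.

```python
# Placeholder: replace with this model's actual Hub repository ID.
MODEL_ID = "your-namespace/your-model"


def transcribe(audio_path, model_id=MODEL_ID):
    """Return the transcript of one audio file using the ASR pipeline."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # Given a file path, the pipeline decodes the audio (this needs ffmpeg)
    # and resamples it to the rate the feature extractor expects.
    return asr(audio_path)["text"]
```

Calling `transcribe("sample.wav")` downloads the checkpoint on first use. For many files it would be cheaper to build the pipeline once and reuse it.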


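The training hyperparameters are not listed here. The table below gives the training loss and the evaluation-set metrics logged at the end of each epoch (1,296 steps per epoch):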

Training Loss	Epoch	Step	Validation Loss	WER	CER
0.2656	1.0	1296	0.0886	0.0909	0.0165
0.0752	2.0	2592	0.0589	0.0620	0.0117
0.0529	3.0	3888	0.0448	0.0408	0.0081
0.0391	4.0	5184	0.0409	0.0374	0.0073
0.032	5.0	6480	0.0323	0.0299	0.0058
0.0268	6.0	7776	0.0326	0.0348	0.0065
0.0234	7.0	9072	0.0236	0.0243	0.0050
0.0207	8.0	10368	0.0228	0.0289	0.0057
0.0179	9.0	11664	0.0235	0.0240	0.0048
0.0163	10.0	12960	0.0268	0.0280	0.0054
0.0157	11.0	14256	0.0258	0.0352	0.0067
0.0125	12.0	15552	0.0205	0.0221	0.0046
0.0116	13.0	16848	0.0187	0.0161	0.0035
0.0113	14.0	18144	0.0193	0.0215	0.0041
0.0111	15.0	19440	0.0185	0.0209	0.0041
0.01	16.0	20736	0.0188	0.0191	0.0038
0.0098	17.0	22032	0.0132	0.0143	0.0027
0.0082	18.0	23328	0.0155	0.0161	0.0032
0.0077	19.0	24624	0.0180	0.0214	0.0041
0.0073	20.0	25920	0.0170	0.0145	0.0029
0.0075	21.0	27216	0.0134	0.0170	0.0030
0.0067	22.0	28512	0.0120	0.0130	0.0026
0.0061	23.0	29808	0.0125	0.0155	0.0031
0.0054	24.0	31104	0.0141	0.0130	0.0024
0.0051	25.0	32400	0.0134	0.0109	0.0022
0.0052	26.0	33696	0.0103	0.0108	0.0022
0.0046	27.0	34992	0.0092	0.0095	0.0018
0.004	28.0	36288	0.0140	0.0123	0.0023
0.004	29.0	37584	0.0110	0.0133	0.0024
0.0035	30.0	38880	0.0110	0.0103	0.0021
0.0035	31.0	40176	0.0101	0.0064	0.0016
0.0035	32.0	41472	0.0148	0.0124	0.0024
0.003	33.0	42768	0.0090	0.0053	0.0012
0.0031	34.0	44064	0.0096	0.0073	0.0015
0.0032	35.0	45360	0.0071	0.0057	0.0011
0.0025	36.0	46656	0.0097	0.0078	0.0017
0.0023	37.0	47952	0.0116	0.0066	0.0014
0.0024	38.0	49248	0.0087	0.0076	0.0015
0.003	39.0	50544	0.0098	0.0074	0.0015
0.002	40.0	51840	0.0122	0.0108	0.0019
0.0017	41.0	53136	0.0089	0.0054	0.0012
0.0018	42.0	54432	0.0094	0.0064	0.0015
0.0019	43.0	55728	0.0084	0.0055	0.0011
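
The card does not say how the WER and CER columns were computed. As a sketch of the usual definition, both are the Levenshtein edit distance between reference and hypothesis divided by the reference length, counted in words for WER and in characters for CER. The function names below are illustrative, and evaluation libraries may add text normalization or aggregate over the whole corpus rather than per utterance.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitute, insert, delete)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Under this definition, a WER of 0.0055 corresponds to roughly one word error per 180 reference words.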