
ptt5-wikilingua-1024

This model is a fine-tuned version of unicamp-dl/ptt5-base-portuguese-vocab. The fine-tuning dataset was not recorded in the card metadata, but the model name points to the Portuguese portion of WikiLingua. It achieves the following results on the evaluation set:

  • Loss: 1.8346
  • Rouge1: 26.0293
  • Rouge2: 11.2397
  • Rougel: 22.2357
  • Rougelsum: 25.393
  • Gen Len: 18.4771

Model description

The base checkpoint, unicamp-dl/ptt5-base-portuguese-vocab, is PTT5: a T5-base model pretrained on the BrWaC corpus of Brazilian Portuguese web text, using a Portuguese vocabulary. This fine-tune adapts it to abstractive summarization; the "-1024" suffix in the name suggests a maximum input length of 1024 tokens.

Intended uses & limitations

The model is intended for abstractive summarization of Portuguese documents; inputs beyond the model's input window are truncated. Other limitations are not documented. A minimal inference sketch follows.
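The repo id below is taken from this page; the generation settings and the 1024-token truncation (inferred from the "-1024" suffix) are illustrative assumptions, not recorded settings:

```python
# Minimal inference sketch using the standard Transformers seq2seq API.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "arthurmluz/ptt5-wikilingua-1024"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Texto em português a ser resumido."  # replace with your document

# Assumption: inputs were truncated to 1024 tokens during fine-tuning.
inputs = tokenizer(text, max_length=1024, truncation=True, return_tensors="pt")
# Gen Len on the evaluation set averages ~18 tokens, so a modest budget suffices.
summary_ids = model.generate(**inputs, num_beams=4, max_new_tokens=64)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```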

Training and evaluation data

The exact training and evaluation splits are not documented. The model name points to WikiLingua, a multilingual summarization dataset built from WikiHow guides, presumably its Portuguese portion; a loading sketch follows.
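Neither the dataset id `wiki_lingua` nor the `portuguese` config is confirmed by this card; both are inferred from the model name:

```python
# Loading sketch for the assumed fine-tuning data.
from datasets import load_dataset

dataset = load_dataset("wiki_lingua", "portuguese")
print(dataset)  # inspect the available splits and fields before preprocessing
```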

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a Seq2SeqTrainingArguments sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
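These values map onto Seq2SeqTrainingArguments roughly as follows; `output_dir` is a placeholder, and `evaluation_strategy` and `predict_with_generate` are assumptions added so that per-epoch ROUGE and Gen Len, as reported below, can be computed during evaluation:

```python
# Sketch mapping the listed hyperparameters onto Transformers training arguments.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="ptt5-wikilingua-1024",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=30,
    evaluation_strategy="epoch",   # assumption, matches the per-epoch table below
    predict_with_generate=True,    # assumption, needed for ROUGE / Gen Len
)
```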

Training results

| Training Loss | Epoch | Step   | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:------:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 2.0816        | 1.0   | 28580  | 1.9680          | 23.8868 | 9.5463  | 20.5459 | 23.3329   | 18.168  |
| 1.9469        | 2.0   | 57160  | 1.9000          | 24.7191 | 10.1206 | 21.165  | 24.1349   | 18.3899 |
| 1.9482        | 3.0   | 85740  | 1.8655          | 24.9016 | 10.3913 | 21.325  | 24.3342   | 18.3192 |
| 1.808         | 4.0   | 114320 | 1.8422          | 25.3346 | 10.7062 | 21.6946 | 24.764    | 18.3628 |
| 1.7811        | 5.0   | 142900 | 1.8304          | 25.3047 | 10.7773 | 21.7447 | 24.7572   | 18.2808 |
| 1.7676        | 6.0   | 171480 | 1.8161          | 25.5816 | 10.9429 | 21.9072 | 24.9958   | 18.3839 |
| 1.6651        | 7.0   | 200060 | 1.8081          | 25.5281 | 10.9006 | 21.8813 | 24.9432   | 18.3506 |
| 1.6461        | 8.0   | 228640 | 1.8047          | 25.6912 | 10.9803 | 21.9881 | 25.1059   | 18.39   |
| 1.6942        | 9.0   | 257220 | 1.8004          | 25.7941 | 11.0952 | 22.1048 | 25.2158   | 18.3609 |
| 1.6389        | 10.0  | 285800 | 1.7971          | 25.8327 | 11.1257 | 22.1268 | 25.2338   | 18.3792 |
| 1.6152        | 11.0  | 314380 | 1.7964          | 25.7519 | 11.1059 | 22.1061 | 25.178    | 18.4059 |
| 1.6127        | 12.0  | 342960 | 1.7974          | 25.9198 | 11.218  | 22.2459 | 25.3411   | 18.3953 |
| 1.5946        | 13.0  | 371540 | 1.8020          | 26.0687 | 11.3053 | 22.3127 | 25.4836   | 18.4025 |
| 1.5988        | 14.0  | 400120 | 1.8034          | 25.9518 | 11.1943 | 22.233  | 25.3327   | 18.4376 |
| 1.5474        | 15.0  | 428700 | 1.8008          | 26.0176 | 11.2425 | 22.2723 | 25.4065   | 18.4397 |
| 1.5135        | 16.0  | 457280 | 1.7997          | 26.0409 | 11.2593 | 22.2739 | 25.4333   | 18.441  |
| 1.563         | 17.0  | 485860 | 1.8130          | 26.0385 | 11.2479 | 22.2757 | 25.4155   | 18.4556 |
| 1.4997        | 18.0  | 514440 | 1.8098          | 25.9907 | 11.2433 | 22.2378 | 25.3589   | 18.4048 |
| 1.4414        | 19.0  | 543020 | 1.8161          | 26.0156 | 11.209  | 22.2514 | 25.3623   | 18.4738 |
| 1.4487        | 20.0  | 571600 | 1.8128          | 26.0583 | 11.2856 | 22.2673 | 25.4279   | 18.4353 |
| 1.4434        | 21.0  | 600180 | 1.8189          | 25.9673 | 11.2448 | 22.1904 | 25.3287   | 18.448  |
| 1.4699        | 22.0  | 628760 | 1.8188          | 26.0581 | 11.288  | 22.2603 | 25.4347   | 18.4698 |
| 1.4282        | 23.0  | 657340 | 1.8235          | 25.9654 | 11.1782 | 22.2008 | 25.3327   | 18.4548 |
| 1.4411        | 24.0  | 685920 | 1.8265          | 26.1178 | 11.3101 | 22.3081 | 25.474    | 18.4547 |
| 1.3912        | 25.0  | 714500 | 1.8309          | 26.0667 | 11.2725 | 22.2863 | 25.4394   | 18.4705 |
| 1.4061        | 26.0  | 743080 | 1.8309          | 26.0472 | 11.2591 | 22.2589 | 25.4179   | 18.4803 |
| 1.4594        | 27.0  | 771660 | 1.8289          | 26.0164 | 11.2367 | 22.2239 | 25.3929   | 18.4811 |
| 1.3836        | 28.0  | 800240 | 1.8323          | 26.0416 | 11.2521 | 22.2303 | 25.4106   | 18.4734 |
| 1.4051        | 29.0  | 828820 | 1.8349          | 26.0081 | 11.2332 | 22.213  | 25.3822   | 18.4797 |
| 1.3833        | 30.0  | 857400 | 1.8346          | 26.0293 | 11.2397 | 22.2357 | 25.393    | 18.4771 |
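ROUGE scores on the 0-100 scale used in the table above are typically computed as in the sketch below; the card does not name its metric implementation, so the `evaluate` library and the example strings are assumptions:

```python
# ROUGE sketch matching the 0-100 scale reported in the results table.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["um resumo gerado pelo modelo"],
    references=["o resumo de referência escrito por uma pessoa"],
)
print({name: round(value * 100, 4) for name, value in scores.items()})
```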

Framework versions

  • Transformers 4.34.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.5
  • Tokenizers 0.14.1