ptt5-cstnews-1024

This model is a fine-tuned version of unicamp-dl/ptt5-base-portuguese-vocab. The training dataset is not specified in this card. After the final training epoch (epoch 30), it achieves the following results on the evaluation set:

Loss: 2.4269
Rouge1: 0.086
Rouge2: 0.0557
Rougel: 0.0758
Rougelsum: 0.0833
Gen Len: 19.0
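
The constant Gen Len of 19.0 suggests that evaluation summaries were likely cut off by the default generation length limit of 20 tokens in Transformers, which may also contribute to the low ROUGE scores. The following is a minimal sketch of loading the model and generating longer summaries. The generation settings (`max_new_tokens`, `num_beams`), the helper name `summarize`, and the 1024-token input limit (guessed from the model name) are assumptions, not values documented in this card. Whether the model expects a task prefix is also not documented.

```python
MODEL_ID = "arthurmluz/ptt5-cstnews-1024"  # repository name as listed on the Hub page

# Assumed generation settings; these are NOT taken from the model card.
GENERATION_KWARGS = {"max_new_tokens": 128, "num_beams": 4}


def summarize(text: str, max_input_tokens: int = 1024) -> str:
    """Generate a summary for one Portuguese document (hypothetical helper)."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate long inputs; the 1024 limit is a guess based on the model name.
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_tokens,
    )
    output_ids = model.generate(**inputs, **GENERATION_KWARGS)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize("...")` with a news article downloads the weights on first use. The model is reloaded on every call here for brevity; in practice, load the tokenizer and model once and reuse them.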

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 2
eval_batch_size: 2
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 30
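
As a rough sketch, the hyperparameters above can be written as keyword arguments for `Seq2SeqTrainingArguments`, and the linear schedule can be reproduced by hand. This assumes that `train_batch_size` is the per-device batch size and that no warmup was used (the card lists none); the 88 steps per epoch are taken from the training results table. The names `HPARAMS` and `linear_lr` are illustrative only.

```python
# Hyperparameters from the list above, keyed by TrainingArguments parameter names.
HPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 2,  # assumes "train_batch_size" is per device
    "per_device_eval_batch_size": 2,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 30,
}

STEPS_PER_EPOCH = 88  # from the training results table
TOTAL_STEPS = STEPS_PER_EPOCH * HPARAMS["num_train_epochs"]  # 88 * 30 = 2640


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    base = HPARAMS["learning_rate"]
    if step < warmup_steps:
        return base * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return base * remaining / max(1, TOTAL_STEPS - warmup_steps)
```

With a batch size of 2 and 88 steps per epoch, the training set would hold at most 176 examples, assuming a single device and no gradient accumulation. The dictionary could be passed as `Seq2SeqTrainingArguments(output_dir=..., **HPARAMS)`.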

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
No log	1.0	88	3.0796	0.0605	0.0381	0.0539	0.0582	19.0
No log	2.0	176	2.7409	0.0841	0.0507	0.0737	0.0817	19.0
3.8249	3.0	264	2.6518	0.0866	0.0508	0.0747	0.0833	19.0
3.8249	4.0	352	2.5961	0.0869	0.0526	0.0758	0.0839	19.0
2.7351	5.0	440	2.5584	0.0869	0.0539	0.0759	0.0841	19.0
2.7351	6.0	528	2.5364	0.0858	0.0521	0.0743	0.0826	19.0
2.5802	7.0	616	2.5092	0.0847	0.0516	0.0736	0.0815	19.0
2.5802	8.0	704	2.5026	0.0855	0.055	0.075	0.0827	19.0
2.5802	9.0	792	2.4862	0.0852	0.0551	0.0749	0.0825	19.0
2.4864	10.0	880	2.4744	0.0853	0.0553	0.0751	0.0826	19.0
2.4864	11.0	968	2.4676	0.0871	0.0561	0.0764	0.0843	19.0
2.4328	12.0	1056	2.4627	0.0865	0.0561	0.0763	0.0837	19.0
2.4328	13.0	1144	2.4566	0.0877	0.0562	0.0765	0.0846	19.0
2.3615	14.0	1232	2.4495	0.0869	0.0559	0.0761	0.0842	19.0
2.3615	15.0	1320	2.4439	0.0869	0.0559	0.0761	0.0842	19.0
2.2926	16.0	1408	2.4447	0.0869	0.0559	0.0761	0.0842	19.0
2.2926	17.0	1496	2.4437	0.0866	0.0555	0.0759	0.0839	19.0
2.2926	18.0	1584	2.4345	0.0862	0.0557	0.076	0.0834	19.0
2.2657	19.0	1672	2.4342	0.0871	0.056	0.0764	0.0843	19.0
2.2657	20.0	1760	2.4328	0.0871	0.056	0.0764	0.0843	19.0
2.2425	21.0	1848	2.4317	0.0863	0.0558	0.0761	0.0836	19.0
2.2425	22.0	1936	2.4311	0.0863	0.0558	0.0761	0.0836	19.0
2.2338	23.0	2024	2.4292	0.0863	0.0558	0.0761	0.0836	19.0
2.2338	24.0	2112	2.4268	0.0861	0.056	0.0759	0.0836	19.0
2.203	25.0	2200	2.4270	0.0857	0.0559	0.0756	0.0833	19.0
2.203	26.0	2288	2.4290	0.0857	0.0557	0.0756	0.0833	19.0
2.203	27.0	2376	2.4272	0.0857	0.0557	0.0756	0.0833	19.0
2.1676	28.0	2464	2.4265	0.0857	0.0557	0.0756	0.0833	19.0
2.1676	29.0	2552	2.4273	0.086	0.0557	0.0758	0.0833	19.0
2.1984	30.0	2640	2.4269	0.086	0.0557	0.0758	0.0833	19.0
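
The headline results at the top of this card match the final row (epoch 30), which is not the best row on every metric. The sketch below copies the Validation Loss and Rouge1 columns from the table and finds the best epoch for each; the variable names are illustrative.

```python
# Validation loss per epoch (epochs 1..30), copied from the table above.
VAL_LOSS = [
    3.0796, 2.7409, 2.6518, 2.5961, 2.5584, 2.5364, 2.5092, 2.5026, 2.4862, 2.4744,
    2.4676, 2.4627, 2.4566, 2.4495, 2.4439, 2.4447, 2.4437, 2.4345, 2.4342, 2.4328,
    2.4317, 2.4311, 2.4292, 2.4268, 2.4270, 2.4290, 2.4272, 2.4265, 2.4273, 2.4269,
]

# Rouge1 per epoch (epochs 1..30), copied from the table above.
ROUGE1 = [
    0.0605, 0.0841, 0.0866, 0.0869, 0.0869, 0.0858, 0.0847, 0.0855, 0.0852, 0.0853,
    0.0871, 0.0865, 0.0877, 0.0869, 0.0869, 0.0869, 0.0866, 0.0862, 0.0871, 0.0871,
    0.0863, 0.0863, 0.0863, 0.0861, 0.0857, 0.0857, 0.0857, 0.0857, 0.0860, 0.0860,
]

# Epochs are 1-indexed, list positions are 0-indexed.
best_loss_epoch = min(range(len(VAL_LOSS)), key=VAL_LOSS.__getitem__) + 1
best_rouge1_epoch = max(range(len(ROUGE1)), key=ROUGE1.__getitem__) + 1

print(f"lowest validation loss: epoch {best_loss_epoch} ({VAL_LOSS[best_loss_epoch - 1]})")
print(f"highest Rouge1:         epoch {best_rouge1_epoch} ({ROUGE1[best_rouge1_epoch - 1]})")
```

The lowest validation loss is at epoch 28 (2.4265) and the highest Rouge1 at epoch 13 (0.0877), though both differ only slightly from the final epoch. If intermediate checkpoints were saved, the `load_best_model_at_end` and `metric_for_best_model` training arguments in Transformers are one way to keep the best one; whether they were used here is not stated.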

Framework versions

Transformers 4.34.0
PyTorch 2.0.1+cu117
Datasets 2.14.5
Tokenizers 0.14.1

Repository: arthurmluz/ptt5-cstnews-1024
