---
license: mit
base_model: arthurmluz/ptt5-wikilingua-30epochs
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: ptt5-wikilingua-cstnews-1024
    results: []
---

# ptt5-wikilingua-cstnews-1024

This model is a fine-tuned version of [arthurmluz/ptt5-wikilingua-30epochs](https://huggingface.co/arthurmluz/ptt5-wikilingua-30epochs) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 1.1062
- Rouge1: 0.2634
- Rouge2: 0.2049
- Rougel: 0.2461
- Rougelsum: 0.2619
- Gen Len: 18.871
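
The card does not include a usage snippet; the sketch below is a minimal example assuming the standard Transformers seq2seq API. The input text, truncation length, and generation settings are illustrative and not taken from the original training or evaluation setup.

```python
# Minimal inference sketch (assumed usage; not part of the original card).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "arthurmluz/ptt5-wikilingua-cstnews-1024"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Texto em português a ser resumido..."  # replace with a real document
inputs = tokenizer(text, max_length=1024, truncation=True, return_tensors="pt")

# Evaluation summaries average ~19 generated tokens (Gen Len 18.871 above);
# the generation settings here are illustrative assumptions.
summary_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```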

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
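
As a sketch only, the values listed above map onto `Seq2SeqTrainingArguments` roughly as follows; the `output_dir` and `predict_with_generate` settings are assumptions, since the original training script is not part of this card.

```python
# Sketch: the hyperparameters above expressed as Seq2SeqTrainingArguments.
# output_dir and predict_with_generate are assumptions, not from the card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="ptt5-wikilingua-cstnews-1024",
    learning_rate=2e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    num_train_epochs=30,
    lr_scheduler_type="linear",
    # Adam settings matching the values listed above (also the library defaults).
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    predict_with_generate=True,
)
```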

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| No log        | 1.0   | 47   | 1.3346          | 0.1918 | 0.1194 | 0.161  | 0.1857    | 18.6452 |
| No log        | 2.0   | 94   | 1.2279          | 0.2326 | 0.1626 | 0.1996 | 0.2285    | 18.871  |
| No log        | 3.0   | 141  | 1.1763          | 0.2424 | 0.1692 | 0.2117 | 0.2381    | 18.871  |
| No log        | 4.0   | 188  | 1.1496          | 0.2467 | 0.1761 | 0.2129 | 0.2347    | 18.871  |
| 1.727         | 5.0   | 235  | 1.1292          | 0.2579 | 0.188  | 0.2317 | 0.2515    | 18.871  |
| 1.727         | 6.0   | 282  | 1.1190          | 0.2516 | 0.1834 | 0.2256 | 0.2435    | 18.871  |
| 1.727         | 7.0   | 329  | 1.1059          | 0.2476 | 0.1806 | 0.2227 | 0.2404    | 18.871  |
| 1.727         | 8.0   | 376  | 1.0990          | 0.2468 | 0.1797 | 0.2217 | 0.2391    | 18.871  |
| 1.3202        | 9.0   | 423  | 1.0910          | 0.2587 | 0.1902 | 0.2339 | 0.2541    | 18.871  |
| 1.3202        | 10.0  | 470  | 1.0882          | 0.259  | 0.1908 | 0.2343 | 0.2545    | 18.871  |
| 1.3202        | 11.0  | 517  | 1.0885          | 0.2539 | 0.1886 | 0.2324 | 0.2505    | 18.871  |
| 1.3202        | 12.0  | 564  | 1.0883          | 0.2601 | 0.1957 | 0.2399 | 0.2582    | 18.871  |
| 1.1493        | 13.0  | 611  | 1.0881          | 0.2607 | 0.1962 | 0.2403 | 0.2583    | 18.871  |
| 1.1493        | 14.0  | 658  | 1.0866          | 0.2624 | 0.1976 | 0.2433 | 0.26      | 18.871  |
| 1.1493        | 15.0  | 705  | 1.0875          | 0.2641 | 0.2072 | 0.2477 | 0.2626    | 18.871  |
| 1.1493        | 16.0  | 752  | 1.0897          | 0.2641 | 0.2069 | 0.2477 | 0.2626    | 18.871  |
| 1.1493        | 17.0  | 799  | 1.0913          | 0.2641 | 0.2072 | 0.2477 | 0.2626    | 18.871  |
| 1.0308        | 18.0  | 846  | 1.0927          | 0.2634 | 0.2063 | 0.2474 | 0.2618    | 18.871  |
| 1.0308        | 19.0  | 893  | 1.0977          | 0.2634 | 0.2059 | 0.2473 | 0.2618    | 18.871  |
| 1.0308        | 20.0  | 940  | 1.0976          | 0.2634 | 0.2059 | 0.2473 | 0.2618    | 18.871  |
| 1.0308        | 21.0  | 987  | 1.0993          | 0.2632 | 0.2059 | 0.247  | 0.2616    | 18.871  |
| 0.9401        | 22.0  | 1034 | 1.1000          | 0.2634 | 0.2049 | 0.2461 | 0.2619    | 18.871  |
| 0.9401        | 23.0  | 1081 | 1.0997          | 0.2634 | 0.2049 | 0.2461 | 0.2619    | 18.871  |
| 0.9401        | 24.0  | 1128 | 1.1018          | 0.2634 | 0.2049 | 0.2461 | 0.2619    | 18.871  |
| 0.9401        | 25.0  | 1175 | 1.1045          | 0.2634 | 0.2049 | 0.2461 | 0.2619    | 18.871  |
| 0.8857        | 26.0  | 1222 | 1.1057          | 0.2634 | 0.2049 | 0.2461 | 0.2619    | 18.871  |
| 0.8857        | 27.0  | 1269 | 1.1061          | 0.2634 | 0.2049 | 0.2461 | 0.2619    | 18.871  |
| 0.8857        | 28.0  | 1316 | 1.1062          | 0.2634 | 0.2049 | 0.2461 | 0.2619    | 18.871  |
| 0.8857        | 29.0  | 1363 | 1.1061          | 0.2634 | 0.2049 | 0.2461 | 0.2619    | 18.871  |
| 0.8716        | 30.0  | 1410 | 1.1062          | 0.2634 | 0.2049 | 0.2461 | 0.2619    | 18.871  |

### Framework versions

- Transformers 4.34.0
- Pytorch 2.0.1+cu117
- Datasets 2.14.5
- Tokenizers 0.14.1