
flan-t5-rouge-durga-q5-clean-4f

This model is a fine-tuned version of google/flan-t5-base on an unspecified dataset. It achieves the following results on the evaluation set (a loading and inference sketch follows the metrics):

  • Loss: 0.0021
  • Rouge1: 0.7371
  • Rouge2: 0.7114
  • RougeL: 0.7373
  • RougeLsum: 0.7377
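
Since the architecture is unchanged from google/flan-t5-base, the checkpoint loads with the standard Transformers seq2seq classes. A minimal inference sketch follows; the prompt is a placeholder, since the task and prompt format of this fine-tune are not documented:

```python
# Minimal inference sketch. The repo id is taken from this card; the prompt
# is a hypothetical placeholder, as the training task is not documented.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "devagonal/flan-t5-rouge-durga-q5-clean-4f"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("Who is Durga?", return_tensors="pt")  # placeholder prompt
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```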

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 24
  • eval_batch_size: 24
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 60
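
For reference, these settings map onto the Transformers Trainer API roughly as below. This is a hedged reconstruction, not the original training script; the dataset, tokenization, and collator wiring are omitted because the training data is unspecified.

```python
# Hedged reconstruction of the hyperparameters listed above using
# Seq2SeqTrainingArguments; the original training script is not published.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-rouge-durga-q5-clean-4f",
    learning_rate=3e-4,              # 0.0003
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60,
    predict_with_generate=True,      # assumption: needed for ROUGE during eval
)
```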

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
| 2.0584 | 1.0 | 9 | 1.6093 | 0.2821 | 0.0871 | 0.2751 | 0.2760 |
| 1.9958 | 2.0 | 18 | 1.1569 | 0.3267 | 0.1036 | 0.3184 | 0.3195 |
| 1.174 | 3.0 | 27 | 0.8836 | 0.3765 | 0.1667 | 0.3660 | 0.3668 |
| 1.1673 | 4.0 | 36 | 0.6420 | 0.3653 | 0.1586 | 0.3574 | 0.3582 |
| 1.0302 | 5.0 | 45 | 0.4727 | 0.3987 | 0.2228 | 0.3942 | 0.3944 |
| 0.6135 | 6.0 | 54 | 0.3187 | 0.4170 | 0.2446 | 0.4106 | 0.4112 |
| 0.5838 | 7.0 | 63 | 0.2294 | 0.4530 | 0.2996 | 0.4465 | 0.4472 |
| 0.4479 | 8.0 | 72 | 0.1891 | 0.4614 | 0.3185 | 0.4572 | 0.4574 |
| 0.3936 | 9.0 | 81 | 0.1373 | 0.4651 | 0.3179 | 0.4619 | 0.4622 |
| 0.3307 | 10.0 | 90 | 0.1073 | 0.5070 | 0.3895 | 0.5066 | 0.5076 |
| 0.3624 | 11.0 | 99 | 0.0845 | 0.5060 | 0.3903 | 0.5062 | 0.5063 |
| 0.1817 | 12.0 | 108 | 0.0702 | 0.5443 | 0.4428 | 0.5447 | 0.5450 |
| 0.2335 | 13.0 | 117 | 0.0705 | 0.5125 | 0.4081 | 0.5116 | 0.5119 |
| 0.1604 | 14.0 | 126 | 0.0650 | 0.5452 | 0.4443 | 0.5461 | 0.5451 |
| 0.1306 | 15.0 | 135 | 0.0540 | 0.5463 | 0.4521 | 0.5474 | 0.5474 |
| 0.1194 | 16.0 | 144 | 0.0489 | 0.5922 | 0.5120 | 0.5932 | 0.5917 |
| 0.2133 | 17.0 | 153 | 0.0441 | 0.5739 | 0.4873 | 0.5728 | 0.5737 |
| 0.1035 | 18.0 | 162 | 0.0425 | 0.5791 | 0.4981 | 0.5784 | 0.5789 |
| 0.1049 | 19.0 | 171 | 0.0333 | 0.6326 | 0.5635 | 0.6334 | 0.6332 |
| 0.1165 | 20.0 | 180 | 0.0287 | 0.6387 | 0.5769 | 0.6380 | 0.6388 |
| 0.1197 | 21.0 | 189 | 0.0300 | 0.5980 | 0.5240 | 0.5990 | 0.5998 |
| 0.0607 | 22.0 | 198 | 0.0245 | 0.6445 | 0.5833 | 0.6455 | 0.6451 |
| 0.1443 | 23.0 | 207 | 0.0238 | 0.6438 | 0.5828 | 0.6456 | 0.6462 |
| 0.0727 | 24.0 | 216 | 0.0188 | 0.6747 | 0.6253 | 0.6774 | 0.6764 |
| 0.0462 | 25.0 | 225 | 0.0177 | 0.6914 | 0.6391 | 0.6921 | 0.6912 |
| 0.0804 | 26.0 | 234 | 0.0132 | 0.6967 | 0.6520 | 0.6967 | 0.6985 |
| 0.0337 | 27.0 | 243 | 0.0135 | 0.6955 | 0.6475 | 0.6961 | 0.6961 |
| 0.0459 | 28.0 | 252 | 0.0131 | 0.7002 | 0.6584 | 0.7019 | 0.7020 |
| 0.0233 | 29.0 | 261 | 0.0102 | 0.7074 | 0.6665 | 0.7080 | 0.7095 |
| 0.0228 | 30.0 | 270 | 0.0112 | 0.7040 | 0.6644 | 0.7044 | 0.7052 |
| 0.0435 | 31.0 | 279 | 0.0080 | 0.7115 | 0.6724 | 0.7119 | 0.7123 |
| 0.0364 | 32.0 | 288 | 0.0114 | 0.7082 | 0.6666 | 0.7100 | 0.7095 |
| 0.0112 | 33.0 | 297 | 0.0086 | 0.7165 | 0.6787 | 0.7177 | 0.7174 |
| 0.0325 | 34.0 | 306 | 0.0068 | 0.7251 | 0.6931 | 0.7262 | 0.7262 |
| 0.0173 | 35.0 | 315 | 0.0052 | 0.7310 | 0.7015 | 0.7315 | 0.7319 |
| 0.0599 | 36.0 | 324 | 0.0058 | 0.7276 | 0.6972 | 0.7289 | 0.7291 |
| 0.0125 | 37.0 | 333 | 0.0044 | 0.7328 | 0.7057 | 0.7331 | 0.7332 |
| 0.0155 | 38.0 | 342 | 0.0054 | 0.7218 | 0.6882 | 0.7227 | 0.7234 |
| 0.0199 | 39.0 | 351 | 0.0050 | 0.7275 | 0.6965 | 0.7287 | 0.7292 |
| 0.0109 | 40.0 | 360 | 0.0035 | 0.7334 | 0.7064 | 0.7339 | 0.7347 |
| 0.0229 | 41.0 | 369 | 0.0034 | 0.7334 | 0.7064 | 0.7339 | 0.7347 |
| 0.0353 | 42.0 | 378 | 0.0033 | 0.7334 | 0.7064 | 0.7339 | 0.7347 |
| 0.0124 | 43.0 | 387 | 0.0035 | 0.7352 | 0.7084 | 0.7357 | 0.7354 |
| 0.0147 | 44.0 | 396 | 0.0033 | 0.7319 | 0.7036 | 0.7322 | 0.7327 |
| 0.0055 | 45.0 | 405 | 0.0032 | 0.7310 | 0.7026 | 0.7312 | 0.7320 |
| 0.0183 | 46.0 | 414 | 0.0031 | 0.7371 | 0.7114 | 0.7373 | 0.7377 |
| 0.004 | 47.0 | 423 | 0.0033 | 0.7342 | 0.7067 | 0.7344 | 0.7349 |
| 0.0195 | 48.0 | 432 | 0.0032 | 0.7311 | 0.7018 | 0.7318 | 0.7323 |
| 0.0112 | 49.0 | 441 | 0.0031 | 0.7371 | 0.7114 | 0.7373 | 0.7377 |
| 0.0186 | 50.0 | 450 | 0.0029 | 0.7371 | 0.7114 | 0.7373 | 0.7377 |
| 0.0043 | 51.0 | 459 | 0.0028 | 0.7371 | 0.7114 | 0.7373 | 0.7377 |
| 0.011 | 52.0 | 468 | 0.0023 | 0.7371 | 0.7114 | 0.7373 | 0.7377 |
| 0.0203 | 53.0 | 477 | 0.0021 | 0.7371 | 0.7114 | 0.7373 | 0.7377 |
| 0.0099 | 54.0 | 486 | 0.0021 | 0.7367 | 0.7113 | 0.7367 | 0.7376 |
| 0.0095 | 55.0 | 495 | 0.0021 | 0.7371 | 0.7114 | 0.7373 | 0.7377 |
| 0.021 | 56.0 | 504 | 0.0021 | 0.7371 | 0.7114 | 0.7373 | 0.7377 |
| 0.0191 | 57.0 | 513 | 0.0022 | 0.7371 | 0.7114 | 0.7373 | 0.7377 |
| 0.0033 | 58.0 | 522 | 0.0021 | 0.7371 | 0.7114 | 0.7373 | 0.7377 |
| 0.0264 | 59.0 | 531 | 0.0021 | 0.7371 | 0.7114 | 0.7373 | 0.7377 |
| 0.0034 | 60.0 | 540 | 0.0021 | 0.7371 | 0.7114 | 0.7373 | 0.7377 |
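
The ROUGE columns above are the standard rouge1/rouge2/rougeL/rougeLsum scores. A minimal sketch of computing them with the Hugging Face evaluate library (the strings are placeholders, not actual model outputs):

```python
# Minimal ROUGE sketch using the `evaluate` library; the prediction and
# reference strings are hypothetical placeholders.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["the goddess durga rides a lion"],          # hypothetical output
    references=["the goddess durga rides a lion or tiger"],  # hypothetical gold
)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum
```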

Framework versions

  • Transformers 4.46.2
  • Pytorch 2.5.0+cu121
  • Datasets 3.1.0
  • Tokenizers 0.20.3
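
A quick way to confirm a local environment matches these versions (assumes the four packages are installed):

```python
# Print installed versions to compare against the list above.
import datasets
import tokenizers
import torch
import transformers

print("Transformers:", transformers.__version__)  # expected 4.46.2
print("PyTorch:", torch.__version__)              # expected 2.5.0+cu121
print("Datasets:", datasets.__version__)          # expected 3.1.0
print("Tokenizers:", tokenizers.__version__)      # expected 0.20.3
```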