
PoliteT5Small

This model is a fine-tuned version of google/flan-t5-small on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 1.3505
  • Toxicity Ratio: 0.3158
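
The card does not include a usage snippet. Below is a minimal, hedged inference sketch using the transformers AutoClasses; the repo id is a placeholder (the published id is not stated here), and the politeness-rewriting prompt format is an assumption, since the prompt used during fine-tuning is not documented.

```python
# Minimal inference sketch. "your-username/PoliteT5Small" is a placeholder
# repo id, and the "Make this polite:" prompt is an assumption; the card
# does not document the prompt format used during fine-tuning.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "your-username/PoliteT5Small"  # placeholder, not a confirmed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("Make this polite: Give me the report now.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```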

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.01
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 75
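
For orientation, here is a sketch of how these values might map onto Seq2SeqTrainingArguments in Transformers 4.28. The dataset, tokenization, collator, and metric wiring are not documented in this card, so the sketch only mirrors the listed settings; the output path and evaluation cadence are assumptions.

```python
# Sketch only: mirrors the hyperparameters listed above. Dataset, collator,
# and metric setup are undocumented in this card and therefore omitted.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="polite-t5-small",    # placeholder path
    learning_rate=1e-2,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    lr_scheduler_type="linear",      # Adam defaults: betas=(0.9, 0.999), eps=1e-8
    num_train_epochs=75,
    evaluation_strategy="epoch",     # assumption: the results table logs one eval per epoch
)
```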

Training results

| Training Loss | Epoch | Step | Validation Loss | Toxicity Ratio |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|
| No log        | 1.0   | 22   | 0.6642          | 0.3158         |
| No log        | 2.0   | 44   | 0.6347          | 0.3158         |
| 0.9343        | 3.0   | 66   | 0.6623          | 0.3158         |
| 0.9343        | 4.0   | 88   | 0.6737          | 0.3070         |
| 0.3783        | 5.0   | 110  | 0.7201          | 0.2982         |
| 0.3783        | 6.0   | 132  | 0.7606          | 0.3596         |
| 0.2536        | 7.0   | 154  | 0.7567          | 0.2807         |
| 0.2536        | 8.0   | 176  | 0.8618          | 0.3070         |
| 0.2536        | 9.0   | 198  | 0.8444          | 0.3158         |
| 0.1839        | 10.0  | 220  | 0.8257          | 0.3333         |
| 0.1839        | 11.0  | 242  | 0.8643          | 0.3158         |
| 0.1246        | 12.0  | 264  | 0.8334          | 0.3421         |
| 0.1246        | 13.0  | 286  | 0.8895          | 0.3246         |
| 0.1042        | 14.0  | 308  | 0.9631          | 0.2982         |
| 0.1042        | 15.0  | 330  | 0.9004          | 0.3070         |
| 0.0929        | 16.0  | 352  | 0.8878          | 0.2982         |
| 0.0929        | 17.0  | 374  | 0.9009          | 0.2982         |
| 0.0929        | 18.0  | 396  | 0.9762          | 0.3158         |
| 0.0745        | 19.0  | 418  | 0.9296          | 0.2982         |
| 0.0745        | 20.0  | 440  | 0.9429          | 0.3246         |
| 0.0668        | 21.0  | 462  | 0.9779          | 0.3158         |
| 0.0668        | 22.0  | 484  | 0.9731          | 0.2982         |
| 0.0494        | 23.0  | 506  | 0.9640          | 0.3158         |
| 0.0494        | 24.0  | 528  | 0.9984          | 0.2982         |
| 0.0425        | 25.0  | 550  | 0.9966          | 0.3070         |
| 0.0425        | 26.0  | 572  | 0.9861          | 0.3246         |
| 0.0425        | 27.0  | 594  | 1.0335          | 0.3333         |
| 0.0432        | 28.0  | 616  | 1.0358          | 0.2982         |
| 0.0432        | 29.0  | 638  | 1.0244          | 0.3158         |
| 0.0328        | 30.0  | 660  | 1.0050          | 0.3158         |
| 0.0328        | 31.0  | 682  | 0.9838          | 0.2982         |
| 0.0277        | 32.0  | 704  | 1.0576          | 0.3158         |
| 0.0277        | 33.0  | 726  | 1.0719          | 0.3070         |
| 0.0277        | 34.0  | 748  | 1.0851          | 0.3246         |
| 0.0194        | 35.0  | 770  | 0.9992          | 0.3246         |
| 0.0194        | 36.0  | 792  | 1.1454          | 0.3333         |
| 0.0145        | 37.0  | 814  | 1.1179          | 0.3158         |
| 0.0145        | 38.0  | 836  | 1.0586          | 0.3158         |
| 0.0157        | 39.0  | 858  | 1.0638          | 0.3333         |
| 0.0157        | 40.0  | 880  | 1.1544          | 0.3333         |
| 0.0114        | 41.0  | 902  | 1.1529          | 0.2895         |
| 0.0114        | 42.0  | 924  | 1.2017          | 0.3246         |
| 0.0114        | 43.0  | 946  | 1.0783          | 0.3333         |
| 0.0096        | 44.0  | 968  | 1.1984          | 0.3333         |
| 0.0096        | 45.0  | 990  | 1.1839          | 0.3158         |
| 0.0094        | 46.0  | 1012 | 1.1178          | 0.3246         |
| 0.0094        | 47.0  | 1034 | 1.2424          | 0.3070         |
| 0.0065        | 48.0  | 1056 | 1.1740          | 0.3158         |
| 0.0065        | 49.0  | 1078 | 0.9860          | 0.3070         |
| 0.0081        | 50.0  | 1100 | 1.2554          | 0.3333         |
| 0.0081        | 51.0  | 1122 | 1.2024          | 0.2895         |
| 0.0081        | 52.0  | 1144 | 1.2440          | 0.2807         |
| 0.0035        | 53.0  | 1166 | 1.2392          | 0.3070         |
| 0.0035        | 54.0  | 1188 | 1.3189          | 0.3070         |
| 0.0033        | 55.0  | 1210 | 1.2635          | 0.2895         |
| 0.0033        | 56.0  | 1232 | 1.2367          | 0.2982         |
| 0.0033        | 57.0  | 1254 | 1.2691          | 0.3070         |
| 0.0033        | 58.0  | 1276 | 1.2762          | 0.3070         |
| 0.0033        | 59.0  | 1298 | 1.2492          | 0.2982         |
| 0.0021        | 60.0  | 1320 | 1.2530          | 0.3070         |
| 0.0021        | 61.0  | 1342 | 1.2754          | 0.3158         |
| 0.002         | 62.0  | 1364 | 1.3817          | 0.3070         |
| 0.002         | 63.0  | 1386 | 1.3887          | 0.3158         |
| 0.0016        | 64.0  | 1408 | 1.3172          | 0.3246         |
| 0.0016        | 65.0  | 1430 | 1.3481          | 0.3158         |
| 0.0023        | 66.0  | 1452 | 1.3109          | 0.3246         |
| 0.0023        | 67.0  | 1474 | 1.2907          | 0.3246         |
| 0.0023        | 68.0  | 1496 | 1.2926          | 0.3246         |
| 0.0014        | 69.0  | 1518 | 1.3122          | 0.3158         |
| 0.0014        | 70.0  | 1540 | 1.3354          | 0.3158         |
| 0.0008        | 71.0  | 1562 | 1.3440          | 0.3158         |
| 0.0008        | 72.0  | 1584 | 1.3367          | 0.3158         |
| 0.0011        | 73.0  | 1606 | 1.3452          | 0.3158         |
| 0.0011        | 74.0  | 1628 | 1.3514          | 0.3158         |
| 0.0011        | 75.0  | 1650 | 1.3505          | 0.3158         |
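
The card does not document how the toxicity ratio was computed. One common way to produce such a number is the evaluate library's "toxicity" measurement, which, with aggregation="ratio", reports the fraction of texts whose toxicity score exceeds 0.5; the sketch below assumes that approach, and the example generations are illustrative only.

```python
# Hedged sketch: the card does not say how "Toxicity Ratio" was measured.
# The evaluate library's "toxicity" measurement (backed by a RoBERTa
# hate-speech classifier) can report the fraction of texts scoring > 0.5.
import evaluate

toxicity = evaluate.load("toxicity", module_type="measurement")
generations = [
    "Could you please send me the report when you have a moment?",
    "I would appreciate it if you kept the noise down.",
]
result = toxicity.compute(predictions=generations, aggregation="ratio")
print(result["toxicity_ratio"])  # fraction of generations with toxicity > 0.5
```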

Framework versions

  • Transformers 4.28.0
  • Pytorch 2.0.0
  • Datasets 2.11.0
  • Tokenizers 0.13.3