---
license: apache-2.0
tags:
  - generated_from_trainer
datasets:
  - shared_TaskA
metrics:
  - rouge
model-index:
  - name: flan-t5-base-dialogue
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: shared_TaskA
          type: shared_TaskA
          config: shared_TaskA
          split: train
          args: samsum
        metrics:
          - name: Rouge1
            type: rouge
            value: 28.1748
---

# flan-t5-base-sharedTaskA

This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on the shared_TaskA dataset. It achieves the following results on the evaluation set (an illustrative evaluation snippet follows the list):

- Loss: 2.5153
- Rouge1: 28.1748
- Rouge2: 14.384
- Rougel: 27.6673
- Rougelsum: 27.8465
- Gen Len: 18.85
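
The Rouge scores above are on a 0–100 scale. For reference, scores of this kind can be computed with the `evaluate` library; the snippet below is only an illustration, and the example texts are placeholders rather than data from shared_TaskA (note that `evaluate` returns fractions, so multiply by 100 to compare with the values above).

```python
# Illustrative ROUGE computation; the texts below are placeholders,
# not samples from the shared_TaskA dataset.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["Patient reports a mild headache lasting two days."]
references = ["The patient has had a mild headache for two days."]
scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # rouge1 / rouge2 / rougeL / rougeLsum as values in [0, 1]
```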

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (an equivalent `Seq2SeqTrainingArguments` setup is sketched after the list):

- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
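
The hyperparameters above correspond roughly to the following `Seq2SeqTrainingArguments`; the `output_dir` and `predict_with_generate` values are assumptions, not settings documented in this card.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-base-dialogue",  # assumed output directory
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=5,
    predict_with_generate=True,  # assumed; needed to compute ROUGE during evaluation
)
```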

### Training results

| Training Loss | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| No log        | 2.554769        | 27.7971 | 14.4710 | 27.4683 | 27.6170   | 18.97   |
| No log        | 2.515381        | 28.1748 | 14.3840 | 27.6673 | 27.8465   | 18.85   |
| No log        | 2.542737        | 27.9826 | 14.7540 | 27.5590 | 27.8342   | 18.80   |
| 1.8092        | 2.528819        | 28.0106 | 15.2683 | 27.8160 | 27.9990   | 18.69   |
| 1.8092        | 2.534979        | 28.1048 | 15.2480 | 27.8404 | 28.0695   | 18.67   |

## Example Uses

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the fine-tuned checkpoint from the Hugging Face Hub
tokenizer_pre = AutoTokenizer.from_pretrained("Amalq/flan-t5-dialogue")
model_pre = AutoModelForSeq2SeqLM.from_pretrained("Amalq/flan-t5-dialogue")
```
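
Once loaded, the model can be used to summarize a dialogue. The snippet below is only a sketch: the input dialogue, the prompt format, and the generation settings are illustrative assumptions, not values documented in this card.

```python
# Illustrative only: the dialogue text and max_new_tokens are assumptions.
dialogue = (
    "Doctor: What brings you in today?\n"
    "Patient: I've had a persistent cough for about two weeks."
)
inputs = tokenizer_pre(dialogue, return_tensors="pt", truncation=True)
summary_ids = model_pre.generate(**inputs, max_new_tokens=60)
print(tokenizer_pre.decode(summary_ids[0], skip_special_tokens=True))
```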