---
license: mit
library_name: peft
tags:
  - generated_from_trainer
base_model: openai-community/gpt2
model-index:
  - name: >-
      GPT2_Pirate_2024_05_10_20_25_24_lora_weightTrue_loraR32_optim_adamw_torch_epoch2_lr3e-05
    results: []
---

# GPT2_Pirate_2024_05_10_20_25_24_lora_weightTrue_loraR32_optim_adamw_torch_epoch2_lr3e-05

This model is a fine-tuned version of [openai-community/gpt2](https://huggingface.co/openai-community/gpt2), trained as a LoRA adapter with the PEFT library on an unspecified dataset. It achieves the following results on the evaluation set (a hedged loading sketch follows the results):

- Loss: 1.7723
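
Because this repository contains a PEFT/LoRA adapter rather than full model weights, it is loaded on top of the `openai-community/gpt2` base model. Below is a minimal loading and generation sketch, assuming the adapter is published under a repo id such as `Nielzac/GPT2_Pirate_2024_05_10_20_25_24_lora_weightTrue_loraR32_optim_adamw_torch_epoch2_lr3e-05` (the exact repo id is an assumption; substitute a local path if needed):

```python
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

# Assumed repo id; replace with the actual adapter location or a local path.
adapter_id = "Nielzac/GPT2_Pirate_2024_05_10_20_25_24_lora_weightTrue_loraR32_optim_adamw_torch_epoch2_lr3e-05"

# AutoPeftModelForCausalLM reads the adapter config, loads the
# openai-community/gpt2 base weights, and attaches the LoRA adapter.
model = AutoPeftModelForCausalLM.from_pretrained(adapter_id)
tokenizer = AutoTokenizer.from_pretrained("openai-community/gpt2")

prompt = "Ahoy, matey!"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```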

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged configuration sketch follows the list):

- learning_rate: 3e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 2
- mixed_precision_training: Native AMP
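
A rough reconstruction of how these settings map onto `transformers.TrainingArguments` and a `peft.LoraConfig` is sketched below; the LoRA rank of 32 and the `adamw_torch` optimizer come from the model name, while the LoRA alpha, dropout, target modules, and output directory are assumptions rather than the original training script.

```python
from transformers import AutoModelForCausalLM, TrainingArguments
from peft import LoraConfig, get_peft_model

# LoRA rank 32 is taken from "loraR32" in the model name; alpha, dropout,
# and target modules are assumed values for a GPT-2 style model.
lora_config = LoraConfig(
    r=32,
    lora_alpha=32,              # assumption
    lora_dropout=0.05,          # assumption
    target_modules=["c_attn"],  # assumption: GPT-2 fused QKV projection
    task_type="CAUSAL_LM",
)

base_model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2")
model = get_peft_model(base_model, lora_config)

# These values mirror the hyperparameter list above.
training_args = TrainingArguments(
    output_dir="GPT2_Pirate",   # assumption
    learning_rate=3e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    lr_scheduler_type="linear",
    num_train_epochs=2,
    fp16=True,                  # Native AMP mixed precision
)
```

Passing `model`, `training_args`, and the (unspecified) dataset to a `transformers.Trainer` would then set up a comparable run.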

### Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 2.3874        | 0.0673 | 1000  | 1.9785          |
| 1.7367        | 0.1346 | 2000  | 1.8709          |
| 1.6208        | 0.2020 | 3000  | 1.8520          |
| 1.5521        | 0.2693 | 4000  | 1.8361          |
| 1.5165        | 0.3366 | 5000  | 1.8298          |
| 1.484         | 0.4039 | 6000  | 1.8227          |
| 1.4262        | 0.4712 | 7000  | 1.8078          |
| 1.4701        | 0.5385 | 8000  | 1.8000          |
| 1.4188        | 0.6059 | 9000  | 1.7938          |
| 1.403         | 0.6732 | 10000 | 1.7908          |
| 1.4115        | 0.7405 | 11000 | 1.7940          |
| 1.4153        | 0.8078 | 12000 | 1.7888          |
| 1.3903        | 0.8751 | 13000 | 1.7844          |
| 1.3918        | 0.9424 | 14000 | 1.7830          |
| 1.4018        | 1.0098 | 15000 | 1.7843          |
| 1.3579        | 1.0771 | 16000 | 1.7777          |
| 1.3803        | 1.1444 | 17000 | 1.7776          |
| 1.3545        | 1.2117 | 18000 | 1.7778          |
| 1.3557        | 1.2790 | 19000 | 1.7742          |
| 1.3739        | 1.3463 | 20000 | 1.7769          |
| 1.3538        | 1.4137 | 21000 | 1.7778          |
| 1.3761        | 1.4810 | 22000 | 1.7763          |
| 1.347         | 1.5483 | 23000 | 1.7735          |
| 1.3579        | 1.6156 | 24000 | 1.7729          |
| 1.3581        | 1.6829 | 25000 | 1.7734          |
| 1.3472        | 1.7503 | 26000 | 1.7726          |
| 1.3377        | 1.8176 | 27000 | 1.7752          |
| 1.3243        | 1.8849 | 28000 | 1.7732          |
| 1.369         | 1.9522 | 29000 | 1.7723          |
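
If the reported validation loss is the usual mean cross-entropy per token (an assumption; the card does not say), the final value of 1.7723 corresponds to a perplexity of roughly exp(1.7723) ≈ 5.9. A quick check:

```python
import math

# Perplexity is exp(mean cross-entropy loss), assuming the loss above
# is reported in nats per token.
print(round(math.exp(1.7723), 2))  # ~5.88
```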

### Framework versions

- PEFT 0.10.0
- Transformers 4.40.2
- Pytorch 2.3.0+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1