---
license: mit
library_name: peft
tags:
  - generated_from_trainer
base_model: openai-community/gpt2
model-index:
  - name: >-
      GPT2_Pirate_2024_05_10_20_25_24_lora_weightTrue_loraR32_optim_adamw_torch_epoch2_lr3e-05
    results: []
---

# GPT2_Pirate_2024_05_10_20_25_24_lora_weightTrue_loraR32_optim_adamw_torch_epoch2_lr3e-05

This model is a fine-tuned version of [openai-community/gpt2](https://huggingface.co/openai-community/gpt2), trained as a LoRA adapter with the PEFT library on an unspecified dataset. It achieves the following results on the evaluation set (a hedged loading sketch follows the results):

- Loss: 1.7723
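
Because this repository contains a PEFT/LoRA adapter rather than full model weights, it is loaded on top of the `openai-community/gpt2` base model. Below is a minimal loading and generation sketch, assuming the adapter is published under a repo id such as `Nielzac/GPT2_Pirate_2024_05_10_20_25_24_lora_weightTrue_loraR32_optim_adamw_torch_epoch2_lr3e-05` (the exact repo id is an assumption; substitute a local path if needed):

```python
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

# Assumed repo id; replace with the actual adapter location or a local path.
adapter_id = "Nielzac/GPT2_Pirate_2024_05_10_20_25_24_lora_weightTrue_loraR32_optim_adamw_torch_epoch2_lr3e-05"

# AutoPeftModelForCausalLM reads the adapter config, loads the
# openai-community/gpt2 base weights, and attaches the LoRA adapter.
model = AutoPeftModelForCausalLM.from_pretrained(adapter_id)
tokenizer = AutoTokenizer.from_pretrained("openai-community/gpt2")

prompt = "Ahoy, matey!"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```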

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged configuration sketch follows the list):

- learning_rate: 3e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 2
- mixed_precision_training: Native AMP
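
A rough reconstruction of how these settings map onto `transformers.TrainingArguments` and a `peft.LoraConfig` is sketched below; the LoRA rank of 32 and the `adamw_torch` optimizer come from the model name, while the LoRA alpha, dropout, target modules, and output directory are assumptions rather than the original training script.

```python
from transformers import AutoModelForCausalLM, TrainingArguments
from peft import LoraConfig, get_peft_model

# LoRA rank 32 is taken from "loraR32" in the model name; alpha, dropout,
# and target modules are assumed values for a GPT-2 style model.
lora_config = LoraConfig(
    r=32,
    lora_alpha=32,              # assumption
    lora_dropout=0.05,          # assumption
    target_modules=["c_attn"],  # assumption: GPT-2 fused QKV projection
    task_type="CAUSAL_LM",
)

base_model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2")
model = get_peft_model(base_model, lora_config)

# These values mirror the hyperparameter list above.
training_args = TrainingArguments(
    output_dir="GPT2_Pirate",   # assumption
    learning_rate=3e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    lr_scheduler_type="linear",
    num_train_epochs=2,
    fp16=True,                  # Native AMP mixed precision
)
```

Passing `model`, `training_args`, and the (unspecified) dataset to a `transformers.Trainer` would then set up a comparable run.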

### Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 2.3874        | 0.0673 | 1000  | 1.9785          |
| 1.7367        | 0.1346 | 2000  | 1.8709          |
| 1.6208        | 0.2020 | 3000  | 1.8520          |
| 1.5521        | 0.2693 | 4000  | 1.8361          |
| 1.5165        | 0.3366 | 5000  | 1.8298          |
| 1.484         | 0.4039 | 6000  | 1.8227          |
| 1.4262        | 0.4712 | 7000  | 1.8078          |
| 1.4701        | 0.5385 | 8000  | 1.8000          |
| 1.4188        | 0.6059 | 9000  | 1.7938          |
| 1.403         | 0.6732 | 10000 | 1.7908          |
| 1.4115        | 0.7405 | 11000 | 1.7940          |
| 1.4153        | 0.8078 | 12000 | 1.7888          |
| 1.3903        | 0.8751 | 13000 | 1.7844          |
| 1.3918        | 0.9424 | 14000 | 1.7830          |
| 1.4018        | 1.0098 | 15000 | 1.7843          |
| 1.3579        | 1.0771 | 16000 | 1.7777          |
| 1.3803        | 1.1444 | 17000 | 1.7776          |
| 1.3545        | 1.2117 | 18000 | 1.7778          |
| 1.3557        | 1.2790 | 19000 | 1.7742          |
| 1.3739        | 1.3463 | 20000 | 1.7769          |
| 1.3538        | 1.4137 | 21000 | 1.7778          |
| 1.3761        | 1.4810 | 22000 | 1.7763          |
| 1.347         | 1.5483 | 23000 | 1.7735          |
| 1.3579        | 1.6156 | 24000 | 1.7729          |
| 1.3581        | 1.6829 | 25000 | 1.7734          |
| 1.3472        | 1.7503 | 26000 | 1.7726          |
| 1.3377        | 1.8176 | 27000 | 1.7752          |
| 1.3243        | 1.8849 | 28000 | 1.7732          |
| 1.369         | 1.9522 | 29000 | 1.7723          |
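
If the reported validation loss is the usual mean cross-entropy per token (an assumption; the card does not say), the final value of 1.7723 corresponds to a perplexity of roughly exp(1.7723) ≈ 5.9. A quick check:

```python
import math

# Perplexity is exp(mean cross-entropy loss), assuming the loss above
# is reported in nats per token.
print(round(math.exp(1.7723), 2))  # ~5.88
```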

### Framework versions

- PEFT 0.10.0
- Transformers 4.40.2
- Pytorch 2.3.0+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1