metadata
license: apache-2.0
base_model: facebook/bart-base
tags:
  - generated_from_trainer
metrics:
  - wer
  - bleu
model-index:
  - name: bart-kurd-spell-base-sn
    results: []

bart-kurd-spell-base-sn

This model is a fine-tuned version of facebook/bart-base. The training dataset is not specified in this card. It achieves the following results on the evaluation set:

  • Loss: 0.4815
  • CER (character error rate): 2.1669
  • WER (word error rate): 12.1294
  • BLEU: 78.2542
  • chrF: 95.7354
  • Gen Len (mean generated length): 16.7779
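The card does not say which implementation produced the error-rate figures above. As a rough illustration of what CER and WER measure, the sketch below computes a corpus-level error rate from Levenshtein edit distance. The function names `edit_distance` and `error_rate`, the percent scaling, and the English sample strings are assumptions made for this example, not details taken from the training code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (each edit costs 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1]


def error_rate(references, hypotheses, unit="word"):
    """Corpus-level error rate in percent: total edits / total reference units.

    unit="word" gives a WER-style score, unit="char" a CER-style score.
    Scaling by 100 is an assumption about how the card reports its numbers.
    """
    split = (lambda s: s.split()) if unit == "word" else list
    edits = sum(edit_distance(split(r), split(h))
                for r, h in zip(references, hypotheses))
    total = sum(len(split(r)) for r in references)
    return 100.0 * edits / total


if __name__ == "__main__":
    # Placeholder strings for illustration only, not model output.
    refs = ["the cat sat"]
    hyps = ["the cat sit"]
    print("WER-style:", error_rate(refs, hyps, unit="word"))
    print("CER-style:", error_rate(refs, hyps, unit="char"))
```

A single wrong character costs one word out of three at the word level but only one character out of eleven at the character level, which is why a spelling-correction model can show a CER far below its WER.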

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5.0
  • mixed_precision_training: Native AMP
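To make the `lr_scheduler_type: linear` setting concrete, here is a minimal sketch of a linear decay schedule using the learning rate listed above and the final step count from the training results (889485). The card lists no warmup setting, so `WARMUP_STEPS = 0` is an assumption, and `linear_lr` is a hypothetical helper written for this example rather than a library function.

```python
BASE_LR = 5e-05          # learning_rate from the hyperparameter list
TOTAL_STEPS = 889485     # last step reported in the training results
WARMUP_STEPS = 0         # assumption: the card does not list a warmup setting


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate at `step` under linear warmup followed by linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the rate at the end of each of the 5 epochs (177897 steps each).
    for epoch in range(0, 6):
        step = epoch * 177897
        print(f"epoch {epoch}: step {step:>6}  lr {linear_lr(step):.2e}")
```

Under these assumptions the rate drops by one fifth of its initial value per epoch and reaches zero at the last step.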

Training results

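| Training Loss | Epoch | Step   | Validation Loss | CER    | WER     | BLEU    | chrF    | Gen Len |
|---------------|-------|--------|-----------------|--------|---------|---------|---------|---------|
| 0.2972        | 1.0   | 177897 | 0.5591          | 2.836  | 14.9407 | 73.3024 | 94.1403 | 16.7054 |
| 0.233         | 2.0   | 355794 | 0.5157          | 2.4613 | 13.4362 | 75.8819 | 95.0077 | 16.7604 |
| 0.2043        | 3.0   | 533691 | 0.4918          | 2.307  | 12.7609 | 77.0962 | 95.3849 | 16.7681 |
| 0.1753        | 4.0   | 711588 | 0.4871          | 2.2105 | 12.3386 | 77.928  | 95.6297 | 16.7765 |
| 0.1655        | 5.0   | 889485 | 0.4815          | 2.1669 | 12.1294 | 78.2542 | 95.7354 | 16.7779 |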
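The step counts and metrics above allow a little back-of-the-envelope arithmetic. The sketch below estimates the training set size from the steps per epoch and the batch size, and computes the relative error reduction between the first and last epoch. It assumes a single device and no gradient accumulation, neither of which the card states, and the helper names are hypothetical.

```python
STEPS_PER_EPOCH = 177897   # step count at epoch 1.0
TRAIN_BATCH_SIZE = 64      # train_batch_size from the hyperparameter list


def example_count_bounds(steps_per_epoch, batch_size):
    """Bounds on the number of training examples.

    Assumes one optimizer step per batch on a single device (an assumption,
    not stated in the card). The last batch of an epoch may be partial.
    """
    lower = (steps_per_epoch - 1) * batch_size + 1
    upper = steps_per_epoch * batch_size
    return lower, upper


def relative_reduction(first, last):
    """Fractional drop from `first` to `last` (0.25 means 25% lower)."""
    return (first - last) / first


if __name__ == "__main__":
    low, high = example_count_bounds(STEPS_PER_EPOCH, TRAIN_BATCH_SIZE)
    print(f"training examples: between {low:,} and {high:,}")
    print(f"WER reduction, epoch 1 to 5: {relative_reduction(14.9407, 12.1294):.1%}")
    print(f"CER reduction, epoch 1 to 5: {relative_reduction(2.836, 2.1669):.1%}")
```

If the run used several devices or gradient accumulation, the example count would scale up by the corresponding factor.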

Framework versions

  • Transformers 4.36.0.dev0
  • PyTorch 2.1.0
  • Datasets 2.15.0
  • Tokenizers 0.15.0