slone/nllb-pruned-6L-512d-finetuned

cointegrated committed on Nov 23, 2023

Commit 09ac636 (1 parent: d65d330)

Update README.md

Files changed (1): README.md +2 -1

@@ -188,4 +188,5 @@ This model was fine-tuned on the [slone/nllb-200-10M-sample](https://huggingface
 the [NLLB dataset](https://huggingface.co/datasets/allenai/nllb) with 175 languages, using only the samples with BLASER score above 3.5.
 Because of its small size, it is really bad at translation, but can serve as a base model for further fine-tuning for a small number of languages.
-It is recommended to prune the vocabulary of this model before fine-tuning, to preserve only the tokens used with the intended languages.

+It is recommended to [prune the vocabulary of this model](https://towardsdatascience.com/how-to-adapt-a-multilingual-t5-model-for-a-single-language-b9f94f3d9c90)
+before fine-tuning, to preserve only the tokens used with the intended languages.
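
The recommendation in the added lines amounts to keeping only the tokens that occur in text from the intended languages. A minimal sketch of that selection step in plain Python follows. It assumes the corpus is already tokenized into ID sequences and uses a toy list of lists in place of the embedding matrix. The helper names `select_kept_ids` and `prune_embeddings`, the `min_count` parameter, and all the data are hypothetical. With a real model, the tokenizer vocabulary and every tied embedding or output matrix would also need to be rebuilt consistently, which this sketch does not cover.

```python
from collections import Counter


def select_kept_ids(tokenized_corpus, special_ids, min_count=1):
    """Return the sorted token IDs to keep: special tokens plus every ID
    seen at least `min_count` times in the target-language corpus."""
    counts = Counter(tid for seq in tokenized_corpus for tid in seq)
    used = {tid for tid, c in counts.items() if c >= min_count}
    return sorted(set(special_ids) | used)


def prune_embeddings(embeddings, kept_ids):
    """Keep only the embedding rows for `kept_ids` and return them together
    with a mapping from old token IDs to new (compacted) token IDs."""
    old_to_new = {old: new for new, old in enumerate(kept_ids)}
    pruned = [embeddings[old] for old in kept_ids]
    return pruned, old_to_new


# Toy data (hypothetical): a 10-token vocabulary with 2-dimensional embeddings.
embeddings = [[float(i), float(i) + 0.5] for i in range(10)]
# Toy corpus: token-ID sequences as a tokenizer would produce them.
corpus = [[0, 4, 7, 2], [0, 7, 7, 9, 2]]

kept_ids = select_kept_ids(corpus, special_ids=[0, 1, 2, 3])
pruned, old_to_new = prune_embeddings(embeddings, kept_ids)

# Re-express the corpus in the new, smaller ID space.
remapped = [[old_to_new[t] for t in seq] for seq in corpus]
print(kept_ids, len(pruned), remapped)
```

Raising `min_count` trades coverage for a smaller vocabulary. Special tokens are always kept, even if they never appear in the sample.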