danangwijaya committed
Commit 327d233
1 Parent(s): e908a76

Update README.md

Files changed (1)
  1. README.md +12 -5
README.md CHANGED
@@ -6,6 +6,9 @@ datasets:
  model-index:
  - name: IndoRetNet-Liputan6
    results: []
+ license: apache-2.0
+ language:
+ - id
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -13,21 +16,25 @@ should probably proofread and complete it, then remove this comment. -->

  # IndoRetNet-Liputan6

- This model is a fine-tuned version of [](https://huggingface.co/) on the liputan6 dataset.
+ This model is an Indonesian RetNet model trained on the Liputan6 dataset.
+ It uses the tokenizer from [IndoBERT](https://huggingface.co/indolem/indobert-base-uncased).
  It achieves the following results on the evaluation set:
  - Loss: 3.4936

  ## Model description

- More information needed
+ This model demonstrates training and recurrent inference with a retentive network (https://arxiv.org/pdf/2307.08621.pdf).
+ The code uses Sehyun Choi's implementation of the retentive network (https://github.com/syncdoth/RetNet).
+
+ - **License:** Apache 2.0

  ## Intended uses & limitations

- More information needed
+ Intended to demonstrate training and recurrent O(1) inference with a retentive network on Indonesian text.

  ## Training and evaluation data

- More information needed
+ Uses the train and validation splits of the Liputan6 dataset provided by [NusaCrowd](https://github.com/IndoNLP/nusa-crowd).

  ## Training procedure

@@ -74,4 +81,4 @@ The following hyperparameters were used during training:
  - Transformers 4.36.2
  - Pytorch 2.1.0+cu121
  - Datasets 2.16.1
- - Tokenizers 0.15.0
+ - Tokenizers 0.15.0
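
The card's "Model description" and "Intended uses & limitations" sections describe training and recurrent O(1) inference with a retentive network, using the IndoBERT tokenizer. Below is a minimal inference sketch, not an official example: the repo id `danangwijaya/IndoRetNet-Liputan6` and the ability to load the custom RetNet modeling code with `trust_remote_code=True` are assumptions, not details stated in the card.

```python
# Hedged inference sketch. Assumptions (not stated in the card): the checkpoint
# lives at "danangwijaya/IndoRetNet-Liputan6" and ships syncdoth/RetNet-style
# modeling code, so transformers can load it with trust_remote_code=True.
from transformers import AutoModelForCausalLM, AutoTokenizer

# The card states the tokenizer comes from IndoBERT.
tokenizer = AutoTokenizer.from_pretrained("indolem/indobert-base-uncased")

model = AutoModelForCausalLM.from_pretrained(
    "danangwijaya/IndoRetNet-Liputan6",  # hypothetical repo id
    trust_remote_code=True,
)
model.eval()

prompt = "Liputan6.com, Jakarta:"
inputs = tokenizer(prompt, return_tensors="pt")

# generate() decodes token by token; with a RetNet backbone this is where the
# recurrent retention form (O(1) cost per generated token) would be used.
output_ids = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.95)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```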
 
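The "Training and evaluation data" section says the model uses the Liputan6 train and validation splits provided by NusaCrowd. The sketch below shows one way the causal-LM data preparation could look; the dataset id `id_liputan6`, the `canonical` config with a manual `data_dir`, and the `clean_article` column are assumptions drawn from public Liputan6 loaders rather than from this card.

```python
# Hedged data-prep sketch. Assumptions (not stated in the card): dataset id
# "id_liputan6", config "canonical" with a manually downloaded data_dir, text
# column "clean_article", and a 512-token context length.
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("indolem/indobert-base-uncased")
block_size = 512  # assumed context length

raw = load_dataset("id_liputan6", "canonical", data_dir="path/to/liputan6")

def tokenize(batch):
    # Tokenize raw article text; summaries are not needed for causal LM training.
    return tokenizer(batch["clean_article"])

def group_texts(examples):
    # Concatenate all token streams, then cut them into fixed-length blocks.
    concatenated = {k: sum(examples[k], []) for k in examples.keys()}
    total = (len(concatenated["input_ids"]) // block_size) * block_size
    result = {
        k: [v[i : i + block_size] for i in range(0, total, block_size)]
        for k, v in concatenated.items()
    }
    result["labels"] = result["input_ids"].copy()  # causal LM targets
    return result

tokenized = raw.map(tokenize, batched=True, remove_columns=raw["train"].column_names)
lm_dataset = tokenized.map(group_texts, batched=True)
# lm_dataset["train"] and lm_dataset["validation"] can then be passed to a Trainer.
```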