Commit ec97e87
1 Parent(s): 24688b3
Update README.md
README.md
CHANGED
@@ -1,7 +1,21 @@
|
|
1 |
---
|
2 |
language:
|
3 |
- multilingual
|
4 |
-
- en
|
5 |
tags:
|
6 |
- zero-shot-classification
|
7 |
- text-classification
|
@@ -10,8 +24,8 @@ tags:
|
|
10 |
metrics:
|
11 |
- accuracy
|
12 |
datasets:
|
13 |
-
- mnli
|
14 |
- xnli
|
|
|
15 |
pipeline_tag: zero-shot-classification
|
16 |
widget:
|
17 |
- text: "Angela Merkel ist eine Politikerin in Deutschland und Vorsitzende der CDU"
|
@@ -42,7 +56,7 @@ print(prediction)
|
|
42 |
```
|
43 |
|
44 |
### Training data
|
45 |
-
This model was trained on the development
|
46 |
|
47 |
### Training procedure
|
48 |
DeBERTa-v3-base-mnli was trained using the Hugging Face trainer with the following hyperparameters.
|
@@ -57,13 +71,12 @@ training_args = TrainingArguments(
|
|
57 |
)
|
58 |
```
|
59 |
### Eval results
|
60 |
-
The model was evaluated
|
61 |
|
62 |
average | ar | bg | de | el | en | es | fr | hi | ru | sw | th | tr | ur | vi | zh
|
63 |
---------|----------|---------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------
|
64 |
0.808 | 0.802 | 0.829 | 0.825 | 0.826 | 0.883 | 0.845 | 0.834 | 0.771 | 0.813 | 0.748 | 0.793 | 0.807 | 0.740 | 0.795 | 0.812
|
65 |
|
66 |
-
{'ar': 0.8017964071856287, 'bg': 0.8287425149700599, 'de': 0.8253493013972056, 'el': 0.8267465069860279, 'en': 0.8830339321357286, 'es': 0.8449101796407186, 'fr': 0.8343313373253493, 'hi': 0.7712574850299401, 'ru': 0.8127744510978044, 'sw': 0.7483033932135729, 'th': 0.792814371257485, 'tr': 0.8065868263473054, 'ur': 0.7403193612774451, 'vi': 0.7954091816367266, 'zh': 0.8115768463073852}
|
67 |
|
68 |
## Limitations and bias
|
69 |
Please consult the original DeBERTa-V3 paper and literature on different NLI datasets for potential biases.
|
|
|
1 |
---
|
2 |
language:
|
3 |
- multilingual
|
4 |
+
- en
|
5 |
+
- ar
|
6 |
+
- bg
|
7 |
+
- de
|
8 |
+
- el
|
9 |
+
- es
|
10 |
+
- fr
|
11 |
+
- hi
|
12 |
+
- ru
|
13 |
+
- sw
|
14 |
+
- th
|
15 |
+
- tr
|
16 |
+
- ur
|
17 |
+
- vi
|
18 |
+
- zh
|
19 |
tags:
|
20 |
- zero-shot-classification
|
21 |
- text-classification
|
|
|
24 |
metrics:
|
25 |
- accuracy
|
26 |
datasets:
|
|
|
27 |
- xnli
|
28 |
+
- mnli
|
29 |
pipeline_tag: zero-shot-classification
|
30 |
widget:
|
31 |
- text: "Angela Merkel ist eine Politikerin in Deutschland und Vorsitzende der CDU"
|
|
|
56 |
```
|
57 |
|
58 |
### Training data
|
59 |
+
This model was trained on the XNLI development dataset and the MNLI train dataset. The XNLI development set consists of 5010 professionally translated texts for each of 15 languages (see [this paper](https://arxiv.org/pdf/1809.05053.pdf)). Note that XNLI also contains a training set of 15 machine-translated versions of the MNLI dataset, one for each of the 15 languages. Due to quality issues with these machine translations, this model was only trained on the professional translations from the XNLI development set and the original English MNLI training set (392,702 texts). Not using machine-translated texts can avoid overfitting the model to the 15 languages and avoid catastrophic forgetting of the other 85 languages mDeBERTa was pre-trained on.
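As a rough sense of scale, the figures quoted above can be added up. This is only a sketch: it assumes the two sets were simply concatenated without deduplication or resampling, which the README does not state, and the variable names are illustrative.

```python
# Figures quoted in the training-data paragraph. How the two sets were
# combined is an assumption (plain concatenation), not stated in the README.
XNLI_DEV_TEXTS_PER_LANGUAGE = 5010
XNLI_LANGUAGES = 15
MNLI_TRAIN_TEXTS = 392_702

# Professionally translated XNLI development texts across all languages.
xnli_dev_total = XNLI_DEV_TEXTS_PER_LANGUAGE * XNLI_LANGUAGES

# Combined size under the concatenation assumption.
training_total = xnli_dev_total + MNLI_TRAIN_TEXTS

# Fraction of the combined data that is multilingual XNLI text.
xnli_share = xnli_dev_total / training_total

print(f"XNLI dev texts: {xnli_dev_total}")
print(f"Total texts:    {training_total}")
print(f"XNLI share:     {xnli_share:.1%}")
```

Under that assumption, the English MNLI texts make up the large majority of the training data.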
|
60 |
|
61 |
### Training procedure
|
62 |
DeBERTa-v3-base-mnli was trained using the Hugging Face trainer with the following hyperparameters.
|
|
|
71 |
)
|
72 |
```
|
73 |
### Eval results
|
74 |
+
The model was evaluated on the XNLI test set. Note that if other multilingual models on the model hub claim performance of around 90% on languages other than English, the authors have most likely made a mistake during testing, since none of the latest papers shows a multilingual average performance of more than a few points above 80% on XNLI (see [here](https://arxiv.org/pdf/2111.09543.pdf) or [here](https://arxiv.org/pdf/1911.02116.pdf)).
|
75 |
|
76 |
average | ar | bg | de | el | en | es | fr | hi | ru | sw | th | tr | ur | vi | zh
|
77 |
---------|----------|---------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------
|
78 |
0.808 | 0.802 | 0.829 | 0.825 | 0.826 | 0.883 | 0.845 | 0.834 | 0.771 | 0.813 | 0.748 | 0.793 | 0.807 | 0.740 | 0.795 | 0.812
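
The reported average can be checked against the per-language columns. The sketch below assumes the average is an unweighted mean over the 15 languages. It uses the rounded table values keyed by their XNLI language codes, so it only reproduces the figure to three decimals.

```python
from statistics import mean

# Per-language XNLI test accuracies, copied from the table (rounded values).
accuracy_by_language = {
    "ar": 0.802, "bg": 0.829, "de": 0.825, "el": 0.826, "en": 0.883,
    "es": 0.845, "fr": 0.834, "hi": 0.771, "ru": 0.813, "sw": 0.748,
    "th": 0.793, "tr": 0.807, "ur": 0.740, "vi": 0.795, "zh": 0.8116,
}

# Assumption: the "average" column is an unweighted mean across languages.
average = mean(accuracy_by_language.values())

print(f"languages: {len(accuracy_by_language)}")
print(f"average:   {average:.3f}")
```

The same mean excluding `en` shows how much the English score lifts the overall figure.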
|
79 |
|
|
|
80 |
|
81 |
## Limitations and bias
|
82 |
Please consult the original DeBERTa-V3 paper and literature on different NLI datasets for potential biases.
|