bourdoiscatie committed e4d74d3 (parent: b3e1791)
Update README.md: update the distillation link
README.md CHANGED
@@ -126,8 +126,7 @@ datasets:
 
 ## Model Description
 
-This model is a distilled version of the [BERT base multilingual model](https://huggingface.co/bert-base-multilingual-cased/). The code for the distillation process can be found
-[here](https://github.com/huggingface/transformers/tree/master/examples/distillation). This model is cased: it does make a difference between english and English.
+This model is a distilled version of the [BERT base multilingual model](https://huggingface.co/bert-base-multilingual-cased/). The code for the distillation process can be found [here](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation). This model is cased: it does make a difference between english and English.
 
 The model is trained on the concatenation of Wikipedia in 104 different languages listed [here](https://github.com/google-research/bert/blob/master/multilingual.md#list-of-languages).
 The model has 6 layers, 768 dimensions and 12 heads, totaling 134M parameters (compared to 177M parameters for mBERT-base).
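For reference, here is a minimal sketch of loading the model and checking the figures quoted in the card (6 layers, dimension 768, 12 heads, roughly 134M parameters). It assumes this card corresponds to the `distilbert-base-multilingual-cased` checkpoint on the Hugging Face Hub and uses the standard `transformers` Auto* API:

```python
from transformers import AutoConfig, AutoModel, AutoTokenizer

# Assumed checkpoint id for the model this card describes.
model_name = "distilbert-base-multilingual-cased"

# The config should reflect the architecture stated above:
# 6 layers, dimension 768, 12 attention heads.
config = AutoConfig.from_pretrained(model_name)
print(config.n_layers, config.dim, config.n_heads)  # 6 768 12

# The parameter count should land near the quoted 134M.
model = AutoModel.from_pretrained(model_name)
print(f"{model.num_parameters() / 1e6:.0f}M parameters")

# The model is cased, so "english" and "English" tokenize differently.
tokenizer = AutoTokenizer.from_pretrained(model_name)
print(tokenizer.tokenize("english English"))
```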