Create README.md
README.md
---
license: apache-2.0
language:
- ja
---

# Leia-Swallow-13B

LEIA is a training technique for autoregressive LLMs that improves their performance in languages other than English by enhancing cross-lingual knowledge transfer from English to a target language.
This model is constructed by applying LEIA to Swallow, a Japanese-English bilingual LLM based on LLaMA 2.
The model achieves enhanced performance on four out of six Japanese question answering benchmarks and equivalent performance on the remaining two, as reported below.

Please refer to our paper or blog post (in Japanese) for further technical details.

- [LEIA: Facilitating Cross-Lingual Knowledge Transfer in Language Models with Entity-based Data Augmentation](https://arxiv.org/abs/2402.11485) (arxiv.org)
- [LEIA: 言語間転移学習でLLMを賢くする新しい方法](#) (zenn.dev; title in English: "LEIA: A New Method for Making LLMs Smarter with Cross-Lingual Transfer Learning")

## Model List

- [Leia-Swallow-7b](https://huggingface.co/leia-llm/Leia-Swallow-7b/)
- [Leia-Swallow-13b](https://huggingface.co/leia-llm/Leia-Swallow-13b/)
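
This README does not document usage. As a rough sketch, one of the checkpoints listed above might be loaded with the Hugging Face `transformers` library, assuming the repository exposes standard causal-LM weights and a tokenizer. The `generate` helper, the half-precision dtype, and the generation settings are illustrative assumptions, and `device_map="auto"` additionally requires the `accelerate` package.

```python
# Hypothetical usage sketch; not taken from the original README.
MODEL_ID = "leia-llm/Leia-Swallow-13b"  # repository id from the model list above


def generate(prompt: str, max_new_tokens: int = 32) -> str:
    """Greedy-decode a continuation of `prompt` with the LEIA checkpoint."""
    # Imported inside the function so the heavy dependencies are only
    # needed when text is actually generated.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,  # assumption: half precision to reduce memory use
        device_map="auto",          # assumption: requires the `accelerate` package
    )
    # Tokenize the prompt and move the tensors to the model's device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("日本の首都は")` (a prompt meaning "The capital of Japan is") would download the weights on first use, so a GPU with enough memory for a 13B-parameter model is assumed.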

## Empirical Results

The model is assessed using the following six question answering benchmarks:
- X-CODAH
- X-CSQA
- JCommonsenseQA
- NIILC
- JEMHopQA
- JAQKET v2

| Model | X-CODAH | X-CSQA | JCommonsenseQA | NIILC | JEMHopQA | JAQKET v2 |
| ---- | ---- | ---- | ---- | ---- | ---- | ---- |
| Swallow | 43.3 | 41.8 | 89.3 | 64.1 | 50.6 | 88.9 |
| LEIA | **44.0** | **41.9** | 89.3 | **65.8** | 50.6 | **89.6** |

For further details of this experiment, please refer to [our paper](https://arxiv.org/abs/2402.11485).
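
The "four improved, two equivalent" summary can be checked directly against the table above. The following is a minimal sketch that copies the reported scores and computes the per-benchmark difference; the `compare` helper and variable names are illustrative and not part of the original evaluation code.

```python
# Scores copied from the table above (Swallow baseline vs. LEIA).
BENCHMARKS = ["X-CODAH", "X-CSQA", "JCommonsenseQA", "NIILC", "JEMHopQA", "JAQKET v2"]
SWALLOW = [43.3, 41.8, 89.3, 64.1, 50.6, 88.9]
LEIA = [44.0, 41.9, 89.3, 65.8, 50.6, 89.6]


def compare(baseline, candidate, names):
    """Return {benchmark: candidate - baseline}, rounded to one decimal."""
    return {n: round(c - b, 1) for n, b, c in zip(names, baseline, candidate)}


deltas = compare(SWALLOW, LEIA, BENCHMARKS)
improved = [n for n, d in deltas.items() if d > 0]
tied = [n for n, d in deltas.items() if d == 0]

# Print the signed difference for each benchmark, then the summary counts.
for name, delta in deltas.items():
    print(f"{name:<15} {delta:+.1f}")
print(f"improved: {len(improved)}, tied: {len(tied)}")
```

The last line prints `improved: 4, tied: 2`, with JCommonsenseQA and JEMHopQA as the two ties.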

## Contributors

- Ikuya Yamada (Studio Ousia, RIKEN)
- Ryokan Ri (LY Corporation, SB Intuitions)