leia-llm/Leia-Swallow-13b

ikuyamada committed on Apr 18

Commit f48b8b1 (1 parent: 49959e8)

Create README.md

README.md +42 -0

@@ -0,0 +1,42 @@

+---
+license: apache-2.0
+language:
+- ja
+---
+# Leia-Swallow-13B
+LEIA is a training technique for autoregressive LLMs that effectively improves their performance in languages other than English by enhancing cross-lingual knowledge transfer from English to a target language.
+This model is constructed by applying LEIA to Swallow, a Japanese-English bilingual LLM based on LLaMA 2.
+The model outperforms Swallow on four of six Japanese question answering benchmarks and matches it on the remaining two, as reported below.
+Please refer to our paper or blog post (in Japanese) for further technical details.
+- [LEIA: Facilitating Cross-Lingual Knowledge Transfer in Language Models with Entity-based Data Augmentation](https://arxiv.org/abs/2402.11485) (arxiv.org)
+- [LEIA: 言語間転移学習でLLMを賢くする新しい方法](#) ("LEIA: A New Way to Make LLMs Smarter with Cross-Lingual Transfer Learning", in Japanese, zenn.dev)
+## Model List
+- [Leia-Swallow-7b](https://huggingface.co/leia-llm/Leia-Swallow-7b/)
+- [Leia-Swallow-13b](https://huggingface.co/leia-llm/Leia-Swallow-13b/)
+## Empirical Results
+The model is assessed using the following six question answering benchmarks:
+- X-CODAH
+- X-CSQA
+- JCommonsenseQA
+- NIILC
+- JEMHopQA
+- JAQKET v2
+| Model | X-CODAH | X-CSQA | JCommonsenseQA | NIILC | JEMHopQA | JAQKET v2 |
+| ---- | ---- | ---- | ---- | ---- | ---- | ---- |
+| Swallow | 43.3 | 41.8 | 89.3 | 64.1 | 50.6 | 88.9 |
+| LEIA | **44.0** | **41.9** | 89.3 | **65.8** | 50.6 | **89.6** |
+For further details of this experiment, please refer to [our paper](https://arxiv.org/abs/2402.11485).
+## Contributors
+- Ikuya Yamada (Studio Ousia, RIKEN)
+- Ryokan Ri (LY Corporation, SB Intuitions)
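
The "four improved, two equal" summary in the README can be checked directly against the Empirical Results table. The following is a minimal sketch, not part of the commit: the scores are copied from the table, while the helper name `compare` and the variable names are hypothetical and chosen only for illustration.

```python
# Scores copied from the Empirical Results table in the README.
BENCHMARKS = ["X-CODAH", "X-CSQA", "JCommonsenseQA", "NIILC", "JEMHopQA", "JAQKET v2"]
SWALLOW = [43.3, 41.8, 89.3, 64.1, 50.6, 88.9]
LEIA = [44.0, 41.9, 89.3, 65.8, 50.6, 89.6]


def compare(baseline, candidate, names, ndigits=1):
    """Return {benchmark: candidate - baseline}, rounded to suppress float noise.

    Hypothetical helper for illustration; it is not defined in the LEIA repository.
    """
    return {
        name: round(cand - base, ndigits)
        for name, base, cand in zip(names, baseline, candidate)
    }


deltas = compare(SWALLOW, LEIA, BENCHMARKS)
improved = [name for name, delta in deltas.items() if delta > 0]
tied = [name for name, delta in deltas.items() if delta == 0]

for name, delta in deltas.items():
    print(f"{name}: {delta:+.1f}")
print(f"improved: {len(improved)}, tied: {len(tied)}")
```

With the table values, LEIA improves on X-CODAH, X-CSQA, NIILC, and JAQKET v2, and ties Swallow on JCommonsenseQA and JEMHopQA. The rounding step matters because differences such as 41.9 - 41.8 are not exactly representable in binary floating point.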