---
license: cc-by-nc-sa-4.0
---
# LayoutXLM
**Multimodal (text + layout/format + image) pre-training for document AI**
LayoutXLM is a multilingual variant of LayoutLMv2.
The documentation of this model in the Transformers library can be found [here](https://huggingface.co/docs/transformers/model_doc/layoutxlm).
[Microsoft Document AI](https://www.microsoft.com/en-us/research/project/document-ai/) | [GitHub](https://github.com/microsoft/unilm/tree/master/layoutxlm)
## Introduction
LayoutXLM is a multimodal pre-trained model for multilingual document understanding, which aims to bridge the language barriers for visually-rich document understanding. Experimental results show that it significantly outperforms the existing SOTA cross-lingual pre-trained models on the XFUND dataset.
[LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding](https://arxiv.org/abs/2104.08836)
Yiheng Xu, Tengchao Lv, Lei Cui, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang, Furu Wei, arXiv Preprint 2021
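
To make the "layout" part of the text + layout + image input more concrete, here is a minimal sketch of how word bounding boxes are commonly prepared for LayoutLM-family models: pixel coordinates are rescaled to a 0-1000 grid so that they are independent of page size. The helper name `normalize_bbox`, the clamping step, and the sample page size and boxes are illustrative assumptions, not part of the LayoutXLM release; check the Transformers documentation linked above for the exact inputs the model expects.

```python
def normalize_bbox(bbox, page_width, page_height):
    """Rescale a pixel-space box (x0, y0, x1, y1) to a 0-1000 grid.

    Hypothetical helper for illustration; values are clamped to [0, 1000]
    so boxes that slightly overshoot the page stay in range.
    """
    x0, y0, x1, y1 = bbox
    scaled = [
        int(1000 * x0 / page_width),
        int(1000 * y0 / page_height),
        int(1000 * x1 / page_width),
        int(1000 * y1 / page_height),
    ]
    return [min(1000, max(0, v)) for v in scaled]


# Assumed sample data: two OCR'd words on a 600 x 800 pixel page.
page_width, page_height = 600, 800
words = ["Nom", "Dupont"]
pixel_boxes = [(60, 80, 300, 120), (320, 80, 540, 120)]

normalized_boxes = [
    normalize_bbox(box, page_width, page_height) for box in pixel_boxes
]

for word, box in zip(words, normalized_boxes):
    print(word, box)
```

Each word is then paired with its normalized box (and the page image) when building the model inputs, so the same layout is encoded the same way regardless of scan resolution.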