update
Browse files
README.md
CHANGED
@@ -23,11 +23,6 @@ inference: false
|
|
23 |
</p>
|
24 |
<br>
|
25 |
|
26 |
-
<p align="center">
|
27 |
-
<a href="README_CN.md">中文</a>  |   English
|
28 |
-
</p>
|
29 |
-
<br><br>
|
30 |
-
|
31 |
**Qwen-VL** (Qwen Large Vision Language Model) is the visual multimodal version of the large model series, Qwen (abbr. Tongyi Qianwen), proposed by Alibaba Cloud. Qwen-VL accepts image, text, and bounding box as inputs, outputs text and bounding box. The features of Qwen-VL include:
|
32 |
- **Strong performance**: It significantly surpasses existing open-source Large Vision Language Models (LVLM) under similar scale settings on multiple English evaluation benchmarks (including Zero-shot caption, VQA, DocVQA, and Grounding).
|
33 |
- **Multi-lingual LVLM support text recognization**: Qwen-VL naturally supports multi-lingual conversation, and it promotes end-to-end recognition of Chinese and English bi-lingual text in images.
|
|
|
23 |
</p>
|
24 |
<br>
|
25 |
|
|
|
|
|
|
|
|
|
|
|
26 |
**Qwen-VL** (Qwen Large Vision Language Model) is the visual multimodal version of the large model series, Qwen (abbr. Tongyi Qianwen), proposed by Alibaba Cloud. Qwen-VL accepts image, text, and bounding box as inputs, outputs text and bounding box. The features of Qwen-VL include:
|
27 |
- **Strong performance**: It significantly surpasses existing open-source Large Vision Language Models (LVLM) under similar scale settings on multiple English evaluation benchmarks (including Zero-shot caption, VQA, DocVQA, and Grounding).
|
28 |
- **Multi-lingual LVLM support text recognization**: Qwen-VL naturally supports multi-lingual conversation, and it promotes end-to-end recognition of Chinese and English bi-lingual text in images.
|