Spaces: clip-italian/clip-italian-demo

srisweet committed on Jul 24, 2021

Commit 12ddc30 • 1 Parent(s): 705e8fa

Update introduction.md

Files changed (1)

introduction.md

@@ -1,10 +1,10 @@
-CLIP-Italian is a multimodal model trained on ~1.4 million Italian text-image pairs using Italian Bert model as text encoder and Vision Transformer(ViT) as image encoder using the JAX/Flax neural network library. The training was carried out during the Hugging Face Community event on Google's TPU machines, sponsored by Google Cloud.
 Clip-Italian (Contrastive Language-Image Pre-training in Italian language) is based on OpenAI’s CLIP ([Radford et al., 2021](https://arxiv.org/abs/2103.00020))which is an amazing model that can learn to represent images and text jointly in the same space.
 In this project, we aim to propose the first CLIP model trained on Italian data, that in this context can be considered a
-low resource language. Using a few techniques, we have been able to fine-tune a SOTA Italian CLIP model with **only 1.4 million** training samples. Our Italian CLIP model
 is built upon the pre-trained [Italian BERT](https://huggingface.co/dbmdz/bert-base-italian-xxl-cased) model provided by [dbmdz](https://huggingface.co/dbmdz) and the OpenAI
 [vision transformer](https://huggingface.co/openai/clip-vit-base-patch32).

+CLIP-Italian is a **multimodal** model trained on **~1.4 Million** Italian text-image pairs using **Italian Bert** model as text encoder and Vision Transformer **ViT** as image encoder using the **JAX/Flax** neural network library. The training was carried out during the **Hugging Face** Community event on **Google's TPU** machines, sponsored by **Google Cloud**.
 Clip-Italian (Contrastive Language-Image Pre-training in Italian language) is based on OpenAI’s CLIP ([Radford et al., 2021](https://arxiv.org/abs/2103.00020))which is an amazing model that can learn to represent images and text jointly in the same space.
 In this project, we aim to propose the first CLIP model trained on Italian data, that in this context can be considered a
+low resource language. Using a few techniques, we have been able to fine-tune a SOTA Italian CLIP model with **only 1.4M** training samples. Our Italian CLIP model
 is built upon the pre-trained [Italian BERT](https://huggingface.co/dbmdz/bert-base-italian-xxl-cased) model provided by [dbmdz](https://huggingface.co/dbmdz) and the OpenAI
 [vision transformer](https://huggingface.co/openai/clip-vit-base-patch32).