
  
    Model: BERT-TWEET
    Lang: IT
  

Model description

This is an uncased BERT [1] model for the Italian language. It was obtained by taking TwHIN-BERT [2] (twhin-bert-base) as a starting point and focusing it on Italian by modifying the embedding layer (as in [3], computing document-level frequencies over the Wikipedia dataset).
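The embedding-layer modification can be illustrated with a small, dependency-free sketch. It is not the author's actual pipeline: the toy corpus, the whitespace tokenizer, the `min_ratio` threshold, and the helper names `document_frequencies` and `reduce_vocabulary` are all assumptions made for illustration. The idea shown is only the general one from [3]: count in how many documents each token occurs, keep the tokens that are frequent enough (plus special tokens), and keep only the matching rows of the embedding matrix.

```python
from collections import Counter


def document_frequencies(docs, tokenize):
    """Count, for each token, the number of documents it appears in."""
    df = Counter()
    for doc in docs:
        # set() ensures a token is counted at most once per document
        df.update(set(tokenize(doc)))
    return df


def reduce_vocabulary(vocab, embeddings, df, n_docs, min_ratio, special_tokens):
    """Keep special tokens and tokens whose document frequency ratio
    reaches min_ratio; return the re-indexed vocabulary and the
    corresponding embedding rows (hypothetical helper)."""
    kept = [
        tok
        for tok, _ in sorted(vocab.items(), key=lambda kv: kv[1])
        if tok in special_tokens or df[tok] / n_docs >= min_ratio
    ]
    new_vocab = {tok: i for i, tok in enumerate(kept)}
    new_embeddings = [embeddings[vocab[tok]] for tok in kept]
    return new_vocab, new_embeddings


# Toy multilingual vocabulary and a 2-dimensional embedding matrix (illustrative)
vocab = {"[PAD]": 0, "[UNK]": 1, "ciao": 2, "hello": 3, "mondo": 4, "world": 5}
embeddings = [[float(i), float(i)] for i in range(len(vocab))]

# Toy Italian corpus standing in for the Wikipedia dataset
docs = ["ciao mondo", "ciao a tutti", "mondo piccolo"]

df = document_frequencies(docs, str.split)
new_vocab, new_embeddings = reduce_vocabulary(
    vocab, embeddings, df, n_docs=len(docs), min_ratio=0.5,
    special_tokens={"[PAD]", "[UNK]"},
)
print(new_vocab)
print(new_embeddings)
```

In this toy run the tokens that never occur in the Italian documents are dropped, while the remaining rows are copied unchanged, so the rest of the network can be reused as-is.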

The resulting model has 110M parameters, a vocabulary of 30,520 tokens, and a size of ~440 MB.
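As a quick sanity check on these figures, the arithmetic below relates the parameter count to the size on disk. It assumes weights stored as 32-bit floats (4 bytes each), 1 MB = 10^6 bytes, and a hidden size of 768 as in standard BERT-base configurations; the hidden size is not stated in this card.

```python
N_PARAMS = 110_000_000   # parameter count stated in the card (rounded)
BYTES_PER_PARAM = 4      # assumption: 32-bit float weights
VOCAB_SIZE = 30_520      # vocabulary size stated in the card
HIDDEN_SIZE = 768        # assumption: standard BERT-base hidden size

# Approximate size of the weights in megabytes
size_mb = N_PARAMS * BYTES_PER_PARAM / 1_000_000

# Parameters in the token embedding matrix (one row per vocabulary entry)
embedding_params = VOCAB_SIZE * HIDDEN_SIZE

print(f"approximate size: {size_mb:.0f} MB")
print(f"token embedding parameters: {embedding_params:,}")
```

Under these assumptions, the token embedding matrix accounts for roughly 23M of the 110M parameters, which is why the vocabulary size has a visible effect on the total.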

Quick usage

from transformers import BertTokenizerFast, BertModel

# Load the tokenizer and the pretrained encoder from the Hugging Face Hub
tokenizer = BertTokenizerFast.from_pretrained("osiria/bert-tweet-base-italian-uncased")
model = BertModel.from_pretrained("osiria/bert-tweet-base-italian-uncased")

A version of this model already fine-tuned for sentiment analysis is available here: https://huggingface.co/osiria/bert-tweet-italian-uncased-sentiment

References

[1] https://arxiv.org/abs/1810.04805

[2] https://arxiv.org/abs/2209.07562

[3] https://arxiv.org/abs/2010.05609

Limitations

This model was trained on tweets, so it's mainly suitable for general-purpose social media text processing, involving short texts written in a social network style. It might show limitations when it comes to longer and more structured text, or domain-specific text.

License

The model is released under the Apache-2.0 license.
