license: cc-by-nc-3.0
---

# BERT-base-multilingual-cased finetuned for Part-of-Speech tagging

This is a multilingual BERT model fine-tuned for part-of-speech tagging in English. It is trained on the Penn Treebank (Marcus et al., 1993) and achieves an F1-score of 96.69.

## Usage

A *transformers* pipeline can be used to run the model:

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification, TokenClassificationPipeline

model_name = "QCRI/bert-base-multilingual-cased-pos-english"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

pipeline = TokenClassificationPipeline(model=model, tokenizer=tokenizer)
outputs = pipeline("A test example")
print(outputs)
```
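Each entry in `outputs` pairs a token with its predicted tag and a confidence score. The decoding step the pipeline performs can be sketched in plain Python: softmax each token's logits over the label set and keep the argmax. The logits and reduced tag set below are made up for illustration, not the model's real label list or scores:

```python
import math

# Hypothetical per-token logits for the input "A test example" over a
# reduced tag set (illustrative only; the real model has many more tags).
labels = ["DT", "JJ", "NN"]
logits = [
    [4.0, 0.1, 0.5],  # "A"
    [0.2, 0.3, 3.5],  # "test"
    [0.1, 0.4, 3.8],  # "example"
]

tags = []
for row in logits:
    # Softmax normalizes the scores into probabilities ...
    total = sum(math.exp(x) for x in row)
    probs = [math.exp(x) / total for x in row]
    # ... and the argmax picks the winning tag for the token.
    tags.append(labels[probs.index(max(probs))])

print(tags)  # ['DT', 'NN', 'NN']
```

The real pipeline additionally handles sub-word tokens produced by the WordPiece tokenizer, so one input word can yield several tagged pieces in `outputs`.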

## Citation

This model was used for all the part-of-speech tagging-based results in *Analyzing Encoded Concepts in Transformer Language Models*, published at NAACL'22. If you find this model useful for your own work, please use the following citation:

```bib
@inproceedings{sajjad-NAACL,
    title={Analyzing Encoded Concepts in Transformer Language Models},
    author={Sajjad, Hassan and Durrani, Nadir and Dalvi, Fahim and Alam, Firoj and Khan, Abdul Rafae and Xu, Jia},
    booktitle={North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL)},
    series={NAACL~'22},
    year={2022},
    address={Seattle}
}
```