---
datasets:
- conll2012_ontonotesv5
language:
- en
pipeline_tag: text2text-generation
---

Given an input text, the model's output follows the format: `"{ENT_TYPE}:{span}; {ENT_TYPE}:{span}..."`

For training speed, we only use the first 10,000 sentences (not documents) from the train set and 1,000 sentences from the validation set; we save the checkpoint whose validation loss (NLL) is lowest.

The model can also be used as a pretrained backbone for downstream NER fine-tuning.
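Below is a minimal usage sketch with the `transformers` library, assuming the checkpoint loads as a standard seq2seq model via `AutoModelForSeq2SeqLM`; `MODEL_ID` is a placeholder for this repository's id, and the commented output is only an illustration of the format described above.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_ID = "path/to/this-model"  # placeholder: replace with this repo's id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

text = "Barack Obama visited Berlin in 2013."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)

# The decoded string follows "{ENT_TYPE}:{span}; {ENT_TYPE}:{span}...",
# e.g. something like "PERSON:Barack Obama; GPE:Berlin; DATE:2013" (illustrative only).
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```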