---
library_name: transformers
tags: []
---

XLM-RoBERTa tokenizer trained on 162M tokens of Khmer text.

```python
from transformers import AutoTokenizer

# Load the tokenizer from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("seanghay/xlm-roberta-khmer-32k-tokenizer")

# Split a Khmer string into subword tokens
print(tokenizer.tokenize("សួស្ដីកម្ពុជា!"))
```
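
Beyond `tokenize`, it can be useful to check how text maps to token IDs and back. The following is a minimal sketch, not part of the original card: `round_trip` is a hypothetical helper name, and it assumes the object passed in behaves like a standard `transformers` tokenizer (callable, with `convert_ids_to_tokens` and `decode`).

```python
def round_trip(tokenizer, text):
    """Encode text to token IDs, then map the IDs back to tokens and to a string.

    `round_trip` is a hypothetical helper for illustration; `tokenizer` is
    assumed to follow the standard transformers tokenizer interface.
    """
    # Calling the tokenizer returns a mapping that includes "input_ids"
    ids = tokenizer(text)["input_ids"]
    # Subword tokens for each ID, including any special tokens that were added
    tokens = tokenizer.convert_ids_to_tokens(ids)
    # Decode back to text, dropping special tokens
    decoded = tokenizer.decode(ids, skip_special_tokens=True)
    return ids, tokens, decoded
```

To try it, load the tokenizer with `AutoTokenizer.from_pretrained` as shown earlier and call `round_trip(tokenizer, "សួស្ដីកម្ពុជា!")`. The decoded string may differ slightly from the input in whitespace or normalization, depending on how the tokenizer was configured.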