mesolitica/mallam-1.1B-4096
Text Generation
Pretrained from scratch with a 4096-token context length on 90B tokens of Malaysian text. Paper: https://huggingface.co/papers/2401.14680