GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints Paper • 2305.13245 • Published May 22, 2023 • 5
Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models Paper • 2404.12387 • Published Apr 18 • 38
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework Paper • 2404.14619 • Published Apr 22 • 124