Granite Code Models Collection A series of code models trained by IBM licensed under Apache 2.0 license. We release both the base pretrained and instruct models. • 23 items • Updated 8 days ago • 177
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Sep 25 • 682
BitNet: Scaling 1-bit Transformers for Large Language Models Paper • 2310.11453 • Published Oct 17, 2023 • 96
Mamba: Linear-Time Sequence Modeling with Selective State Spaces Paper • 2312.00752 • Published Dec 1, 2023 • 138
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling Paper • 2311.00430 • Published Nov 1, 2023 • 57
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis Paper • 2307.01952 • Published Jul 4, 2023 • 82