Loubna Ben Allal

loubnabnl

https://loubnabnl.github.io/

AI & ML interests

LLMs, ML for code, Synthetic data

Articles

SmolLM - blazingly fast and remarkably powerful

CodeGemma - an official Google release for code LLMs

Cosmopedia: how to create large-scale synthetic data for pre-training Large Language Models

StarCoder2 and The Stack v2

Code Llama: Llama 2 learns to code

StarCoder: A State-of-the-Art LLM for Code

How to train a Language Model with Megatron-LM

loubnabnl's activity

upvoted an article 3 months ago

Article

The 5 Most Under-Rated Tools on Hugging Face

Aug 22

• 85

upvoted a collection 3 months ago

💻 Local SmolLMs

SmolLM models in MLC, ONNX and GGUF format for local applications + in-browser demos • 14 items • Updated Aug 20 • 44

upvoted an article 4 months ago

Article

SmolLM - blazingly fast and remarkably powerful

Jul 16

• 258

upvoted 2 papers 5 months ago

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

Paper • 2406.17557 • Published Jun 25 • 86

Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations

Paper • 2405.18392 • Published May 28 • 12

upvoted 2 collections 8 months ago

Leaderboards and benchmarks ✨

Cool leaderboard spaces collection for models across modalities! Text, vision, audio, ... • 73 items • Updated 5 days ago • 88

ZeroGPU Spaces

ZeroGPU Spaces made by the community • 17 items • Updated Jun 6 • 229

upvoted a paper 8 months ago

StarCoder 2 and The Stack v2: The Next Generation

Paper • 2402.19173 • Published Feb 29 • 134

upvoted a collection 9 months ago

💫 StarCoder2

StarCoder2 models and datasets! • 8 items • Updated Mar 1 • 81

upvoted a paper about 1 year ago

Zephyr: Direct Distillation of LM Alignment

Paper • 2310.16944 • Published Oct 25, 2023 • 121