mrm8488 (Manuel Romero)

upvoted an article 2 days ago

Article

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

2 days ago

• 104

upvoted a collection 13 days ago

WebInstruct 🌐 Embeddings 🧱 Models

Collection

A collection of SoTA embedding models fine-tuned on the WebInstruct dataset to learn to pair instructions with their responses • 3 items • Updated 15 days ago • 11

upvoted a collection 14 days ago

LLaVA-OneVision

Collection

a model good at arbitrary types of visual input • 15 items • Updated 8 days ago • 18

upvoted an article 14 days ago

Article

Serverless Inference with Hugging Face and NVIDIA NIMs

Jul 29

• 26

upvoted a collection 20 days ago

embeddings-spanish-models 🎯

Collection

A collection of embedding models I fine-tuned for better performance on Spanish texts. • 3 items • Updated 21 days ago • 2

upvoted an article 27 days ago

Article

Llama-3.1-Storm-8B: Improved SLM with Self-Curation + Model Merging

Aug 19

• 72

upvoted 2 articles 28 days ago

Article

The 5 Most Under-Rated Tools on Hugging Face

29 days ago

• 74

Article

Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth

Jul 29

• 193

upvoted an article about 1 month ago

Article

Introduction to Quantization cooked in 🤗 with 💗🧑‍🍳

Aug 25, 2023

• 17

upvoted a collection about 1 month ago

💻 Local SmolLMs

Collection

SmolLM models in MLC, ONNX and GGUF format for local applications + in-browser demos • 14 items • Updated about 1 month ago • 40

upvoted 2 articles about 2 months ago

Article

Google releases Gemma 2 2B, ShieldGemma and Gemma Scope

Jul 31

• 58

Article

WWDC 24: Running Mistral 7B with Core ML

Jul 22

• 54

upvoted 2 collections 2 months ago

DCLM

Collection

DCLM Models + Datasets • 7 items • Updated Jul 22 • 38

LLaVa-Interleave

Collection

LLaVa models that extend the model capabilities to Multi-image, Multi-frame (videos), and Multi-patch (single-image) scenarios. • 3 items • Updated Jul 10 • 14

upvoted 2 articles 2 months ago

Article

SmolLM - blazingly fast and remarkably powerful

Jul 16

• 242

Article

Experimenting with Automatic PII Detection on the Hub using Presidio

Jul 10

• 23

upvoted a collection 2 months ago

H2O Danube3

Collection

6 items • Updated Jul 16 • 51

upvoted 2 articles 2 months ago

Article

The Rise of Agentic Data Generation

Jul 15

• 74

Article

In-browser LLM app in pure Python: Gemini Nano + Gradio-Lite

Jul 12

• 8

upvoted a paper 2 months ago

AgentInstruct: Toward Generative Teaching with Agentic Flows

Paper • 2407.03502 • Published Jul 3 • 43

upvoted an article 2 months ago

Article

Preference Optimization for Vision Language Models

Jul 10

• 36

upvoted a paper 2 months ago

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases

Paper • 2402.14905 • Published Feb 22 • 107

upvoted a paper 3 months ago

Scaling Synthetic Data Creation with 1,000,000,000 Personas

Paper • 2406.20094 • Published Jun 28 • 93

upvoted a collection 3 months ago

LLM Compiler

Collection

Meta LLM Compiler is a state-of-the-art LLM that builds upon Code Llama with improved performance for code optimization and compiler reasoning. • 4 items • Updated Jun 27 • 147

upvoted an article 3 months ago

Article

Going multimodal: How Prezi is leveraging the Hub and the Expert Support Program to accelerate their ML roadmap

Jun 19

• 11

upvoted 2 collections 3 months ago

Cambrian Data

Collection

3 items • Updated Jun 25 • 8

Embedding Model Datasets

Collection

A curated subset of the datasets that work out of the box with Sentence Transformers: https://huggingface.co/datasets?other=sentence-transformers • 67 items • Updated Jul 3 • 61

upvoted a paper 3 months ago

Instruction Pre-Training: Language Models are Supervised Multitask Learners

Paper • 2406.14491 • Published Jun 20 • 85

upvoted 2 collections 3 months ago

Florence

Collection

9 items • Updated Jul 11 • 153

FP8 LLMs for vLLM

Collection

Accurate FP8 quantized models by Neural Magic, ready for use with vLLM! • 37 items • Updated 24 days ago • 51

upvoted 4 papers 3 months ago

upvoted a collection 3 months ago

Magpie-Pro Datasets (Llama-3)

Collection

Dataset built with Meta Llama 3 70B. Models are fine-tuned from Llama 3 8B. • 6 items • Updated about 3 hours ago • 16

upvoted an article 3 months ago

Article

Putting RL back in RLHF

Jun 12

• 58

upvoted 2 papers 3 months ago

Show, Don't Tell: Aligning Language Models with Demonstrated Feedback

Paper • 2406.00888 • Published Jun 2 • 30

MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark

Paper • 2406.01574 • Published Jun 3 • 42

upvoted an article 4 months ago

Article

Uncensor any LLM with abliteration

Jun 13

• 312

upvoted a collection 4 months ago

sentence-transformers-from-synthetic-data

Collection

Example of using distilabel to generate synthetic triplets data for fine-tuning a Sentence Transformer model • 4 items • Updated Jun 21 • 21

upvoted 4 articles 4 months ago

Article

Synthetic dataset generation techniques: generating custom sentence similarity data

May 23

• 14

Article

Train custom AI models with the trainer API and adapt them to 🤗

Jun 29

• 33

Article

PaliGemma – Google's Cutting-Edge Open Vision Language Model

May 14

• 194

Article

How to Finetune phi-3 on MacBook Pro

Apr 24

• 62

upvoted a paper 5 months ago

WildChat: 1M ChatGPT Interaction Logs in the Wild

Paper • 2405.01470 • Published May 2 • 59

upvoted an article 5 months ago

Article

GaLore: Advancing Large Model Training on Consumer-grade Hardware

Mar 20

• 24

upvoted 2 collections 5 months ago

Model Merging

Collection

Model Merging is a very popular technique nowadays in LLMs. Here is a chronological list of papers in the space that will help you get started with it! • 30 items • Updated Jun 12 • 211

Performance LLMs - Base Models

Collection

22 items • Updated Apr 26 • 7

upvoted an article 5 months ago

Article

seemore: Implement a Vision Language Model from Scratch

Jun 23

• 56

upvoted a paper 5 months ago

Sailor: Open Language Models for South-East Asia

Paper • 2404.03608 • Published Apr 4 • 20

upvoted an article 5 months ago

Article

Welcome Llama 3 - Meta's new open LLM

Apr 18

• 272

upvoted a collection 5 months ago

fuck quadratic attention

Collection

11 items • Updated Apr 24 • 20

upvoted a paper 5 months ago

TransformerFAM: Feedback attention is working memory

Paper • 2404.09173 • Published Apr 14 • 43

upvoted an article 5 months ago

Article

Vision Language Models Explained

Apr 11

• 177

upvoted a paper 5 months ago

Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences

Paper • 2404.03715 • Published Apr 4 • 59

upvoted a paper 6 months ago

Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order

Paper • 2404.00399 • Published Mar 30 • 40

upvoted 3 collections 6 months ago

DIBT Prompt collective SPIN

Collection

This collection contains resources related to the replication of SPIN with the dibt prompt collective dataset • 8 items • Updated Jul 30 • 7

Awesome Document AI

Collection

A collection of open-source document AI 📄 📝 📈 • 27 items • Updated Mar 11 • 65

Pre-trained LMs ES

Collection

Monolingual language models pre-trained on Spanish and related languages. • 20 items • Updated 11 days ago • 6

upvoted a collection 7 months ago

Agentics

Collection

13 items • Updated Feb 28 • 1
