Solomatin Roman

Samoed

AI & ML interests

None yet

Samoed's activity

upvoted a paper 19 days ago

On the Power of Decision Trees in Auto-Regressive Language Modeling

Paper • 2409.19150 • Published Sep 27 • 4

upvoted a paper 21 days ago

AutoTrain: No-code training for state-of-the-art models

Paper • 2410.15735 • Published 23 days ago • 56

upvoted a paper about 1 month ago

Instruction Following without Instruction Tuning

Paper • 2409.14254 • Published Sep 21 • 27

upvoted an article about 2 months ago

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

Article • Sep 18 • 198

upvoted 2 papers 2 months ago

PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation

Paper • 2409.06820 • Published Sep 10 • 62

OLMoE: Open Mixture-of-Experts Language Models

Paper • 2409.02060 • Published Sep 3 • 77

upvoted 6 papers 3 months ago

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22 • 117

LLM Pruning and Distillation in Practice: The Minitron Approach

Paper • 2408.11796 • Published Aug 21 • 53

To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20 • 40

The Russian-focused embedders' exploration: ruMTEB benchmark and Russian embedding model design

Paper • 2408.12503 • Published Aug 22 • 21

ShieldGemma: Generative AI Content Moderation Based on Gemma

Paper • 2407.21772 • Published Jul 31 • 13

The Llama 3 Herd of Models

Paper • 2407.21783 • Published Jul 31 • 107

upvoted 4 papers 4 months ago

OpenDevin: An Open Platform for AI Software Developers as Generalist Agents

Paper • 2407.16741 • Published Jul 23 • 68

LAMBDA: A Large Model Based Data Agent

Paper • 2407.17535 • Published Jul 24 • 34

Phi-3 Safety Post-Training: Aligning Language Models with a "Break-Fix" Cycle

Paper • 2407.13833 • Published Jul 18 • 11

CRAG -- Comprehensive RAG Benchmark

Paper • 2406.04744 • Published Jun 7 • 41

upvoted 2 papers 5 months ago

Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities

Paper • 2406.14562 • Published Jun 20 • 27

Instruction Pre-Training: Language Models are Supervised Multitask Learners

Paper • 2406.14491 • Published Jun 20 • 85

upvoted an article 5 months ago

How to generate text: using different decoding methods for language generation with Transformers

Article • Mar 1, 2020 • 109

upvoted a collection 7 months ago

Meta Llama 3

This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Sep 25 • 682