- A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions
  Paper • 2312.08578 • Published • 16
- ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks
  Paper • 2312.08583 • Published • 9
- Vision-Language Models as a Source of Rewards
  Paper • 2312.09187 • Published • 11
- StemGen: A music generation model that listens
  Paper • 2312.08723 • Published • 47
Collections
Collections including paper arxiv:2307.09288
- black-forest-labs/FLUX.1-dev
  Text-to-Image • Updated • 1.32M • 6.28k
- openai/whisper-large-v3-turbo
  Automatic Speech Recognition • Updated • 1.26M • 1.33k
- meta-llama/Llama-3.2-11B-Vision-Instruct
  Image-Text-to-Text • Updated • 2.39M • 915
- deepseek-ai/DeepSeek-V2.5
  Text Generation • Updated • 14.7k • 578
- Self-Play Preference Optimization for Language Model Alignment
  Paper • 2405.00675 • Published • 24
- FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
  Paper • 2205.14135 • Published • 11
- Attention Is All You Need
  Paper • 1706.03762 • Published • 44
- FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
  Paper • 2307.08691 • Published • 8
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 602
- Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
  Paper • 2404.14219 • Published • 253
- Llama 2: Open Foundation and Fine-Tuned Chat Models
  Paper • 2307.09288 • Published • 242
- LLM in a flash: Efficient Large Language Model Inference with Limited Memory
  Paper • 2312.11514 • Published • 258