Collections
Discover the best community collections!
Collections including paper arxiv:2406.06608
-
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Paper • 2312.00752 • Published • 138 -
Elucidating the Design Space of Diffusion-Based Generative Models
Paper • 2206.00364 • Published • 13 -
GLU Variants Improve Transformer
Paper • 2002.05202 • Published • 1 -
StarCoder 2 and The Stack v2: The Next Generation
Paper • 2402.19173 • Published • 134
-
Instruction Pre-Training: Language Models are Supervised Multitask Learners
Paper • 2406.14491 • Published • 85 -
Better & Faster Large Language Models via Multi-token Prediction
Paper • 2404.19737 • Published • 73 -
RAFT: Adapting Language Model to Domain Specific RAG
Paper • 2403.10131 • Published • 67 -
The Prompt Report: A Systematic Survey of Prompting Techniques
Paper • 2406.06608 • Published • 53
-
Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models
Paper • 2402.14848 • Published • 18 -
The Prompt Report: A Systematic Survey of Prompting Techniques
Paper • 2406.06608 • Published • 53 -
CRAG -- Comprehensive RAG Benchmark
Paper • 2406.04744 • Published • 41 -
Transformers meet Neural Algorithmic Reasoners
Paper • 2406.09308 • Published • 43