Collections
Collections including paper arxiv:2308.12950
- Masked Audio Generation using a Single Non-Autoregressive Transformer
  Paper • 2401.04577 • Published • 41
- Code Llama: Open Foundation Models for Code
  Paper • 2308.12950 • Published • 22
- Simple and Controllable Music Generation
  Paper • 2306.05284 • Published • 142
- High Fidelity Neural Audio Compression
  Paper • 2210.13438 • Published • 3

- Design2Code: How Far Are We From Automating Front-End Engineering?
  Paper • 2403.03163 • Published • 93
- Wukong: Towards a Scaling Law for Large-Scale Recommendation
  Paper • 2403.02545 • Published • 15
- StarCoder: may the source be with you!
  Paper • 2305.06161 • Published • 29
- Exploring Parameter-Efficient Fine-Tuning Techniques for Code Generation with Large Language Models
  Paper • 2308.10462 • Published • 1

- StarCoder: may the source be with you!
  Paper • 2305.06161 • Published • 29
- WizardCoder: Empowering Code Large Language Models with Evol-Instruct
  Paper • 2306.08568 • Published • 28
- SantaCoder: don't reach for the stars!
  Paper • 2301.03988 • Published • 7
- DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence
  Paper • 2401.14196 • Published • 46

- Attention Is All You Need
  Paper • 1706.03762 • Published • 44
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 14
- RoBERTa: A Robustly Optimized BERT Pretraining Approach
  Paper • 1907.11692 • Published • 7
- DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
  Paper • 1910.01108 • Published • 14

- ShareGPT4V: Improving Large Multi-Modal Models with Better Captions
  Paper • 2311.12793 • Published • 18
- PhysGaussian: Physics-Integrated 3D Gaussians for Generative Dynamics
  Paper • 2311.12198 • Published • 22
- CoDi-2: In-Context, Interleaved, and Interactive Any-to-Any Generation
  Paper • 2311.18775 • Published • 6
- Code Llama: Open Foundation Models for Code
  Paper • 2308.12950 • Published • 22

- Attention Is All You Need
  Paper • 1706.03762 • Published • 44
- FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
  Paper • 2307.08691 • Published • 8
- Mixtral of Experts
  Paper • 2401.04088 • Published • 157
- Mistral 7B
  Paper • 2310.06825 • Published • 47

- Llemma: An Open Language Model For Mathematics
  Paper • 2310.10631 • Published • 50
- Mistral 7B
  Paper • 2310.06825 • Published • 47
- Qwen Technical Report
  Paper • 2309.16609 • Published • 34
- BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model
  Paper • 2309.11568 • Published • 10

- Creative Robot Tool Use with Large Language Models
  Paper • 2310.13065 • Published • 8
- CodeCoT and Beyond: Learning to Program and Test like a Developer
  Paper • 2308.08784 • Published • 5
- Lemur: Harmonizing Natural Language and Code for Language Agents
  Paper • 2310.06830 • Published • 30
- CodePlan: Repository-level Coding using LLMs and Planning
  Paper • 2309.12499 • Published • 73