Collections
Collections including paper arxiv:2401.02038
- LLaMA Beyond English: An Empirical Study on Language Capability Transfer
  Paper • 2401.01055 • Published • 54
- Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
  Paper • 2401.01335 • Published • 64
- DocLLM: A layout-aware generative language model for multimodal document understanding
  Paper • 2401.00908 • Published • 181
- Multilingual Instruction Tuning With Just a Pinch of Multilinguality
  Paper • 2401.01854 • Published • 10

- TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
  Paper • 2312.16862 • Published • 30
- Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action
  Paper • 2312.17172 • Published • 26
- Towards Truly Zero-shot Compositional Visual Reasoning with LLMs as Programmers
  Paper • 2401.01974 • Published • 5
- From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations
  Paper • 2401.01885 • Published • 27

- PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models
  Paper • 2312.13964 • Published • 18
- LLM in a flash: Efficient Large Language Model Inference with Limited Memory
  Paper • 2312.11514 • Published • 258
- StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation
  Paper • 2312.12491 • Published • 69
- LLaVA-φ: Efficient Multi-Modal Assistant with Small Language Model
  Paper • 2401.02330 • Published • 14

- A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions
  Paper • 2312.08578 • Published • 16
- ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks
  Paper • 2312.08583 • Published • 9
- Vision-Language Models as a Source of Rewards
  Paper • 2312.09187 • Published • 11
- StemGen: A music generation model that listens
  Paper • 2312.08723 • Published • 47

- Open-Sourcing Highly Capable Foundation Models: An evaluation of risks, benefits, and alternative methods for pursuing open-source objectives
  Paper • 2311.09227 • Published • 6
- defog/sqlcoder-34b-alpha
  Text Generation • Updated • 1.71k • 168
- Understanding LLMs: A Comprehensive Overview from Training to Inference
  Paper • 2401.02038 • Published • 61
- SaulLM-7B: A pioneering Large Language Model for Law
  Paper • 2403.03883 • Published • 75

- GPT4All: An Ecosystem of Open Source Compressed Language Models
  Paper • 2311.04931 • Published • 20
- Can LLMs Follow Simple Rules?
  Paper • 2311.04235 • Published • 10
- Prompt Engineering a Prompt Engineer
  Paper • 2311.05661 • Published • 20
- Orca 2: Teaching Small Language Models How to Reason
  Paper • 2311.11045 • Published • 70

- Levels of AGI: Operationalizing Progress on the Path to AGI
  Paper • 2311.02462 • Published • 33
- Ultra-Long Sequence Distributed Transformer
  Paper • 2311.02382 • Published • 2
- A Survey on Language Models for Code
  Paper • 2311.07989 • Published • 21
- GRIM: GRaph-based Interactive narrative visualization for gaMes
  Paper • 2311.09213 • Published • 12

- In-Context Learning Creates Task Vectors
  Paper • 2310.15916 • Published • 41
- Point Transformer V3: Simpler, Faster, Stronger
  Paper • 2312.10035 • Published • 17
- Extending Context Window of Large Language Models via Semantic Compression
  Paper • 2312.09571 • Published • 12
- PanGu-π: Enhancing Language Model Architectures via Nonlinearity Compensation
  Paper • 2312.17276 • Published • 15

- Understanding LLMs: A Comprehensive Overview from Training to Inference
  Paper • 2401.02038 • Published • 61
- Learning To Teach Large Language Models Logical Reasoning
  Paper • 2310.09158 • Published • 1
- ChipNeMo: Domain-Adapted LLMs for Chip Design
  Paper • 2311.00176 • Published • 8
- WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
  Paper • 2308.09583 • Published • 7