Collections
Collections including paper arxiv:2401.04577

- A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions
  Paper • 2312.08578 • Published • 16
- ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks
  Paper • 2312.08583 • Published • 9
- Vision-Language Models as a Source of Rewards
  Paper • 2312.09187 • Published • 11
- StemGen: A music generation model that listens
  Paper • 2312.08723 • Published • 47

- Music ControlNet: Multiple Time-varying Controls for Music Generation
  Paper • 2311.07069 • Published • 43
- FLAP: Fast Language-Audio Pre-training
  Paper • 2311.01615 • Published • 16
- MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models
  Paper • 2310.11954 • Published • 24
- MusicLDM: Enhancing Novelty in Text-to-Music Generation Using Beat-Synchronous Mixup Strategies
  Paper • 2308.01546 • Published • 17

- Idempotent Generative Network
  Paper • 2311.01462 • Published • 24
- Adaptive Shells for Efficient Neural Radiance Field Rendering
  Paper • 2311.10091 • Published • 18
- Generative Powers of Ten
  Paper • 2312.02149 • Published • 4
- DreamVideo: Composing Your Dream Videos with Customized Subject and Motion
  Paper • 2312.04433 • Published • 9