TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation • Paper • 2411.04709 • Published 5 days ago • 21
DreamPolish: Domain Score Distillation With Progressive Geometry Generation • Paper • 2411.01602 • Published 7 days ago • 9
Learning Video Representations without Natural Videos • Paper • 2410.24213 • Published 10 days ago • 14
LLaMo: Large Language Model-based Molecular Graph Assistant • Paper • 2411.00871 • Published 10 days ago • 19
MVPaint: Synchronized Multi-View Diffusion for Painting Anything 3D • Paper • 2411.02336 • Published 6 days ago • 23
Training-free Regional Prompting for Diffusion Transformers • Paper • 2411.02395 • Published 6 days ago • 22
TOMATO: Assessing Visual Temporal Reasoning Capabilities in Multimodal Foundation Models • Paper • 2410.23266 • Published 11 days ago • 19
Adaptive Caching for Faster Video Generation with Diffusion Transformers • Paper • 2411.02397 • Published 6 days ago • 17
How Far is Video Generation from World Model: A Physical Law Perspective • Paper • 2411.02385 • Published 6 days ago • 27
DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models • Paper • 2411.00836 • Published 12 days ago • 14
Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation • Paper • 2411.00412 • Published 9 days ago • 9
Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders • Paper • 2410.22366 • Published 13 days ago • 71
GrounDiT: Grounding Diffusion Transformers via Noisy Patch Transplantation • Paper • 2410.20474 • Published 14 days ago • 13
ROCKET-1: Master Open-World Interaction with Visual-Temporal Context Prompting • Paper • 2410.17856 • Published 18 days ago • 48
Distill Visual Chart Reasoning Ability from LLMs to MLLMs • Paper • 2410.18798 • Published 17 days ago • 19
Scalable Ranked Preference Optimization for Text-to-Image Generation • Paper • 2410.18013 • Published 18 days ago • 14
VidPanos: Generative Panoramic Videos from Casual Panning Videos • Paper • 2410.13832 • Published 24 days ago • 12