Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2401.11605

PALP: Prompt Aligned Personalization of Text-to-Image Models

Paper • 2401.06105 • Published Jan 11 • 46
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers

Paper • 2401.11605 • Published Jan 21 • 21

RealCustom: Narrowing Real Text Word for Real-Time Open-Domain Text-to-Image Customization

Paper • 2403.00483 • Published Mar 1 • 12
OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on

Paper • 2403.01779 • Published Mar 4 • 28
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers

Paper • 2401.11605 • Published Jan 21 • 21
FiT: Flexible Vision Transformer for Diffusion Model

Paper • 2402.12376 • Published Feb 19 • 48

Image Generation

Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers

Paper • 2401.11605 • Published Jan 21 • 21
Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs

Paper • 2401.11708 • Published Jan 22 • 29
Multi-LoRA Composition for Image Generation

Paper • 2402.16843 • Published Feb 26 • 28
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction

Paper • 2404.02905 • Published Apr 3 • 64

FreGrad: Lightweight and Fast Frequency-aware Diffusion Vocoder

Paper • 2401.10032 • Published Jan 18 • 12
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models

Paper • 2401.04658 • Published Jan 9 • 24
FreeInit: Bridging Initialization Gap in Video Diffusion Models

Paper • 2312.07537 • Published Dec 12, 2023 • 26
TCNCA: Temporal Convolution Network with Chunked Attention for Scalable Sequence Processing

Paper • 2312.05605 • Published Dec 9, 2023 • 1

StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation

Paper • 2312.12491 • Published Dec 19, 2023 • 69
Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs

Paper • 2401.11708 • Published Jan 22 • 29
Training-Free Consistent Text-to-Image Generation

Paper • 2402.03286 • Published Feb 5 • 64
PALP: Prompt Aligned Personalization of Text-to-Image Models

Paper • 2401.06105 • Published Jan 11 • 46

One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning

Paper • 2306.07967 • Published Jun 13, 2023 • 24
Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation

Paper • 2306.07954 • Published Jun 13, 2023 • 113
TryOnDiffusion: A Tale of Two UNets

Paper • 2306.08276 • Published Jun 14, 2023 • 73
Seeing the World through Your Eyes

Paper • 2306.09348 • Published Jun 15, 2023 • 33

VideoBooth: Diffusion-based Video Generation with Image Prompts

Paper • 2312.00777 • Published Dec 1, 2023 • 21
MotionCtrl: A Unified and Flexible Motion Controller for Video Generation

Paper • 2312.03641 • Published Dec 6, 2023 • 20
GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation

Paper • 2312.04557 • Published Dec 7, 2023 • 12
DreamVideo: Composing Your Dream Videos with Customized Subject and Motion

Paper • 2312.04433 • Published Dec 7, 2023 • 9

generative-model-training

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

Paper • 2310.00426 • Published Sep 30, 2023 • 61
A Picture is Worth a Thousand Words: Principled Recaptioning Improves Image Generation

Paper • 2310.16656 • Published Oct 25, 2023 • 40
CommonCanvas: An Open Diffusion Model Trained with Creative-Commons Images

Paper • 2310.16825 • Published Oct 25, 2023 • 32
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers

Paper • 2401.11605 • Published Jan 21 • 21

The Chosen One: Consistent Characters in Text-to-Image Diffusion Models

Paper • 2311.10093 • Published Nov 16, 2023 • 57
NeuroPrompts: An Adaptive Framework to Optimize Prompts for Text-to-Image Generation

Paper • 2311.12229 • Published Nov 20, 2023 • 26
Diffusion Model Alignment Using Direct Preference Optimization

Paper • 2311.12908 • Published Nov 21, 2023 • 47
VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models

Paper • 2312.00845 • Published Dec 1, 2023 • 36

Generative Multiple Modality

Random Field Augmentations for Self-Supervised Representation Learning

Paper • 2311.03629 • Published Nov 7, 2023 • 6
TEAL: Tokenize and Embed ALL for Multi-modal Large Language Models

Paper • 2311.04589 • Published Nov 8, 2023 • 18
GENOME: GenerativE Neuro-symbOlic visual reasoning by growing and reusing ModulEs

Paper • 2311.04901 • Published Nov 8, 2023 • 7
Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models

Paper • 2311.06783 • Published Nov 12, 2023 • 26

Previous
1
2
Next

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs