Diffusion

xb-chang's Collections

updated Jul 22

Controlling Space and Time with Diffusion Models

Paper • 2407.07860 • Published Jul 10 • 16

Note A cascaded diffusion model for 4D novel view synthesis, conditioned on one or more images of a general scene plus a set of camera poses and timestamps. It is trained jointly on 3D (with camera pose), 4D (pose + time), and video (time but no pose) data.

DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents

Paper • 2407.03300 • Published Jul 3 • 11

Note Encoding a complex, potentially multimodal data distribution into a single continuous Gaussian distribution arguably represents an unnecessarily challenging learning problem. [I did not even understand the problem.]

Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion

Paper • 2407.01392 • Published Jul 1 • 39

Note [2R] This paper presents Diffusion Forcing, a new training paradigm where a diffusion model is trained to denoise a set of tokens with independent per-token noise levels.
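
To make "independent per-token noise levels" concrete, here is a minimal sketch of the noising step only. It assumes a DDPM-style variance-preserving forward process; the schedule `alphas_bar`, the function name `noise_tokens_independently`, and the toy shapes are illustrative assumptions, not the paper's code, and the paper's actual parameterization may differ.

```python
import numpy as np

def noise_tokens_independently(tokens, alphas_bar, rng):
    """Noise each token with its own randomly drawn noise level.

    tokens:     (T, D) array, a clean sequence of T tokens.
    alphas_bar: (K,) array, a hypothetical cumulative signal-retention schedule.
    Returns the noisy tokens, the per-token levels, and the noise that was added.
    """
    num_tokens = tokens.shape[0]
    # One noise level per token, drawn independently (not one level per sequence).
    levels = rng.integers(0, len(alphas_bar), size=num_tokens)
    a = alphas_bar[levels][:, None]  # shape (T, 1), broadcasts over features
    eps = rng.standard_normal(tokens.shape)
    noisy = np.sqrt(a) * tokens + np.sqrt(1.0 - a) * eps
    return noisy, levels, eps

rng = np.random.default_rng(0)
alphas_bar = np.linspace(0.999, 0.01, 10)  # assumed toy schedule
tokens = rng.standard_normal((6, 4))
noisy, levels, eps = noise_tokens_independently(tokens, alphas_bar, rng)
```

A denoiser trained on such inputs would receive `noisy` together with `levels` and be asked to recover `eps` or the clean tokens; that training loop is omitted here.
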
No Training, No Problem: Rethinking Classifier-Free Guidance for Diffusion Models

Paper • 2407.02687 • Published Jul 2 • 22

Note [2R] Classifier-free guidance (CFG) has become the standard method for enhancing the quality of conditional diffusion models. However, employing CFG requires either training an unconditional model alongside the main diffusion model or modifying the training procedure by periodically inserting a null condition. The paper proposes independent condition guidance (ICG), which provides the benefits of CFG without any special training procedure.
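
As a rough illustration of the note above, the sketch below shows the usual CFG-style combination of two noise predictions, and an ICG-style variant in which the reference prediction comes from the same conditional model fed a condition chosen independently of the input, so no null-condition training is needed. This is my reading of the idea, not the paper's implementation: `toy_model`, `icg_step`, and the guidance-scale convention (scale 1 means no guidance) are all assumptions.

```python
import numpy as np

def guided_prediction(eps_cond, eps_ref, scale):
    """Extrapolate from a reference prediction toward the conditional one."""
    return eps_ref + scale * (eps_cond - eps_ref)

def toy_model(x, cond):
    """Hypothetical stand-in for a conditional denoiser; not a real network."""
    return 0.1 * x + cond

def icg_step(model, x, cond, random_cond, scale):
    """ICG-style step: the reference uses a condition unrelated to the input."""
    eps_cond = model(x, cond)
    eps_ref = model(x, random_cond)  # replaces the unconditional prediction
    return guided_prediction(eps_cond, eps_ref, scale)

x = np.ones(3)
out = icg_step(toy_model, x, cond=2.0, random_cond=-1.0, scale=3.0)
```

With standard CFG, `eps_ref` would instead come from an unconditional model or from the conditional model evaluated on a learned null condition.
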
pOps: Photo-Inspired Diffusion Operators

Paper • 2406.01300 • Published Jun 3 • 16

Note [2R] Utilizes the CLIP image embedding space for more visually oriented tasks, through methods such as IP-Adapter.
