TI2V Research

Norm's Collections

updated about 23 hours ago

CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer

Paper • 2408.06072 • Published Aug 12 • 35

AtomoVideo: High Fidelity Image-to-Video Generation

Paper • 2403.01800 • Published Mar 4 • 20

DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion

Paper • 2411.04928 • Published 5 days ago • 37

Note
1. The initial steps of the denoising process are critical for defining the generated video. In the base model, temporal and spatial alterations occur simultaneously, creating a unified evolution across both dimensions.
2. Spatial information is constructed earlier than temporal information. Specifically, with S-Director, the attention maps reveal that the structural outlines of the final video appear much earlier than with temporal control.
3. concatenate the noisy latent