

Collections including paper arxiv:2307.06304

MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training

Paper • 2311.17049 • Published Nov 28, 2023
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

Paper • 2405.04434 • Published May 7, 2024 • 13
A Study of Autoregressive Decoders for Multi-Tasking in Computer Vision

Paper • 2303.17376 • Published Mar 30, 2023
Sigmoid Loss for Language Image Pre-Training

Paper • 2303.15343 • Published Mar 27, 2023 • 4

Photorealistic Video Generation with Diffusion Models

Paper • 2312.06662 • Published Dec 11, 2023 • 23
Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution

Paper • 2307.06304 • Published Jul 12, 2023 • 26


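Sora Reference Papers (Sora参考论文)

The reference papers listed at the end of OpenAI's "Video generation models as world simulators" technical report, 32 in total. OpenAI's ImageGPT and DALL·E 3 papers are missing from the collection; links to them have been added in the note.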

Unsupervised Learning of Video Representations using LSTMs

Paper • 1502.04681 • Published Feb 16, 2015 • 1
Recurrent Environment Simulators

Paper • 1704.02254 • Published Apr 7, 2017 • 1
World Models

Paper • 1803.10122 • Published Mar 27, 2018 • 1
Generating Videos with Scene Dynamics

Paper • 1609.02612 • Published Sep 8, 2016 • 1

Sora Reference Papers

A collection of all papers referenced in OpenAI's "Video generation models as world simulators" technical report • openai.com/sora

Unsupervised Learning of Video Representations using LSTMs

Paper • 1502.04681 • Published Feb 16, 2015 • 1
Recurrent Environment Simulators

Paper • 1704.02254 • Published Apr 7, 2017 • 1
World Models

Paper • 1803.10122 • Published Mar 27, 2018 • 1
Generating Videos with Scene Dynamics

Paper • 1609.02612 • Published Sep 8, 2016 • 1
