arxiv:2403.13248

Mora: Enabling Generalist Video Generation via A Multi-Agent Framework

Published on Mar 20
· Submitted by akhaliq on Mar 21
#1 Paper of the day

Abstract

Sora is the first large-scale generalist video generation model to garner significant attention across society. Since its launch by OpenAI in February 2024, no other video generation model has matched Sora's performance or its capacity to support a broad spectrum of video generation tasks. Additionally, only a few video generation models have been fully published, and the majority are closed-source. To address this gap, this paper proposes Mora, a new multi-agent framework that incorporates several advanced visual AI agents to replicate the generalist video generation demonstrated by Sora. In particular, Mora can use multiple visual agents to mimic Sora's video generation capabilities in various tasks, such as (1) text-to-video generation, (2) text-conditional image-to-video generation, (3) extending generated videos, (4) video-to-video editing, (5) connecting videos, and (6) simulating digital worlds. Our extensive experimental results show that Mora achieves performance close to that of Sora in various tasks. However, an obvious performance gap remains between our work and Sora when assessed holistically. In summary, we hope this project can guide the future trajectory of video generation through collaborative AI agents.

Community

Looking forward to the full release 🚀

·
Paper author

We will release it in a short time. Thank you very much!

Hello, there is a typo in the Mora paper.
In the conclusion, you have repeated the following sentence twice:
"Our thorough evaluation reveals that Mora not only competes with but also exceeds the capabilities of current leading models in certain areas. Our thorough evaluation reveals that Mora not only competes with but also exceeds the capabilities of current leading models in certain areas."

·
Paper author

Noticed. Thank you for sharing.

Awesome, can't wait to try this out! When is the full release coming out?

Revolutionizing Video Generation: Mora's Multi-Agent Framework Explained

Links 🔗:

👉 Subscribe: https://www.youtube.com/@Arxflix
👉 Twitter: https://x.com/arxflix
👉 LMNT (Partner): https://lmnt.com/

By Arxflix
