EvalCrafter: Benchmarking and Evaluating Large Video Generation Models
Abstract
Vision and language generative models have grown rapidly in recent years. For video generation, various open-source models and publicly available services have been released that generate videos of high visual quality. However, these methods are often evaluated with only a few academic metrics, for example FVD or IS. We argue that it is hard to judge large conditional generative models with such simple metrics, since these models are often trained on very large datasets and have multi-aspect abilities. Thus, we propose a new framework and pipeline to exhaustively evaluate the quality of generated videos. To achieve this, we first construct a new prompt list for text-to-video generation by analyzing real-world prompts with the help of a large language model. Then, we evaluate state-of-the-art video generative models on our carefully designed benchmark in terms of visual quality, content quality, motion quality, and text-caption alignment, using around 18 objective metrics. To obtain the final leaderboard of the models, we also fit a series of coefficients to align the objective metrics with users' opinions. Based on the proposed opinion alignment method, our final score shows a higher correlation with user opinions than simply averaging the metrics, demonstrating the effectiveness of the proposed evaluation method.
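The opinion-alignment step can be illustrated with a small sketch. The abstract does not say how the coefficients are fitted, so the ordinary least-squares fit below is an assumption, not the paper's method. The data is synthetic stand-in data, and every name (`fit_alignment_weights`, `hidden_w`, the four metric columns) is hypothetical.

```python
import numpy as np


def fit_alignment_weights(metrics, opinions):
    """Least-squares fit of one coefficient per metric, plus a bias, to user scores."""
    # Append a column of ones so the fit includes an intercept term.
    design = np.column_stack([metrics, np.ones(len(metrics))])
    coef, *_ = np.linalg.lstsq(design, opinions, rcond=None)
    return coef[:-1], coef[-1]


def pearson(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])


# Synthetic stand-in data (NOT from the paper): 200 videos scored by 4 metrics.
rng = np.random.default_rng(0)
n_videos, n_metrics = 200, 4
metrics = rng.normal(size=(n_videos, n_metrics))

# Assumption for illustration: users weight the metrics unequally.
hidden_w = np.array([0.7, 0.1, 0.0, 0.2])
opinions = metrics @ hidden_w + 0.1 * rng.normal(size=n_videos)

weights, bias = fit_alignment_weights(metrics, opinions)
aligned_score = metrics @ weights + bias   # opinion-aligned final score
average_score = metrics.mean(axis=1)       # baseline: plain metric average

corr_aligned = pearson(aligned_score, opinions)
corr_average = pearson(average_score, opinions)
print(f"aligned: {corr_aligned:.3f}  average: {corr_average:.3f}")
```

On the data used for fitting, a least-squares score with an intercept can never correlate worse with the target than a plain average, because the average is itself one possible linear combination of the metrics. A meaningful comparison would therefore measure both correlations on held-out user ratings.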
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning (2023)
- VideoGen: A Reference-Guided Latent Diffusion Approach for High Definition Text-to-Video Generation (2023)
- Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack (2023)
- MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens (2023)
- LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models (2023)