The article introduces Wan2.1, a suite of open video foundation models for tasks such as Text-to-Video (T2V) and Image-to-Video (I2V) generation. It highlights key features such as state-of-the-art (SOTA) performance, support for consumer-grade GPUs, support for multiple tasks, and an efficient video VAE. The I2V-14B model, which can generate 720P video, is noted for its strong performance across benchmarks.
HunyuanVideo is an open-source video generation model whose performance is reported as comparable or superior to that of leading closed-source models. Its features include a unified image and video generative architecture, a large language model used as the text encoder, and a causal 3D VAE for spatio-temporal compression.
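
To make "spatio-temporal compression" concrete, the sketch below computes the latent shape a causal 3D VAE would produce for a given clip. It is illustrative arithmetic only, not code from either project. The compression ratios (4x in time, 8x in space, 16 latent channels) are assumptions that do not appear in the summary above, and the function names are hypothetical. The "causal" part is modeled by encoding the first frame on its own, so a clip of `1 + 4k` frames maps to `1 + k` latent frames.

```python
def causal_vae_latent_shape(frames, height, width,
                            t_ratio=4, s_ratio=8, latent_channels=16):
    """Return (channels, latent_frames, latent_height, latent_width).

    Assumed ratios (not taken from the summary): 4x temporal, 8x spatial,
    16 latent channels. The first frame is encoded separately, which is
    what lets a causal VAE treat a single image as a 1-frame video.
    """
    if frames < 1:
        raise ValueError("need at least one frame")
    if (frames - 1) % t_ratio != 0:
        raise ValueError(f"frames must be 1 + a multiple of {t_ratio}")
    if height % s_ratio != 0 or width % s_ratio != 0:
        raise ValueError(f"height and width must be multiples of {s_ratio}")

    latent_frames = 1 + (frames - 1) // t_ratio
    return (latent_channels, latent_frames,
            height // s_ratio, width // s_ratio)


def compression_factor(frames, height, width, **ratios):
    """Ratio of RGB input values to latent values for one clip."""
    c, t, h, w = causal_vae_latent_shape(frames, height, width, **ratios)
    return (3 * frames * height * width) / (c * t * h * w)


if __name__ == "__main__":
    # A 129-frame 720P clip under the assumed ratios.
    print(causal_vae_latent_shape(129, 720, 1280))
    print(round(compression_factor(129, 720, 1280), 1))
```

Under these assumed ratios, a single image and a video go through the same encoder. This is one way to read the "unified image and video generative architecture" mentioned above.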