Video Model Challenges

Pre-training video generation models is extremely GPU-intensive and requires advanced hardware such as H100 GPUs. Some capabilities, such as realistic walking, only emerge at higher parameter scales, yet models like Mochi 1 strike a balance between accessibility and power and can run on consumer-grade GPUs. Because video generation involves long sequence lengths, the computational cost grows as the sequence lengthens, much as it does when a language model generates token after token.
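To make the sequence-length point concrete, here is a rough back-of-the-envelope sketch in Python. It is not taken from the clip or from any particular model: the function names and the compression strides (4x in time, 16x in each spatial dimension) are hypothetical values chosen only for illustration. It counts how many latent tokens a clip turns into and how many token pairs full self-attention would have to compare.

```python
def video_token_count(frames, height, width, t_stride=4, s_stride=16):
    """Estimate the number of latent tokens for a video clip.

    t_stride and s_stride are hypothetical compression factors
    (temporal and spatial); real models use their own values.
    """
    t = -(-frames // t_stride)   # ceiling division over time
    h = height // s_stride       # latent rows
    w = width // s_stride        # latent columns
    return t * h * w


def attention_pair_count(tokens):
    """Full self-attention compares every token with every token."""
    return tokens * tokens


if __name__ == "__main__":
    for frames in (32, 64, 128):
        n = video_token_count(frames, 256, 256)
        print(frames, "frames:", n, "tokens,", attention_pair_count(n), "attention pairs")
```

Under these assumptions, doubling the number of frames doubles the token count but quadruples the number of attention pairs, which is one reason longer clips become expensive so quickly.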

From this podcast
Practical AI
Video generation with realistic motion
