Posts by Sopiko Kurdadze

A Simple Design for Serving Video Generation Models with Distributed Inference

Video generation is entering a new era, powered by diffusion models that deliver photorealistic and temporally consistent results from text prompts. Models like Wan2.2 push the boundaries of what’s possible in AI-generated content, but to make them practical, inference performance needs to scale in real-world terms: handling more simultaneous users, keeping response times reasonable, and efficiently using multiple GPUs or compute nodes.



Accelerating FastVideo on AMD GPUs with TeaCache

Video generation is entering a new era, powered by diffusion models that deliver photorealistic and temporally consistent results from text prompts. Models like Wan2.1 push the boundaries of what’s possible in AI-generated content, but to unlock their full potential, inference performance must scale with both model complexity and hardware capabilities.
