Applications & models - Page 9
Explore the latest blogs about applications and models in the ROCm ecosystem, including machine learning frameworks, AI models, and application case studies.
QuickReduce: Up to 3x Faster All-reduce for vLLM and SGLang
QuickReduce speeds up LLM inference on AMD Instinct™ MI300X GPUs with inline-compressed all-reduce, cutting communication overhead by up to 3×.
Introducing AMD EVLM: Efficient Vision-Language Models with Parameter-Space Visual Conditioning
A novel approach that replaces visual tokens with perception-conditioned weights, reducing compute while maintaining strong vision-language performance.
DGL in the Real World: Running GNNs on Real Use Cases
We walk through four advanced GNN workloads, from heterogeneous e-commerce graphs to neuroscience applications, that we successfully ran using our DGL implementation.
Accelerating FastVideo on AMD GPUs with TeaCache
Enable ROCm support for FastVideo inference with TeaCache on AMD Instinct GPUs and accelerate video generation with optimized backends.
Wan2.2 Fine-Tuning: Tailoring an Advanced Video Generation Model on a Single GPU
Fine-tune Wan2.2 for video generation on a single AMD Instinct MI300X GPU with ROCm and DiffSynth.
All-in-One Video Editing with VACE on AMD Instinct GPUs
This blog showcases AMD hardware powering cutting-edge text-driven video editing models through an all-in-one solution.
Introducing Instella-Math: Fully Open Language Model with Reasoning Capability
Instella-Math is AMD’s 3B reasoning model, trained on 32 MI300X GPUs with open weights, optimized for logic, math, and chain-of-thought tasks.
AMD Hummingbird Image to Video: A Lightweight Feedback-Driven Model for Efficient Image-to-Video Generation
We present AMD Hummingbird, offering a two-stage distillation framework for efficient, high-quality text-to-video generation using compact models.
Accelerating Parallel Programming in Python with Taichi Lang on AMD GPUs
This blog provides a how-to guide on installing and programming with Taichi Lang on AMD Instinct GPUs.
Graph Neural Networks at Scale: DGL with ROCm on AMD Hardware
Accelerate graph deep learning on AMD GPUs with DGL and ROCm. Scale efficiently with open tools and optimized performance.
Benchmarking Reasoning Models: From Tokens to Answers
Learn how to benchmark reasoning tasks. Use Qwen3 and vLLM to test true reasoning performance, not just how fast words are generated.
Vibe Coding Pac-Man Inspired Game with DeepSeek-R1 and AMD Instinct MI300X
Learn LLM-powered game development using DeepSeek-R1 on AMD MI300X GPUs with iterative prompting, procedural generation, and VS Code AI tools.