AI Blogs

Introducing AMD EVLM: Efficient Vision-Language Models with Parameter-Space Visual Conditioning
A novel approach that replaces visual tokens with perception-conditioned weights, reducing compute while maintaining strong vision-language performance.

Primus: A Lightweight, Unified Training Framework for Large Models on AMD GPUs
Primus streamlines LLM training on AMD GPUs with unified configs, multi-backend support, preflight validation, and structured logging.

DGL in the Real World: Running GNNs on Real Use Cases
We walk through four advanced GNN workloads, from heterogeneous e-commerce graphs to neuroscience applications, that we successfully ran using our DGL implementation.

All-in-One Video Editing with VACE on AMD Instinct GPUs
This blog showcases AMD hardware powering cutting-edge text-driven video editing models through an all-in-one solution.

Day 0 Developer Guide: Running the Latest Open Models from OpenAI on AMD AI Hardware
Day 0 support across our AI hardware ecosystem, from our flagship AMD Instinct™ MI355X and MI300X GPUs to AMD Radeon™ AI PRO R9700 GPUs and AMD Ryzen™ AI processors.

Unlocking GPU-Accelerated Containers with the AMD Container Toolkit
Simplify GPU acceleration in containers with the AMD Container Toolkit—streamlined setup, runtime hooks, and full ROCm integration.

AMD ROCm: Powering the World's Fastest Supercomputers
Discover how ROCm drives the world’s top supercomputers, from El Capitan to Frontier, and why it's shaping the future of scalable, open, and sustainable HPC.

The ROCm Revisited Series
We present our ROCm Revisited Series. Discover ROCm's role in leading-edge supercomputing and its growing ecosystem, from HIP to developer tools, powering AI, HPC, and data science across multi-GPU and cluster systems.

Accelerating FastVideo on AMD GPUs with TeaCache
Enabling ROCm support for FastVideo inference using TeaCache on AMD Instinct GPUs, accelerating video generation with optimized backends

Wan2.2 Fine-Tuning: Tailoring an Advanced Video Generation Model on a Single GPU
Fine-tune Wan2.2 for video generation on a single AMD Instinct MI300X GPU with ROCm and DiffSynth.

Introducing Instella-Math: Fully Open Language Model with Reasoning Capability
Instella-Math is AMD’s 3B reasoning model, trained on 32 MI300X GPUs with open weights, optimized for logic, math, and chain-of-thought tasks.

AMD Hummingbird Image to Video: A Lightweight Feedback-Driven Model for Efficient Image-to-Video Generation
We present AMD Hummingbird, a two-stage distillation framework for efficient, high-quality image-to-video generation using compact models.

Running ComfyUI on AMD Instinct
This blog shows how to deploy ComfyUI on AMD Instinct GPUs, explaining what ComfyUI is and how it works.

Running ComfyUI in Windows with ROCm on WSL
Run ComfyUI on Windows with ROCm and WSL to harness Radeon GPU power for local AI tasks like Stable Diffusion—no dual-boot needed

GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks
AMD introduces GEAK, an AI agent for generating optimized Triton GPU kernels, achieving up to 63% accuracy and up to 2.59× speedups on MI300X GPUs.

Avoiding LDS Bank Conflicts on AMD GPUs Using CK-Tile Framework
This blog shows how CK-Tile’s XOR-based swizzle optimizes shared memory access in GEMM kernels on AMD GPUs by eliminating LDS bank conflicts

Stay informed
- Subscribe to our RSS feed (requires an RSS reader, available as a browser plugin.)
- Sign up for the ROCm newsletter
- View our blog statistics