AMD ROCm™ Blogs

Featured Posts

Day 0 Developer Guide: Running the Latest Open Models from OpenAI on AMD AI Hardware

Day 0 support across our AI hardware ecosystem from our flagship AMD InstinctTM MI355X and MI300X GPUs, AMD Radeon™ AI PRO R700 GPUs and AMD Ryzen™ AI Processors

August 05, 2025 by Andy Luo, Shekhar Pandey, Hongxia Yang, Mahdi Ghodsi, Charles Yang, Niles Burbank, George Wang, Kailash Gogineni, Xun Wang, Zhenyu Gu, Yao Fu, Yanyuan Qin, Anshul Gupta

Benchmarking Reasoning Models: From Tokens to Answers

Learn how to benchmark reasoning tasks. Use Qwen3 and vLLM to test true reasoning performance, not just how fast words are generated.

July 24, 2025 by Dominic Widdows

Introducing ROCm-LS: Accelerating Life Science Workloads with AMD Instinct™ GPUs

Accelerate life science and medical workloads with ROCm-LS, AMDs GPU-optimized toolkit for faster multidimensional image processing and vision.

July 18, 2025 by Soumitra Chatterjee, Karthik Kashyap Thatipamula, Deeksha Goplani, Ish Kool, Anik Chaudhuri, Vikas C Sajjan, Marco Grond

Introducing Instella-Long: A Fully Open Language Model with Long-Context Capability

Learn about Instella-Long: AMD’s open 3B language model supporting 128K context, trained on MI300X GPUs, outperforming peers on long-context benchmarks.

June 11, 2025 by Jialian Wu, Jiang Liu, Sudhanshu Ranjan, Xiaodong Yu, Gowtham Ramesh, Prakamya Mishra, Zicheng Liu, Yusheng Su, Ximeng Sun, Ze Wang, Emad Barsoum

Introducing AMD EVLM: Efficient Vision-Language Models with Parameter-Space Visual Conditioning

A novel approach that replaces visual tokens with perception-conditioned weights, reducing compute while maintaining strong vision-language performance.

August 22, 2025 by Zhenhua Liu, Xuanwu Yin, Dong Li, Emad Barsoum

Primus: A Lightweight, Unified Training Framework for Large Models on AMD GPUs

Primus streamlines LLM training on AMD GPUs with unified configs, multi-backend support, preflight validation, and structured logging.

August 22, 2025 by Wen Xie, Yao Fu, Xiaoming Peng, Xiaobo Chen, Liz Li, Vidushi Goyal, Anshul Gupta

DGL in the Real World: Running GNNs on Real Use Cases

We walk through four advanced GNN workloads from heterogeneous e-commerce graphs to neuroscience applications that we successfully ran using our DGL implementation.

August 20, 2025 by Mukhil Azhagan Mallaiyan Sathiaseelan, Anuya Welling, James Smith, Yao Liu, Phani Vaddadi, Vish Vadlamani

All-in-One Video Editing with VACE on AMD Instinct GPUs

This blog showcases AMD hardware powering cutting-edge text-driven video editing models through an all-in-one solution.

August 19, 2025 by Johanna Yang, Kristoffer Peyron

Ecosystems & Partners

Unlocking GPU-Accelerated Containers with the AMD Container Toolkit

Simplify GPU acceleration in containers with the AMD Container Toolkit—streamlined setup, runtime hooks, and full ROCm integration.

July 03, 2025 by Abhishek Patil

AMD ROCm: Powering the World's Fastest Supercomputers

Discover how ROCm drives the world’s top supercomputers, from El Capitan to Frontier, and why its shaping the future of scalable, open and sustainable HPC

June 10, 2025 by Mohammed Faraaz Mustafa, Saad Rahim

The ROCm Revisited Series

We present our ROCm Revisited Series. Discover ROCm's role in leading edge supercomputing, its growing ecosystem-from HIP, to developer tools-powering AI, HPC, and data science across multi-GPU and cluster systems

June 06, 2025 by Mohammed Faraaz Mustafa, Liam Berry, Saad Rahim

ROCm Revisited: Getting Started with HIP

New to HIP? This blog will introduce you to the HIP runtime API, its key concepts and installation and practical code examples to showcase its functionality.

June 06, 2025 by Liam Berry, Mohammed Faraaz Mustafa, Saad Rahim

Applications & Models

Accelerating FastVideo on AMD GPUs with TeaCache

Enabling ROCm support for FastVideo inference using TeaCache on AMD Instinct GPUs, accelerating video generation with optimized backends

August 19, 2025 by Sopiko Kurdadze

Wan2.2 Fine-Tuning: Tailoring an Advanced Video Generation Model on a Single GPU

Fine-tune Wan2.2 for video generation on a single AMD Instinct MI300X GPU with ROCm and DiffSynth.

August 19, 2025 by Arttu Niemela, Balazs Toth

Introducing Instella-Math: Fully Open Language Model with Reasoning Capability

Instella-Math is AMD’s 3B reasoning model, trained on 32 MI300X GPUs with open weights, optimized for logic, math, and chain-of-thought tasks.

August 09, 2025 by Xiaodong Yu, Jiang Liu, Yusheng Su, Gowtham Ramesh, Zicheng Liu, Prakamya Mishra, Sudhanshu Ranjan, Jialian Wu, Ximeng Sun, Ze Wang, Emad Barsoum

AMD Hummingbird Image to Video: A Lightweight Feedback-Driven Model for Efficient Image-to-Video Generation

We present AMD Hummingbird, offering a two-stage distillation framework for efficient, high-quality text-to-video generation using compact models.

August 03, 2025 by Takashi Isobe, Dong zhou, He Cui, Mengmeng Ge, Dong Li, Emad Barsoum

Software Tools & Optimizations

Stay informed

Subscribe to our RSS feed (Requires an RSS reader available as browser plugins.)
Signup for the ROCm newsletter
View our blog statistics
View the ROCm Developer Hub
Report an issue or request a feature
We are eager to learn from our community! If you would like to contribute to the ROCm Blogs, please submit your technical blog for review at our GitHub. Blog creation can be started through our GitHub issues form.

Contents