HPC Blogs#
Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs
Explore how MI355X performs against B200 in vLLM benchmarks across DeepSeek-R1, GPT-OSS-120B, Qwen3-235B and Llama-3.3-70B.
Modernizing Taichi Lang to LLVM 20 for MI325X GPU Acceleration
Power your next AI application or graphics simulation with high-performance GPU/CPU computing in Python with Taichi Lang.
HPC Coding Agent - Part 1: Combining GLM-powered Cline and RAG Using MCP
Build an HPC RAG agent on AMD Instinct GPUs using GLM-4.6, Cline and ChromaDB.
Continuing the Momentum: Refining ROCm For The Next Wave Of AI and HPC
ROCm 7.1 builds on 7.0’s AI and HPC advances with faster performance, stronger reliability, and streamlined tools for developers and system builders.
Performance Profiling on AMD GPUs - Part 3: Advanced Usage
Part 3 of our GPU profiling series guides beginners through practical steps to identify and optimize kernel bottlenecks using ROCm tools
ROCm 7.9 Technology Preview: ROCm Core SDK and TheRock Build System
Introduce ROCm Core SDK, and learn to install and build ROCm components easily using TheRock.
Medical Imaging on MI300X: Optimized SwinUNETR for Tumor Detection
Learn how to setup, run and optimize SwinUNETR on AMD MI300X GPUs for fast medical imaging 3D segmentation of tumors using fast, large ROIs.
GPU Partitioning Made Easy: Pack More AI Workloads Using AMD GPU Operator
What’s New in AMD GPU Operator: Learn About GPU Partitioning and New Kubernetes Features
Matrix Core Programming on AMD CDNA™3 and CDNA™4 architecture
This blog post explains how to use Matrix Cores on CDNA3 and CDNA4 architecture, with a focus on low-precision data types such as FP16, FP8, and FP4
Optimizing Drug Discovery Tools on AMD MI300X Part 1: Molecular Design with REINVENT
Learn how to set up, run, and optimize REINVENT4, a molecular design tool, on AMD MI300X GPUs for faster drug discovery workflows
ROCm 7.0: An AI-Ready Powerhouse for Performance, Efficiency, and Productivity
Discover how ROCm 7.0 integrates AI across every layer, combining hardware enablement, frameworks, model support, and a suite of optimized tools
Performance Profiling on AMD GPUs – Part 2: Basic Usage
Part 2 of our GPU profiling series guides beginners through practical steps to identify and optimize kernel bottlenecks using ROCm tools