HPC Blogs
FlyDSL: Expert GPU Kernel Development with the Ease of MLIR Python Native DSL on AMD GPUs
FlyDSL is a Python-first, MLIR-native DSL for expert GPU kernel development and tuning on AMD GPUs.
Introducing hipThreads: A C++-Style Concurrency Library for AMD GPUs
Discover how hipThreads lets you write hip::thread just like std::thread and unlock GPU acceleration with minimal code changes.
ROCm 7.2: Smarter, Faster, and More Scalable for Modern AI Workloads
We highlight the latest ROCm 7.2 enhancements for AMD Instinct GPUs, designed to boost AI and HPC performance.
Applying Compute Partitioning for Workloads on MI300X GPUs
Learn how to boost MI300X performance using GPU compute partitioning for parallel workloads like GROMACS and REINVENT.
Installing AMD HIP-Enabled GROMACS on HPC Systems: A LUMI Supercomputer Case Study
Introducing the AMD Network Operator v1.0.0: Simplifying High-Performance Networking for AMD Platforms
Introducing the AMD Network Operator for automating high-performance AI NIC networking in Kubernetes for AI and HPC workloads.
Medical Imaging on MI300X: SwinUNETR Inference Optimization
A practical guide to optimizing SwinUNETR inference on AMD Instinct™ MI300X GPUs for fast 3D segmentation of tumors in medical imaging.
Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs
Explore how MI355X performs against B200 in vLLM benchmarks across DeepSeek-R1, GPT-OSS-120B, Qwen3-235B and Llama-3.3-70B.
Modernizing Taichi Lang to LLVM 20 for MI355X GPU Acceleration
Power your next AI application or graphics simulation with high-performance GPU/CPU computing in Python with Taichi Lang.
HPC Coding Agent - Part 1: Combining GLM-powered Cline and RAG Using MCP
Build an HPC RAG agent on AMD Instinct GPUs using GLM-4.6, Cline and ChromaDB.
Continuing the Momentum: Refining ROCm For The Next Wave Of AI and HPC
ROCm 7.1 builds on 7.0’s AI and HPC advances with faster performance, stronger reliability, and streamlined tools for developers and system builders.
Performance Profiling on AMD GPUs - Part 3: Advanced Usage
Part 3 of our GPU profiling series guides beginners through practical steps to identify and optimize kernel bottlenecks using ROCm tools.