HPC - Applications & Models

Running SwinUNETR on AMD MI300X GPUs
Learn how to set up, run, and optimize SwinUNETR on AMD MI300X GPUs for fast 3D segmentation of tumors in medical imaging using large ROIs.

Optimizing Drug Discovery Tools on AMD MI300X Part 1: Molecular Design with REINVENT
Learn how to set up, run, and optimize REINVENT4, a molecular design tool, on AMD MI300X GPUs for faster drug discovery workflows.

LLM Quantization with Quark on AMD GPUs: Accuracy and Performance Evaluation
Learn how to use Quark to apply FP8 quantization to LLMs, and evaluate accuracy and performance using vLLM and SGLang on AMD MI300X GPUs.

Seismic stencil codes - part 1
Seismic workloads in the HPC space have a long history of being powered by high-order finite difference methods on structured grids. This trend continues to this day.

Seismic stencil codes - part 2
Recall from the previous post that the kernel with stencil computation in the z-direction suffered from low effective bandwidth. This low performance comes from generating substantial amounts of data movement to global memory.

Seismic stencil codes - part 3
In the last two blog posts, we developed a HIP kernel capable of computing the high-order finite differences commonly needed in seismic wave propagation.

Graph analytics on AMD GPUs using Gunrock

Using statistical methods to reliably compare algorithm performance in large generative AI models with JAX Profiler on AMD GPUs

Mamba on AMD GPUs with ROCm
Best practices for using Mamba on AMD GPUs with ROCm.

Sparse matrix vector multiplication - part 1