HPC - Applications & Models - Page 2
LLM Quantization with Quark on AMD GPUs: Accuracy and Performance Evaluation
Learn how to use Quark to apply FP8 quantization to LLMs on AMD GPUs, and evaluate accuracy and performance using vLLM and SGLang on AMD MI300X GPUs.
Seismic stencil codes - part 3
In the last two blog posts, we developed a HIP kernel capable of computing the high-order finite differences commonly needed in seismic wave propagation.
Seismic stencil codes - part 1
Seismic workloads in the HPC space have a long history of being powered by high-order finite difference methods on structured grids. This trend continues to this day.
Seismic stencil codes - part 2
Recall from the previous post that the kernel with stencil computation in the z-direction suffered from low effective bandwidth. This low performance comes from generating substantial amounts of data movement to global memory.
Using statistical methods to reliably compare algorithm performance in large generative AI models with JAX Profiler on AMD GPUs
Sparse matrix vector multiplication - part 1
Jacobi Solver with HIP and OpenMP offloading
Finite difference method - Laplacian Part 1
Finite difference method - Laplacian part 4
Finite difference method - Laplacian part 3
Finite difference method - Laplacian part 2