Posts in Software tools & optimizations
Reading AMDGCN ISA
- 13 May 2024
For an application developer, it is often helpful to read the Instruction Set Architecture (ISA) of the GPU architecture that performs the application's computations. Understanding the instructions in the code regions of interest can help in debugging the application and optimizing its performance.
AMD in Action: Unveiling the Power of Application Tracing and Profiling
- 07 May 2024
Rocprof is a robust tool designed to analyze and optimize the performance of HIP programs on AMD ROCm platforms, helping developers pinpoint and resolve performance bottlenecks. Rocprof provides a variety of profiling data, including performance counters, hardware traces, and runtime API/activity traces.
Application portability with HIP
- 26 April 2024
Many scientific applications run on AMD-equipped computing platforms and supercomputers, including Frontier, the first Exascale system in the world. These applications, coming from a myriad of science domains, were ported to run on AMD GPUs using the Heterogeneous-compute Interface for Portability (HIP) abstraction layer. HIP enables these High-Performance Computing (HPC) facilities to transition their CUDA codes to run on, and take advantage of, the latest AMD GPUs. The effort involved in porting these scientific applications varies from a few hours to a few weeks and largely depends on the complexity of the original source code. Figure 1 shows several examples of applications that have been ported and the corresponding porting effort.
C++17 parallel algorithms and HIPSTDPAR
- 18 April 2024
The C++17 standard added the concept of parallel algorithms to the pre-existing C++ Standard Library. The parallel versions of algorithms like std::transform maintain the same signature as the regular serial versions, except for the addition of an extra parameter specifying the execution policy to use. This flexibility allows users who are already using the C++ Standard Library algorithms to take advantage of multi-core architectures by introducing only minimal changes to their code.
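As a minimal sketch of that idea, the two functions below apply the same std::transform call, once in its serial form and once with an execution policy passed as the first argument. The names scale_serial and scale_with_policy are hypothetical helpers chosen for illustration; they do not come from the post itself.

```cpp
#include <algorithm>
#include <execution>
#include <utility>
#include <vector>

// Serial form: the familiar std::transform signature.
// (scale_serial is a hypothetical helper name used for illustration.)
std::vector<double> scale_serial(const std::vector<double>& in, double factor) {
    std::vector<double> out(in.size());
    std::transform(in.begin(), in.end(), out.begin(),
                   [factor](double x) { return factor * x; });
    return out;
}

// Parallel-algorithm form: identical arguments, preceded by an execution
// policy such as std::execution::seq, std::execution::par, or
// std::execution::par_unseq.
// (scale_with_policy is likewise a hypothetical helper name.)
template <class Policy>
std::vector<double> scale_with_policy(Policy&& policy,
                                      const std::vector<double>& in,
                                      double factor) {
    std::vector<double> out(in.size());
    std::transform(std::forward<Policy>(policy),
                   in.begin(), in.end(), out.begin(),
                   [factor](double x) { return factor * x; });
    return out;
}
```

Calling scale_with_policy(std::execution::par, data, 2.0) asks the library to run the transform in parallel, while the rest of the call is unchanged from the serial version. Depending on the standard library implementation, the parallel policies may need an additional backend library at link time.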
Programming AMD GPUs with Julia
- 16 April 2024
Julia is a high-level, general-purpose dynamic programming language that automatically compiles to efficient native code via LLVM and supports multiple platforms. With LLVM comes support for programming GPUs, including AMD GPUs.
Affinity part 2 - System topology and controlling affinity
- 16 April 2024
In Part 1 of the Affinity blog series, we looked at the importance of setting affinity for High Performance Computing (HPC) workloads. In this blog post, our goals are the following:
Affinity part 1 - Affinity, placement, and order
- 16 April 2024
Modern hardware architectures are increasingly complex, with multiple sockets, many cores in each Central Processing Unit (CPU), Graphics Processing Units (GPUs), memory controllers, Network Interface Cards (NICs), etc. Peripherals such as GPUs or memory controllers will often be local to a CPU socket. Such designs present interesting challenges in optimizing memory access times, data transfer times, etc. Depending on how the system is built, how its hardware components are connected, and the workload being run, it may be advantageous to use the resources of the system in a specific way. In this article, we will discuss the role of affinity, placement, and order in improving performance for High Performance Computing (HPC) workloads. A short case study is also presented to familiarize you with performance considerations on a node in the Frontier supercomputer. In a follow-up article, we also aim to equip you with the tools you need to understand your system’s hardware topology and set up affinity for your application accordingly.
Creating a PyTorch/TensorFlow code environment on AMD GPUs
- 11 September 2023
Note: This blog was previously part of the AMD lab notes blog series.
GPU-aware MPI with ROCm
- 08 June 2023
Note: This blog was previously part of the AMD lab notes blog series.
Register pressure in AMD CDNA™2 GPUs
- 17 May 2023
Note: This blog was previously part of the AMD lab notes blog series.
Introduction to profiling tools for AMD hardware
- 12 April 2023
Note: This blog was previously part of the AMD lab notes blog series.
AMD Instinct™ MI200 GPU memory space overview
- 09 March 2023
Note: This blog was previously part of the AMD lab notes blog series.
AMD ROCm™ installation
- 26 January 2023
Note: This blog was previously part of the AMD lab notes blog series.
AMD matrix cores
- 14 November 2022
Note: This blog was previously part of the AMD lab notes blog series.