Like the previous articles in the profiling guide series (Part 1, Part 2, and Part 3),
this blog post is designed to help you systematically analyze and improve the performance of your
Fortran OpenMP offload applications running on AMD GPUs. It builds on the foundational skills
from those articles and introduces profiling techniques tailored to Fortran applications
that use OpenMP target offloading.
It is often helpful for application developers to read the Instruction
Set Architecture (ISA) of the GPU architecture that performs their
computations. Understanding the instructions in the code regions of
interest can help when debugging the application and optimizing its
performance.