Software Tools and Optimizations - Page 2
Discover the latest blogs about ROCm software tools, libraries, and performance optimizations to help you get the most out of your AMD hardware.

SGLang: Fast Serving Framework for Large Language and Vision-Language Models on AMD Instinct GPUs
Discover SGLang, a fast serving framework for large language and vision-language models on AMD GPUs that offers an efficient runtime and a flexible programming interface.

Getting to Know Your GPU: A Deep Dive into AMD SMI
This post introduces AMD System Management Interface (amd-smi), explaining how you can use it to access your GPU’s performance and status data.

Presenting and demonstrating the use of the ROCm Offline Installer Creator, a tool enabling simple deployment of ROCm in disconnected settings such as high-security environments and air-gapped networks.

TensorFlow Profiler in practice: Optimizing TensorFlow models on AMD GPUs
TensorFlow Profiler measures resource use and performance of models, helping identify bottlenecks for optimization. This blog demonstrates the use of the TensorFlow Profiler tool on AMD hardware.

SmoothQuant model inference on AMD Instinct MI300X using Composable Kernel

AMD in Action: Unveiling the Power of Application Tracing and Profiling

C++17 parallel algorithms and HIPSTDPAR

Affinity part 1 - Affinity, placement, and order
Affinity Part 1

Affinity part 2 - System topology and controlling affinity
Affinity Part 2

Creating a PyTorch/TensorFlow code environment on AMD GPUs
Creating a PyTorch/TensorFlow environment on AMD GPUs