Developers - Software Tools & Optimizations

February 23, 2026

Primus-Pipeline: A More Flexible and Scalable Pipeline Parallelism Implementation

Learn how to use our flexible and scalable pipeline parallelism framework with Primus backend and AMD hardware.

./software-tools-optimization/primus-pipeline/README.html

February 20, 2026

FlyDSL: Expert GPU Kernel Development with the Ease of MLIR Python Native DSL on AMD GPUs

FlyDSL is a Python-first, MLIR-native DSL for expert GPU kernel development and tuning on AMD GPUs.

./software-tools-optimization/flydsl-python-native/README.html

February 19, 2026

Introducing hipThreads: A C++-Style Concurrency Library for AMD GPUs

Discover how hipThreads lets you write hip::thread just like std::thread and unlock GPU acceleration with minimal code changes.

./software-tools-optimization/hipthreads-introduction/README.html

January 30, 2026

Debugging NaN Results in CK Tile GEMM: A rocgdb Detective Story

Learn GPU kernel debugging with rocgdb through a real case: tracing NaN outputs to a one-character typo in CK Tile GEMM.

./software-tools-optimization/rocgdb-ck-tile/README.html

January 22, 2026

ROCm 7.2: Smarter, Faster, and More Scalable for Modern AI Workloads

We highlight the latest ROCm 7.2 enhancements for AMD Instinct GPUs, designed to boost AI and HPC performance.

./software-tools-optimization/rocm7.2/README.html

January 21, 2026

ROCm Becomes a First-Class Platform in the vLLM Ecosystem

ROCm is now a first-class vLLM platform: official wheels + Docker, stronger CI, and faster LLM & multimodal inference on AMD Instinct GPUs.

./software-tools-optimization/vllm-omni/README.html

January 02, 2026

Accelerating Multimodal Inference in vLLM: The One-Line Optimization for Large Multimodal Models

Learn how to optimize multimodal model inference with batch-level data parallelism for vision encoders in vLLM, achieving up to 45% throughput gains on AMD MI300X.

./software-tools-optimization/vllm-dp-vision/README.html

December 23, 2025

GEAK HIP: Expanding GEAK for HIP Code Optimization

Explore the GEAK framework's AI-driven HIP code optimization for improved performance on AMD GPUs, including speedup examples and benefits for AI workloads.

./software-tools-optimization/geak-hip-optimizations/README.html