AI - Software Tools & Optimizations - Page 3

March 02, 2026

MaxText-Slurm: Production-Grade LLM Training with Built-In Observability

MaxText-Slurm: A unified launch system for production-grade LLM training with observability on AMD GPU clusters.

./software-tools-optimization/maxtext-slurm/README.html

February 24, 2026

JAX-AITER: Bringing AMD’s Optimized AI Kernels to JAX on ROCm™

Use JAX-AITER to run AMD’s AITER-optimized AI kernels from JAX on AMD ROCm, starting with faster multi-head attention and expanding to more ops.

./software-tools-optimization/jax-aiter/README.html

February 24, 2026

Getting Started with AMD Resource Manager: Efficient Sharing of AMD Instinct™ GPUs for R&D Teams and AI Practitioners

Learn how to use the AMD Resource Manager with this step-by-step guide to setting up projects, sharing compute resources, and monitoring resource utilization.

./software-tools-optimization/amd-resource-manager/README.html

February 23, 2026

Primus-Pipeline: A More Flexible and Scalable Pipeline Parallelism Implementation

Learn how to use our flexible and scalable pipeline parallelism framework with the Primus backend on AMD hardware.

./software-tools-optimization/primus-pipeline/README.html

February 20, 2026

FlyDSL: Expert GPU Kernel Development with the Ease of MLIR Python Native DSL on AMD GPUs

FlyDSL is a Python-first, MLIR-native DSL for expert GPU kernel development and tuning on AMD GPUs.

./software-tools-optimization/flydsl-python-native/README.html

February 17, 2026

Advanced MXFP4 Quantization: Combining Fine-Tuned Rotations with SmoothQuant for Near-Lossless Compression

A showcase of advanced algorithms available in AMD Quark for efficient MXFP4 quantization on AMD Instinct accelerators with high accuracy retention.

./software-tools-optimization/mxfp4-online-rotation/README.html

February 17, 2026

Adaptive Top-K Selection: Eliminating Performance Cliffs Across All K Values on AMD GPUs

Explore adaptive Top-K on MI300X! See how auto-selection and hardware optimizations like DPP and double buffering drive peak efficiency.

./software-tools-optimization/adaptive-topk/README.html

January 22, 2026

LLM Inference Optimization Using AMD GPU Partitioning

A demonstration of how to leverage compute and memory partitioning features in ROCm to scale model serving.

./software-tools-optimization/multi-inf-engine-gpu-partition/README.html

January 22, 2026

ROCm 7.2: Smarter, Faster, and More Scalable for Modern AI Workloads

We highlight the latest ROCm 7.2 enhancements for AMD Instinct GPUs, designed to boost AI and HPC performance.

./software-tools-optimization/rocm7.2/README.html

January 21, 2026

ROCm Becomes a First-Class Platform in the vLLM Ecosystem

ROCm is now a first-class vLLM platform: official wheels + Docker, stronger CI, and faster LLM & multimodal inference on AMD Instinct GPUs.

./software-tools-optimization/vllm-omni/README.html

January 15, 2026

Deep Dive into Primus: High-Performance Training for Large Language Models

Learn how to achieve peak dense LLM training performance on AMD Instinct™ GPUs using Primus’s unified CLI and optimized backend presets.

./software-tools-optimization/primus-deep-dive/README.html

January 13, 2026

Reimagining GPU Allocation in Kubernetes: Introducing the AMD GPU DRA Driver

Explore how the AMD GPU DRA Driver brings declarative, attribute-aware GPU scheduling to Kubernetes, and learn how to request and manage GPUs natively.

./software-tools-optimization/dra-gpu/README.html
