Developers Blogs - Page 4

January 21, 2026

ROCm Becomes a First-Class Platform in the vLLM Ecosystem

ROCm is now a first-class vLLM platform: official wheels + Docker, stronger CI, and faster LLM & multimodal inference on AMD Instinct GPUs.

./software-tools-optimization/vllm-omni/README.html

January 12, 2026

Installing AMD HIP-Enabled GROMACS on HPC Systems: A LUMI Supercomputer Case Study

./artificial-intelligence/gromacs-lumi-guide/README.html

January 08, 2026

Bridging the Last Mile: Deploying Hummingbird-XT for Efficient Video Generation on AMD Consumer-Grade Platforms

Learn how to use the Hummingbird-XT and Hummingbird-XTX models to generate videos. Explore the video diffusion model acceleration solution, including the DiT distillation method and the light VAE model.

./artificial-intelligence/hummingbirdxt/README.html

January 02, 2026

Accelerating Multimodal Inference in vLLM: The One-Line Optimization for Large Multimodal Models

Learn how to optimize multimodal model inference with batch-level data parallelism for vision encoders in vLLM, achieving up to 45% throughput gains on AMD MI300X.

./software-tools-optimization/vllm-dp-vision/README.html

December 23, 2025

GEAK HIP: Expanding GEAK for HIP Code Optimization

Explore the GEAK framework's AI-driven HIP code optimization for improved performance on AMD GPUs, including speedup examples and benefits for AI workloads.

./software-tools-optimization/geak-hip-optimizations/README.html

December 18, 2025

A Step-by-Step Walkthrough of Decentralized LLM Training on AMD GPUs

Learn how to train LLMs across decentralized clusters on AMD Instinct MI300 GPUs with DiLoCo and Prime, and scale beyond one datacenter.

./artificial-intelligence/decentralized-training/README.html

December 16, 2025

MoE Training Best Practices on AMD GPUs

Learn how to optimize Mixture-of-Experts (MoE) model training on AMD Instinct GPUs with ROCm. Maximize your AI training performance now!

./software-tools-optimization/primus-moe-package/README.html

December 16, 2025

3D Scene Reconstruction from the Inside: Explore the Mathematics Behind gsplat

./software-tools-optimization/point-2-gaussian/README.html

December 10, 2025

Medical Imaging on MI300X: SwinUNETR Inference Optimization

A practical guide to optimizing SwinUNETR inference on AMD Instinct™ MI300X GPUs for fast 3D segmentation of tumors in medical imaging.

./artificial-intelligence/swinunetr-inference-optimization/README.html

December 08, 2025

Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs

Explore how MI355X performs against B200 in vLLM benchmarks across DeepSeek-R1, GPT-OSS-120B, Qwen3-235B and Llama-3.3-70B.

./artificial-intelligence/scaling-ai-inference/README.html

November 24, 2025

The vLLM MoE Playbook: A Practical Guide to TP, DP, PP and Expert Parallelism

Learn how to combine TP, DP, PP, and EP for MoE models. Discover proven strategies to maximize performance on your vLLM deployments.

./software-tools-optimization/vllm-moe-guide/README.html

November 05, 2025

Day 0 Developer Guide: hipBLASLt Offline GEMM Tuning Script

Learn how to improve model performance with hipBLASLt offline tuning, using our easy-to-use Day 0 tool for developers to optimize GEMM efficiency.

./artificial-intelligence/hipblaslt_offline_tuning/README.html