Applications & models - Page 3

Applications & models - Page 3#

Explore the latest blogs about applications and models in the ROCm ecosystem, including machine learning frameworks, AI models, and application case studies.

CuPy and hipDF on AMD: The Basics and Beyond

Learn how to deploy CuPy and hipDF on AMD GPUs. See their high-performance computing advantages, and use CuPy and hipDF in a detailed example of an investment portfolio allocation optimization using the Markowitz model.

May 06, 2025 by Fabricio Flores

Power Up Qwen 3 with AMD Instinct: A Developer’s Day 0 Quickstart

Explore the power of Alibaba's QWEN3 models on AMD Instinct™ MI300X and MI325X GPUs - available from Day 0 with seamless SGLang and vLLM integration

April 28, 2025 by Andy Luo, Bill He, Seungrok Jung, Mahdi Ghodsi

Reinforcement Learning from Human Feedback on AMD GPUs with verl and ROCm Integration

Deploy verl on AMD GPUs for fast, scalable RLHF training with ROCm optimization, Docker scripts, and impressive throughput-convergence results

April 24, 2025 by Yusheng Su, Vicky Tsang, Yao Liu, Phani Vaddadi, Vish Vadlamani, Zicheng Liu

Shrink LLMs, Boost Inference: INT4 Quantization on AMD GPUs with GPTQModel

Learn how to compress LLMs with GPTQModel and run them efficiently on AMD GPUs using INT4 quantization, reducing memory use, shrinking model size, and enabling fast inference

April 09, 2025 by Fabricio Flores

Power Up Llama 4 with AMD Instinct: A Developer’s Day 0 Quickstart

Explore the power of Meta’s Llama 4 multimodal models on AMD Instinct™ MI300X and MI325X GPUs - available from Day 0 with seamless vLLM integration

April 06, 2025 by Liz Li, Seungrok Jung, Andy Luo

AMD Instinct™ MI325X GPUs Produce Strong Performance in MLPerf Inference v5.0

We showcase MI325X GPU optimizations that power our MLPerf v5.0 results on Llama 2 70B, highlighting performance tuning, quantization, and vLLM advancements.

April 02, 2025 by Meena Arunachalam, Miro Hodak, Wei-Ting Liao, Poovaiah Palangappa, Eliot Li, AMD Quark Team, AMD Brevitas Team, and AMD Shark Team

Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.0 Submission

A step-by-step guide to reproducing AMD’s MLPerf v5.0 results for Llama 2 70B & SDXL using ROCm on MI325X

April 02, 2025 by Meena Arunachalam, Miro Hodak, Wei-Ting Liao, Karan Verma, Ean Garvey, Kumar Deepak, Giuseppe Franco, Eliot Li, AMD Quark team

Bring FLUX to Life on MI300X: Run and Optimize with Hugging Face Diffusers

The blog will walk you through the FLUX text-to-image diffusion model architecture and show you how to run and optimize it on MI300x.

March 28, 2025 by Liz Li

Accelerating LLM Inference: Up to 3x Speedup on MI300X with Speculative Decoding

This blog demonstrates out-of-the-box performance improvement in LLM inference using speculative decoding on MI300X.

March 27, 2025 by Sonali Singh, Karthik Sangaiah, Shenrun Zhang, Ryan Swann, Ganesh Dasika

Efficient MoE training on AMD ROCm: How-to use Megablocks on AMD GPUs

Learn how to use Megablocks to pre-train GPT2 Mixture of Experts (MoE) model, helping you scale your deep learning models effectiveness on AMD GPUs using ROCm

March 23, 2025 by Fabricio Flores, Rishi Madduri, Yao Liu, Phani Vaddadi, Vish Vadlamani

Deploying Google’s Gemma 3 Model with vLLM on AMD Instinct™ MI300X GPUs: A Step-by-Step Guide

AMD is excited to announce the integration of Google’s Gemma 3 models with AMD Instinct™ MI300X GPUs

March 14, 2025 by Shekhar Pandey, Anshul Gupta

Analyzing the Impact of Tensor Parallelism Configurations on LLM Inference Performance

This blog analyzes how tensor parallelism impacts TCO and Scale for LLM deployments in production.

March 14, 2025 by Eduardo Alvarez

Prev Page 3 of 12 Next