AI Blogs - Page 7

AMD Integrates llm-d on AMD Instinct MI300X Cluster For Distributed LLM Serving

May 20, 2025 by Kenny Roche, Joe Shajrawi, Andy Luo, Anshul Gupta

Step-Video-T2V Inference with xDiT on AMD Instinct MI300X GPUs

Learn how to accelerate text-to-video generation using Step-Video-T2V, a 30B parameter T2V model, on AMD MI300X GPUs with ROCm, enabling scalable, high-fidelity video generation from text

May 15, 2025 by Wei Cai, George Wang

DataFrame Acceleration: hipDF and hipDF.pandas on AMD GPUs

This blog post demonstrates how hipDF significantly enhances and accelerates data manipulation, aggregation, and transformation tasks on AMD hardware using ROCm.

May 07, 2025 by Fabricio Flores

CuPy and hipDF on AMD: The Basics and Beyond

Learn how to deploy CuPy and hipDF on AMD GPUs. See their high-performance computing advantages, and use CuPy and hipDF in a detailed example of an investment portfolio allocation optimization using the Markowitz model.

May 06, 2025 by Fabricio Flores

Power Up Qwen 3 with AMD Instinct: A Developer’s Day 0 Quickstart

Explore the power of Alibaba's Qwen3 models on AMD Instinct™ MI300X and MI325X GPUs - available from Day 0 with seamless SGLang and vLLM integration

April 28, 2025 by Andy Luo, Bill He, Seungrok Jung, Mahdi Ghodsi

Boosting Llama 4 Inference Performance with AMD Instinct MI300X GPUs

Learn how to boost your Llama 4 inference performance on AMD MI300X GPUs using AITER-optimized kernels and advanced vLLM techniques

April 28, 2025 by Liz Li, Seungrok Jung, Andy Luo, Shekhar Pandey

Beyond Text: Accelerating Multimodal AI Inference with Speculative Decoding on AMD Instinct™ MI300X GPUs

This blog shows you how to speed up your multimodal models with AMD’s open-source PyTorch tools for speculative decoding on MI300X GPUs

April 28, 2025 by Mohammad Mahdi Kamani, Parsa Fashi, Vikram Appia, Emad Barsoum

Reinforcement Learning from Human Feedback on AMD GPUs with verl and ROCm Integration

Deploy verl on AMD GPUs for fast, scalable RLHF training with ROCm optimization, Docker scripts, and impressive throughput-convergence results

April 24, 2025 by Yusheng Su, Vicky Tsang, Yao Liu, Phani Vaddadi, Vish Vadlamani, Zicheng Liu

A Step-by-Step Guide On How To Deploy Llama Stack on AMD Instinct™ GPU

Learn how to use Meta’s Llama Stack with AMD ROCm and vLLM to scale inference, integrate APIs, and streamline production-ready AI workflows on AMD Instinct™ GPUs

April 22, 2025 by Alex He

Hands-On with CK-Tile: Develop and Run Optimized GEMM on AMD GPUs

Build high-performance GEMM kernels using CK-Tile on AMD Instinct GPUs with vendor-optimized pipelines and policies for AI and HPC workloads

April 15, 2025 by David Li, George Wang

ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software

Explore ROCm 6.4's key advancements: AI/HPC performance boosts, enhanced profiling tools, better Kubernetes support and modular drivers, accelerating AI and HPC workloads on AMD GPUs.

April 11, 2025 by Jayacharan Kolla, Aditya Bhattacharji, Farshad Ghodsian, Saad Rahim, Marco Grond, Ronnie Chatterjee

Unlock Peak Performance on AMD GPUs with Triton Kernel Optimizations

Learn how Triton compiles and optimizes AI kernels on AMD GPUs, with deep dives into IR flows, hardware-specific passes, and performance tuning tips

April 10, 2025 by Ning Zhang, George Wang
