ROCm Blogs Statistics

Explore the growth of ROCm blogs over time.

Blog Count

Monthly Blog Publication

Blogs by Tag

Top Authors

Author	Number of Blogs	Latest Blog (title, date)	First Blog (title, date)
Emad Barsoum	43	Introducing Instella-MoE: A State-of-the-Art Fully Open Mixture-of-Experts Language Model July 24, 2026	Enhancing AI Training with AMD ROCm Software January 31, 2025
Eliot Li	29	From Vector Search to Agentic RAG: Building an Enterprise Research Analyst with hipVS July 15, 2026	Scale AI applications with Ray April 01, 2024
Andy Luo	25	Fast Image Generation and Editing with SGLang Diffusion on AMD GPUs July 10, 2026	Best practices for competitive inference optimization on AMD Instinct™ MI300X GPUs January 29, 2025
Dong Li	24	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026	GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks August 01, 2025
Vish Vadlamani	24	From Vector Search to Agentic RAG: Building an Enterprise Research Analyst with hipVS July 15, 2026	Triton Inference Server with vLLM on AMD GPUs January 08, 2025
Phani Vaddadi	23	From Vector Search to Agentic RAG: Building an Enterprise Research Analyst with hipVS July 15, 2026	Efficient MoE training on AMD ROCm: How-to use MegaBlocks on AMD GPUs March 23, 2025
Peng Sun	21	Scaling MiniMax-M3 Inference with Distributed Serving and Operator Co-Design on AMD Instinct MI355X GPUs July 21, 2026	Supercharge DeepSeek-R1 Inference on AMD Instinct MI300X March 21, 2025
George Wang	20	Local Image and Video Generation on AMD Ryzen™ AI Max+ Processor (Windows) July 14, 2026	GEMM Kernel Optimization For AMD GPUs February 06, 2025
Yao Liu	20	Accelerating ComfyUI Workflows on AMD Instinct™ MI355X GPUs with ROCm May 11, 2026	Triton Inference Server with vLLM on AMD GPUs January 08, 2025
Fabricio Flores	18	From Build to Benchmark: ONNX Model Serving with Triton Inference Server on AMD GPUs May 22, 2026	Building semantic search with SentenceTransformers on AMD April 04, 2024
Zhenyu Gu	17	Introducing Instella-MoE: A State-of-the-Art Fully Open Mixture-of-Experts Language Model July 24, 2026	Day 0 Developer Guide: Running the Latest Open Models from OpenAI on AMD AI Hardware August 05, 2025
Spandan Tiwari	16	Serving NVFP4 Models on AMD Instinct™ MI355 Accelerators July 13, 2026	QuickReduce: Up to 3x Faster All-reduce for vLLM and SGLang August 26, 2025
Sean Song	16	LLM Inference Optimization Using AMD GPU Partitioning January 22, 2026	Fine-tune Llama model with LoRA: Customizing a large language model for question-answering February 01, 2024
Clint Greene	16	Enabling FlashInfer on ROCm for Accelerated LLM Serving October 01, 2025	Accelerating XGBoost with Dask using multiple AMD GPUs January 26, 2024
Anshul Gupta	15	ROCm 7.14: TheRock Goes Production and Expands AMD's AI Software Platform July 15, 2026	GEMM Kernel Optimization For AMD GPUs February 06, 2025
Ashish Sirasao	15	Serving NVFP4 Models on AMD Instinct™ MI355 Accelerators July 13, 2026	QuickReduce: Up to 3x Faster All-reduce for vLLM and SGLang August 26, 2025
Meena Arunachalam	15	Technical Dive into AMD's MLPerf Training v6.0 Submission June 16, 2026	Benchmarking Machine Learning using ROCm and AMD GPUs: Reproducing Our MLPerf Inference Submission August 28, 2024
Miro Hodak	15	Technical Dive into AMD's MLPerf Training v6.0 Submission June 16, 2026	Benchmarking Machine Learning using ROCm and AMD GPUs: Reproducing Our MLPerf Inference Submission August 28, 2024
Liz Li	14	Occupancy Math on the AMD MI355X GPU (CDNA4): A From-First-Principles Guide July 07, 2026	Supercharge DeepSeek-R1 Inference on AMD Instinct MI300X March 21, 2025
Saad Rahim	14	ROCm 7.14: TheRock Goes Production and Expands AMD's AI Software Platform July 15, 2026	ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software April 11, 2025
Douglas Jia	14	Multinode Fine-Tuning of Stable Diffusion XL on AMD GPUs with Hugging Face Accelerate and OCI's Kubernetes Engine (OKE) October 15, 2024	Pre-training a large language model with Megatron-DeepSpeed on multiple AMD GPUs January 24, 2024
Yao Fu	13	Primus Tuning Agent: Closing the Configuration-Search Loop July 06, 2026	Optimized ROCm Docker for Distributed AI Training March 13, 2025
Zicheng Liu	13	Introducing Instella-MoE: A State-of-the-Art Fully Open Mixture-of-Experts Language Model July 24, 2026	Introducing Instella: New State-of-the-art Fully Open 3B Language Models March 05, 2025
Chunhung Wang	13	Serve Kimi-K2.5-MXFP4 on MI355X with ATOM July 23, 2026	Avoiding LDS Bank Conflicts on AMD GPUs Using CK-Tile Framework July 25, 2025
Gina Sitaraman	13	Performance Profiling on AMD GPUs – Part 5: Profiling-Driven Kernel Optimization with an AI Code-Assist Tool July 16, 2026	AMD matrix cores November 14, 2022
Phillip Dang	13	DBRX Instruct on AMD GPUs July 11, 2024	Simplifying deep learning: A guide to PyTorch Lightning February 08, 2024
Carlus Huang	12	Scaling MiniMax-M3 Inference with Distributed Serving and Operator Co-Design on AMD Instinct MI355X GPUs July 21, 2026	AITER: AI Tensor Engine For ROCm March 21, 2025
Hongxia Yang	12	Efficient MiniMax-M3 Inference on AMD Instinct GPUs with ATOM and ATOMesh July 21, 2026	Accelerated LLM Inference on AMD Instinct™ GPUs with vLLM 0.9.x and ROCm June 28, 2025
Xuanwu Yin	12	Low Kruskal-Rank Adaptation June 11, 2026	Introducing AMD EVLM: Efficient Vision-Language Models with Parameter-Space Visual Conditioning August 22, 2025
Wei Luo	12	Serving NVFP4 Models on AMD Instinct™ MI355 Accelerators July 13, 2026	QuickReduce: Up to 3x Faster All-reduce for vLLM and SGLang August 26, 2025
Karan Verma	12	Technical Dive into AMD's MLPerf Training v6.0 Submission June 16, 2026	Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.0 Submission April 02, 2025
Xinjun Niu	11	Serving NVFP4 Models on AMD Instinct™ MI355 Accelerators July 13, 2026	QuickReduce: Up to 3x Faster All-reduce for vLLM and SGLang August 26, 2025
Vara Lakshmi Bayanagari	11	Distributed fine-tuning of MPT-30B using Composer on AMD GPUs January 28, 2025	Pre-training BERT using Hugging Face & PyTorch on an AMD GPU January 26, 2024
Wen Xie	10	Introducing Instella-MoE: A State-of-the-Art Fully Open Mixture-of-Experts Language Model July 24, 2026	An Introduction to Primus-Turbo: A Library for Accelerating Transformer Models on AMD GPUs September 19, 2025
Thomas Gibson	10	Performance Profiling on AMD GPUs - Part 4: Fortran OpenMP Offload Edition June 01, 2026	Finite difference method - Laplacian part 1 November 14, 2022
Bowen Bao	10	Serve Kimi-K2.5-MXFP4 on MI355X with ATOM July 23, 2026	Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025
Justin Chang	10	MI300A - Exploring the APU advantage February 09, 2025	Finite difference method - Laplacian part 1 November 14, 2022
Lingpeng Jin	9	Scaling MiniMax-M3 Inference with Distributed Serving and Operator Co-Design on AMD Instinct MI355X GPUs July 21, 2026	AITER: AI Tensor Engine For ROCm March 21, 2025
Clement Lin	9	Serve Kimi-K2.5-MXFP4 on MI355X with ATOM July 23, 2026	Avoiding LDS Bank Conflicts on AMD GPUs Using CK-Tile Framework July 25, 2025
Yixing Xu	9	Low Kruskal-Rank Adaptation June 11, 2026	Slim Down Your Llama: Pruning & Fine-Tuning for Maximum Performance September 09, 2025
Daniel Gustafsson	9	Onboard and Deploy Custom Models in AMD AI Workbench July 23, 2026	Getting Started with AMD AI Workbench: Deploying and Managing AI Workloads December 19, 2025
Marco Grond	8	Hyperloom - Autonomous Agentic Inference Optimization for AMD GPUs July 23, 2026	ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software April 11, 2025
Rasmus Larsson	8	Deploy an Imaging AMD Solution Blueprint on AMD Radeon™ GPUs July 22, 2026	Retrieval Augmented Generation (RAG) with vLLM, LangChain and Chroma November 04, 2025
Seungrok Jung	8	Efficient MiniMax-M3 Inference on AMD Instinct GPUs with ATOM and ATOMesh July 21, 2026	Large language model inference optimizations on AMD GPUs March 15, 2024
Liam Berry	8	ROCm 7.14: TheRock Goes Production and Expands AMD's AI Software Platform July 15, 2026	ROCm Runfile Installer Is Here! May 22, 2025
Ximeng Sun	8	Introducing Instella-MoE: A State-of-the-Art Fully Open Mixture-of-Experts Language Model July 24, 2026	Introducing Instella: New State-of-the-art Fully Open 3B Language Models March 05, 2025
Jiang Liu	8	Introducing Instella-MoE: A State-of-the-Art Fully Open Mixture-of-Experts Language Model July 24, 2026	Introducing Instella: New State-of-the-art Fully Open 3B Language Models March 05, 2025
Jialian Wu	8	Introducing Instella-MoE: A State-of-the-Art Fully Open Mixture-of-Experts Language Model July 24, 2026	Introducing Instella: New State-of-the-art Fully Open 3B Language Models March 05, 2025
Yusheng Su	8	LuminaSFT: Generating Synthetic Fine-Tuning Data for Small Language Models February 24, 2026	Introducing Instella: New State-of-the-art Fully Open 3B Language Models March 05, 2025
Pratik Prabhanjan Brahma	7	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026	Introducing Instella: New State-of-the-art Fully Open 3B Language Models March 05, 2025
Ossian O'Reilly	7	Seismic stencil codes - part 2 August 29, 2024	Finite difference method - Laplacian part 1 November 14, 2022
Menghsuan Yang	7	Serve Kimi-K2.5-MXFP4 on MI355X with ATOM July 23, 2026	Avoiding LDS Bank Conflicts on AMD GPUs Using CK-Tile Framework July 25, 2025
Bobo Fang	7	Serve Kimi-K2.5-MXFP4 on MI355X with ATOM July 23, 2026	Avoiding LDS Bank Conflicts on AMD GPUs Using CK-Tile Framework July 25, 2025
Gowtham Ramesh	7	Introducing Instella-MoE: A State-of-the-Art Fully Open Mixture-of-Experts Language Model July 24, 2026	Introducing Instella: New State-of-the-art Fully Open 3B Language Models March 05, 2025
Ze Wang	7	LuminaSFT: Generating Synthetic Fine-Tuning Data for Small Language Models February 24, 2026	Introducing Instella: New State-of-the-art Fully Open 3B Language Models March 05, 2025
Prakamya Mishra	7	Introducing Instella-MoE: A State-of-the-Art Fully Open Mixture-of-Experts Language Model July 24, 2026	Introducing Instella: New State-of-the-art Fully Open 3B Language Models March 05, 2025
Xiaodong Yu	7	LuminaSFT: Generating Synthetic Fine-Tuning Data for Small Language Models February 24, 2026	Introducing Instella: New State-of-the-art Fully Open 3B Language Models March 05, 2025
Vicky Tsang	7	Accelerating ComfyUI Workflows on AMD Instinct™ MI355X GPUs with ROCm May 11, 2026	Scale AI applications with Ray April 01, 2024
Ravi Dwivedula	7	Technical Dive into AMD's MLPerf Training v6.0 Submission June 16, 2026	High-Throughput BERT-L Pre-Training on AMD Instinct™ GPUs: A Practical Guide June 03, 2025
Sarthak Arora	7	Technical Dive into AMD's MLPerf Training v6.0 Submission June 16, 2026	High-Throughput BERT-L Pre-Training on AMD Instinct™ GPUs: A Practical Guide June 03, 2025
Sathish Sanjeevi	7	Technical Dive into AMD's MLPerf Training v6.0 Submission June 16, 2026	High-Throughput BERT-L Pre-Training on AMD Instinct™ GPUs: A Practical Guide June 03, 2025
Su Ann Chong	7	Technical Dive into AMD's MLPerf Training v6.0 Submission June 16, 2026	High-Throughput BERT-L Pre-Training on AMD Instinct™ GPUs: A Practical Guide June 03, 2025
Mukhil Azhagan Mallaiyan Sathiaseelan	7	Accelerating ComfyUI Workflows on AMD Instinct™ MI355X GPUs with ROCm May 11, 2026	Graph Neural Networks at Scale: DGL with ROCm on AMD Hardware July 31, 2025
Guanchen Li	7	Low Kruskal-Rank Adaptation June 11, 2026	Slim Down Your Llama: Pruning & Fine-Tuning for Maximum Performance September 09, 2025
Hattie Wu	6	Scaling MiniMax-M3 Inference with Distributed Serving and Operator Co-Design on AMD Instinct MI355X GPUs July 21, 2026	vLLM-ATOM: Unlocking Native AMD Performance in the vLLM Ecosystem May 07, 2026
Ziqiong Liu	6	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026	GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks August 01, 2025
Xiaobo Chen	6	Introducing AMD ROCm™ Infera: Scaling Goodput for Agentic AI with Distributed Inference Orchestration July 23, 2026	Primus: A Lightweight, Unified Training Framework for Large Models on AMD GPUs August 22, 2025
Asitav Mishra	6	Performance Profiling on AMD GPUs - Part 4: Fortran OpenMP Offload Edition June 01, 2026	Jacobi Solver with HIP and OpenMP offloading September 15, 2023
Shekhar Pandey	6	Occupancy Math on the AMD MI355X GPU (CDNA4): A From-First-Principles Guide July 07, 2026	Deploying Google’s Gemma 3 Model with vLLM on AMD Instinct™ MI300X GPUs: A Step-by-Step Guide March 14, 2025
Pier Luigi Dovesi	6	Efficient Hyperparameter Optimization for Autonomous Driving Models with AMD Instinct GPU Partitioning July 08, 2026	Elevating 3D Scene Rendering with GSplat October 03, 2025
Shaghayegh Roohi	6	Efficient Hyperparameter Optimization for Autonomous Driving Models with AMD Instinct GPU Partitioning July 08, 2026	Elevating 3D Scene Rendering with GSplat October 03, 2025
Ke Wang	6	Serving NVFP4 Models on AMD Instinct™ MI355 Accelerators July 13, 2026	QuickReduce: Up to 3x Faster All-reduce for vLLM and SGLang August 26, 2025
Vikram Appia	6	Introducing Instella-MoE: A State-of-the-Art Fully Open Mixture-of-Experts Language Model July 24, 2026	Beyond Text: Accelerating Multimodal AI Inference with Speculative Decoding on AMD Instinct™ MI300X GPUs April 28, 2025
Anuya Welling	6	Serving CTR Recommendation Models with Triton Inference Server using the ONNX Runtime Backend April 07, 2026	Graph Neural Networks at Scale: DGL with ROCm on AMD Hardware July 31, 2025
Sudhanshu Ranjan	6	Introducing Instella-MoE: A State-of-the-Art Fully Open Mixture-of-Experts Language Model July 24, 2026	Introducing Instella: New State-of-the-art Fully Open 3B Language Models March 05, 2025
Kristoffer Peyron	6	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026	All-in-One Video Editing with VACE on AMD Instinct GPUs August 19, 2025
Johanna Yang	6	HPC Coding Agent - Part 2: An MCP Tool for Code Optimization with OpenEvolve March 04, 2026	All-in-One Video Editing with VACE on AMD Instinct GPUs August 19, 2025
Poovaiah Palangappa	6	AMD Instinct™ GPUs MLPerf Inference v6.0 Submission April 01, 2026	AMD Instinct™ MI325X GPUs Produce Strong Performance in MLPerf Inference v5.0 April 02, 2025
Mohammed Faraaz Mustafa	6	ROCm 7.14: TheRock Goes Production and Expands AMD's AI Software Platform July 15, 2026	The ROCm Revisited Series June 06, 2025
Bin Ding	5	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026	GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks August 01, 2025
Yanyuan Qin	5	Introducing AMD ROCm™ Infera: Scaling Goodput for Agentic AI with Distributed Inference Orchestration July 23, 2026	Day 0 Developer Guide: Running the Latest Open Models from OpenAI on AMD AI Hardware August 05, 2025
Chaojun Hou	5	Introducing Instella-MoE: A State-of-the-Art Fully Open Mixture-of-Experts Language Model July 24, 2026	Stability at Scale: AMD’s Full‑Stack Platform for Large‑Model Training November 04, 2025
Matt Elliott	5	How to Build a vLLM Container for Inference and Benchmarking February 21, 2025	Presenting and demonstrating the use of the ROCm Offline Installer Creator, a tool enabling simple deployment of ROCm in disconnected environments in high-security environments and air-gapped networks. September 10, 2024
Jehandad Khan	5	Towards Feature Complete Triton Support in JAX-Triton July 08, 2026	ROCm MaxText Testing — Decoupled (Offline) and Cloud-Integrated Modes January 06, 2026
Alessandro Fanfarillo	5	Performance Profiling on AMD GPUs - Part 3: Advanced Usage October 23, 2025	Register pressure in AMD CDNA™2 GPUs May 17, 2023
Karthik Kashyap Thatipamula	5	Using Gradient Boosting Libraries on MI300X for Financial Risk Prediction January 08, 2026	Announcing hipCIM: A Cutting-Edge Solution for Accelerated Multidimensional Image Processing July 18, 2025
Ish Kool	5	Using Gradient Boosting Libraries on MI300X for Financial Risk Prediction January 08, 2026	Announcing hipCIM: A Cutting-Edge Solution for Accelerated Multidimensional Image Processing July 18, 2025
Vikas C Sajjan	5	3D Scene Reconstruction from the Inside: Explore the Mathematics Behind gsplat December 16, 2025	Announcing hipCIM: A Cutting-Edge Solution for Accelerated Multidimensional Image Processing July 18, 2025
Vidushi Goyal	5	Primus Tuning Agent: Closing the Configuration-Search Loop July 06, 2026	Primus: A Lightweight, Unified Training Framework for Large Models on AMD GPUs August 22, 2025
Carson Liao	5	Customizing Kernels with hipBLASLt TensileLite GEMM Tuning - Advanced User Guide April 06, 2026	GEMM Tuning within hipBLASLt - Part 1 September 05, 2025
Dominic Widdows	5	LogsLop: A Tiny Summarization Tool for Enormous Log Files July 14, 2026	Benchmarking Reasoning Models: From Tokens to Answers July 24, 2025
Eveline Chen	5	Serve Kimi-K2.5-MXFP4 on MI355X with ATOM July 23, 2026	Accelerating Kimi-K2.5 on AMD Instinct™ MI300X: Optimizing Fused MoE with FlyDSL March 24, 2026
Sean Miller	5	Finite difference method - Laplacian part 4 July 18, 2023	Finite difference method - Laplacian part 1 November 14, 2022
Rajat Arora	5	Jacobi Solver with HIP and OpenMP offloading September 15, 2023	Finite difference method - Laplacian part 1 November 14, 2022
Niko Vuokko	5	Efficient Hyperparameter Optimization for Autonomous Driving Models with AMD Instinct GPU Partitioning July 08, 2026	VLM Fine-Tuning for Robotics on AMD Enterprise AI Suite November 28, 2025
Daniel Huang	5	Local Image and Video Generation on AMD Ryzen™ AI Max+ Processor (Windows) July 14, 2026	AITER-Enabled MLA Layer Inference on AMD Instinct MI300X GPUs August 25, 2025
David Björelind	5	GROMACS on AMD Instinct GPUs: A Complete Build Guide March 24, 2026	Optimizing Drug Discovery Tools on AMD MI300X Part 1: Molecular Design with REINVENT September 19, 2025
Dong Zhou	5	Triton-Based Optimization of Video Sparse Attention on ROCm July 13, 2026	Nitro-E: A 304M Diffusion Transformer Model for High Quality Image Generation October 24, 2025
Sopiko Kurdadze	5	Foundations of Molecular Generation with GP-MoLFormer on AMD Instinct MI300X Accelerators February 03, 2026	Accelerating FastVideo on AMD GPUs with TeaCache August 19, 2025
Pauli Pihajoki	5	Diffusion-based Atmospheric Downscaling on AMD Instinct GPUs May 20, 2026	Running SOTA AI-based Weather Forecasting models on AMD Instinct September 18, 2025
Wei-Ting Liao	5	AMD Instinct™ GPUs MLPerf Inference v6.0 Submission April 01, 2026	AMD Instinct™ MI325X GPUs Produce Strong Performance in MLPerf Inference v5.0 April 02, 2025
Lin Sun	4	From Build to Benchmark: ONNX Model Serving with Triton Inference Server on AMD GPUs May 22, 2026	Coding Agents on AMD GPUs: Fast LLM Pipelines for Developers September 30, 2025
Barsoum Emad	4	Accelerating LLM Inference on AMD GPUs with Low-Latency GEMMs June 29, 2026	Out-of-the-Box ROLL Support on AMD GPUs: Accelerating Reinforcement Learning at Scale June 01, 2026
Chao Xu	4	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026	GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks August 01, 2025
Jianghui Wang	4	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026	GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks August 01, 2025
Saptarshi Majumder	4	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026	GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks August 01, 2025
Cheng Yao	4	Primus Tuning Agent: Closing the Configuration-Search Loop July 06, 2026	MoE Training Best Practices on AMD GPUs December 16, 2025
Yuankai Chen	4	Primus Tuning Agent: Closing the Configuration-Search Loop July 06, 2026	MoE Training Best Practices on AMD GPUs December 16, 2025
Chang Liu	4	Efficient MiniMax-M3 Inference on AMD Instinct GPUs with ATOM and ATOMesh July 21, 2026	Speculative Decoding - Deep Dive March 24, 2025
Felix Li	4	Accelerating LLM Inference on AMD GPUs with Low-Latency GEMMs June 29, 2026	FlyDSL: Expert GPU Kernel Development with the Ease of MLIR Python Native DSL on AMD GPUs February 20, 2026
Pin Siang Tan	4	ROCm Becomes a First-Class Platform in the vLLM Ecosystem January 21, 2026	Accelerated LLM Inference on AMD Instinct™ GPUs with vLLM 0.9.x and ROCm June 28, 2025
Tun Jian Tan	4	ROCm Becomes a First-Class Platform in the vLLM Ecosystem January 21, 2026	Accelerated LLM Inference on AMD Instinct™ GPUs with vLLM 0.9.x and ROCm June 28, 2025
Luka Stanisic	4	Performance Profiling on AMD GPUs - Part 4: Fortran OpenMP Offload Edition June 01, 2026	Performance Profiling on AMD GPUs – Part 1: Foundations June 26, 2025
Amanzhol Salykov	4	Porting High-Performance HIP Kernels to FlyDSL July 09, 2026	Matrix Core Programming on AMD CDNA™3 and CDNA™4 architecture September 30, 2025
Felix Marty	4	Serving NVFP4 Models on AMD Instinct™ MI355 Accelerators July 13, 2026	High-Accuracy MXFP4, MXFP6, and Mixed-Precision Models on AMD GPUs October 29, 2025
Deeksha Goplani	4	3D Scene Reconstruction from the Inside: Explore the Mathematics Behind gsplat December 16, 2025	Announcing hipCIM: A Cutting-Edge Solution for Accelerated Multidimensional Image Processing July 18, 2025
Anshu Raina	4	Primus Tuning Agent: Closing the Configuration-Search Loop July 06, 2026	Primus-Pipeline: A More Flexible and Scalable Pipeline Parallelism Implementation February 23, 2026
Jayacharan Kolla	4	ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software April 11, 2025	Understanding RCCL Bandwidth and xGMI Performance on AMD Instinct™ MI300X March 02, 2025
Albin Toft	4	HPC Coding Agent - Part 1: Combining GLM-powered Cline and RAG Using MCP December 03, 2025	Running ComfyUI on AMD Instinct August 19, 2025
Yuchen Lin	4	Building and Deploying Custom hipBLASLt Libraries on AMD Instinct GPUs June 18, 2026	Avoiding LDS Bank Conflicts on AMD GPUs Using CK-Tile Framework July 25, 2025
Mark Granroth Wilding	4	Efficient Hyperparameter Optimization for Autonomous Driving Models with AMD Instinct GPU Partitioning July 08, 2026	Elevating 3D Scene Rendering with GSplat October 03, 2025
Haoyang Li	4	QuickReduce INT3 Quantization and Benchmarking on MI355 July 13, 2026	QuickReduce: Up to 3x Faster All-reduce for vLLM and SGLang August 26, 2025
Debasis Mandal	4	Accelerating ComfyUI Workflows on AMD Instinct™ MI355X GPUs with ROCm May 11, 2026	Enabling FlashInfer on ROCm for Accelerated LLM Serving October 01, 2025
Rui Sampaio	4	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025	Optimizing Drug Discovery Tools on AMD MI300X Part 1: Molecular Design with REINVENT September 19, 2025
Tong Shen	4	Efficient and Portable 3D Explorable World Generation on AMD GPUs June 18, 2026	Nitro-T: Training a Text-to-Image Diffusion Model from Scratch in 1 Day July 09, 2025
Jingai Yu	4	Efficient and Portable 3D Explorable World Generation on AMD GPUs June 18, 2026	Nitro-T: Training a Text-to-Image Diffusion Model from Scratch in 1 Day July 09, 2025
Fuwei Yang	4	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026	Accelerating Autonomous Driving Model Training on AMD ROCm™ Software December 08, 2025
Chao Li	4	Accelerating Large-Scale LLM Inference on AMD Instinct MI350X/MI355X with Eagle3 and AMD Quark July 03, 2026	Day 0 Developer Guide: hipBLASLt Offline GEMM Tuning Script November 05, 2025
Luka Tsabadze	4	Running Variational Quantum Eigensolver with Qiskit Aer on AMD Instinct May 29, 2026	Running SOTA AI-based Weather Forecasting models on AMD Instinct September 18, 2025
Logan Grado	4	Accelerating models on ROCm using PyTorch TunableOp July 03, 2024	Automatic mixed precision in PyTorch using AMD GPUs March 29, 2024
Uma Kannikanti	4	AMD Instinct™ GPUs MLPerf Inference v6.0 Submission April 01, 2026	Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission September 09, 2025
Fulu Li	4	AMD Instinct™ GPUs MLPerf Inference v6.0 Submission April 01, 2026	Slim Down Your Llama: Pruning & Fine-Tuning for Maximum Performance September 09, 2025
Alex He	4	Edge-to-Cloud Robotics with AMD ROCm: From Data Collection to Real-Time Inference March 23, 2026	Navigating vLLM Inference with ROCm and Kubernetes February 13, 2025
Zhe Li	4	Breaking the Accuracy-Speed Barrier: How MXFP4/6 Quantization Revolutionizes Image and Video Generation January 07, 2026	Slim Down Your Llama: Pruning & Fine-Tuning for Maximum Performance September 09, 2025
Xiaoming Peng	3	Primus-Pipeline: A More Flexible and Scalable Pipeline Parallelism Implementation February 23, 2026	Primus: A Lightweight, Unified Training Framework for Large Models on AMD GPUs August 22, 2025
Liying Li	3	Introducing AMD ROCm™ Infera: Scaling Goodput for Agentic AI with Distributed Inference Orchestration July 23, 2026	MoE Training Best Practices on AMD GPUs December 16, 2025
Mou Li	3	Introducing AMD ROCm™ Infera: Scaling Goodput for Agentic AI with Distributed Inference Orchestration July 23, 2026	MoE Training Best Practices on AMD GPUs December 16, 2025
Giacomo Capodaglio	3	Performance Profiling on AMD GPUs - Part 3: Advanced Usage October 23, 2025	Performance Profiling on AMD GPUs – Part 1: Foundations June 26, 2025
Lin Zhao	3	Serving NVFP4 Models on AMD Instinct™ MI355 Accelerators July 13, 2026	High-Accuracy MXFP4, MXFP6, and Mixed-Precision Models on AMD GPUs October 29, 2025
Soumitra Chatterjee	3	Announcing MONAI 1.0.0 for AMD ROCm: Breakthrough AI Acceleration for Medical Imaging Models on AMD Instinct™ GPUs October 07, 2025	Announcing hipCIM: A Cutting-Edge Solution for Accelerated Multidimensional Image Processing July 18, 2025
Anik Chaudhuri	3	Announcing MONAI 1.0.0 for AMD ROCm: Breakthrough AI Acceleration for Medical Imaging Models on AMD Instinct™ GPUs October 07, 2025	Announcing hipCIM: A Cutting-Edge Solution for Accelerated Multidimensional Image Processing July 18, 2025
Shrey Ajmera	3	Spur: Modern GPU Job Scheduling for HPC and AI Workloads July 22, 2026	Introducing the AMD Network Operator v1.0.0: Simplifying High-Performance Networking for AMD Platforms January 08, 2026
YangWen Huang	3	Customizing Kernels with hipBLASLt TensileLite GEMM Tuning - Advanced User Guide April 06, 2026	GEMM Tuning within hipBLASLt - Part 1 September 05, 2025
Suyash Tandon	3	Introduction to profiling tools for AMD hardware April 10, 2026	Application portability with HIP April 26, 2024
Bob Robey	3	Application portability with HIP April 26, 2024	Affinity part 1 - Affinity, placement, and order April 16, 2024
Yi Huang	3	Dropless MoE Training in JAX with Primus-Turbo June 10, 2026	MaxText-Slurm: Production-Grade LLM Training with Built-In Observability March 02, 2026
Wei Cai	3	Deep Dive into Primus: High-Performance Training for Large Language Models January 15, 2026	Step-Video-T2V Inference with xDiT on AMD Instinct MI300X GPUs May 15, 2025
Yue Liu	3	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026	Streamlining Recommendation Model Training on AMD Instinct™ GPUs March 02, 2026
Andy Ye	3	Dropless MoE Training in JAX with Primus-Turbo June 10, 2026	Optimizing LLM Workloads: AMD Instinct MI355X GPUs Drive Competitive Performance December 02, 2025
Farshad Ghodsian	3	ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software April 11, 2025	Announcing the AMD GPU Operator and Metrics Exporter January 29, 2025
Gene Su	3	MaxText-Slurm: Production-Grade LLM Training with Built-In Observability March 02, 2026	Optimizing LLM Workloads: AMD Instinct MI355X GPUs Drive Competitive Performance December 02, 2025
Akhila Yeruva	3	Closing the GPU Cluster Validation Gap: A Kubernetes-Native Approach with CVF July 29, 2026	AMD Device Metrics Exporter v1.4.2: Enhanced Observability, Deeper RAS Insights, and Smarter GPU Telemetry for Modern HPC & AI Clusters March 23, 2026
Jeff Daily	3	SPIR-V on ROCm: A Portable IR for AMD GPUs July 20, 2026	Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring October 21, 2025
David Li	3	Avoiding LDS Bank Conflicts on AMD GPUs Using CK-Tile Framework July 25, 2025	Hands-On with CK-Tile: Develop and Run Optimized GEMM on AMD GPUs April 15, 2025
Lei Wei	3	Introducing AMD ROCm™ Infera: Scaling Goodput for Agentic AI with Distributed Inference Orchestration July 23, 2026	Stability at Scale: AMD’s Full‑Stack Platform for Large‑Model Training November 04, 2025
Ning Zhang	3	Step-3 Deployment Simplified: A Day 0 Developer’s Guide on AMD Instinct™ GPUs September 04, 2025	GEMM Kernel Optimization For AMD GPUs February 06, 2025
Tiffany Mintz	3	Modernizing Taichi Lang to LLVM 20 for MI355X GPU Acceleration December 04, 2025	Triton Inference Server with vLLM on AMD GPUs January 08, 2025
Rahul Biswas	3	ORBIT-2 based Weather and Climate Downscaling and Downscaled Global Forecasts on AMD Instinct June 08, 2026	Running SOTA AI-based Weather Forecasting models on AMD Instinct September 18, 2025
Rishi Madduri	3	FlashInfer on ROCm: High‑Throughput Prefill Attention via AITER April 06, 2026	Efficient MoE training on AMD ROCm: How-to use MegaBlocks on AMD GPUs March 23, 2025
Fan Wang	3	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026	Accelerating Autonomous Driving Model Training on AMD ROCm™ Software December 08, 2025
Baiqiang Xia	3	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026	Running SOTA AI-based Weather Forecasting models on AMD Instinct September 18, 2025
Mehdi Saeedi	3	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026	Training a Robotic Arm Using MuJoCo and JAX on AMD Hardware with ROCm™ March 31, 2026
Arttu Niemela	3	HPC Coding Agent - Part 3: MCP Tool for Profiling March 06, 2026	Wan2.2 Fine-Tuning: Tailoring an Advanced Video Generation Model on a Single GPU August 19, 2025
Victor Robles	3	AI Inference Orchestration with Kubernetes on Instinct MI300X, Part 3 March 13, 2025	AI Inference Orchestration with Kubernetes on Instinct MI300X, Part 1 February 07, 2025
Mahdi Ghodsi	3	Day 0 Developer Guide: Running the Latest Open Models from OpenAI on AMD AI Hardware August 05, 2025	Power Up Qwen 3 with AMD Instinct: A Developer’s Day 0 Quickstart April 28, 2025
Chun Fang	3	Fast Image Generation and Editing with SGLang Diffusion on AMD GPUs July 10, 2026	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025
Yu Wang	3	AMD Enterprise AI Suite: Open Infrastructure for Production AI November 17, 2025	AMD Advances Enterprise AI Through OPEA Integration March 12, 2025
Sebastian Remander	3	GROMACS on AMD Instinct GPUs: A Complete Build Guide March 24, 2026	Installing AMD HIP-Enabled GROMACS on HPC Systems: A LUMI Supercomputer Case Study January 12, 2026
Ean Garvey	3	Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025	Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.0 Submission April 02, 2025
Kumar Deepak	3	Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025	Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.0 Submission April 02, 2025
Charles Yang	3	Optimizing FP4 Mixed-Precision Inference with Petit on AMD Instinct MI250 and MI300 GPUs: A Developer’s Perspective October 06, 2025	Vibe Coding Pac-Man Inspired Game with DeepSeek-R1 and AMD Instinct MI300X July 17, 2025
Neha Mathews	3	AMD Instinct™ GPUs MLPerf Inference v6.0 Submission April 01, 2026	Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission September 09, 2025
Rajesh Poornachandran	3	AMD Instinct™ GPUs MLPerf Inference v6.0 Submission April 01, 2026	Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025
Nico Holmberg	3	AMD Instinct™ GPUs MLPerf Inference v6.0 Submission April 01, 2026	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025
Satya Jandhyala	3	Exploring Use Cases for Scalable AI: Implementing Ray with ROCm 7 Support for Efficient ML Workflows February 27, 2026	Reinforcement Learning from Human Feedback on AMD GPUs with verl and ROCm 7.0.0 February 12, 2026
Ted Themistokleous	2	From Build to Benchmark: ONNX Model Serving with Triton Inference Server on AMD GPUs May 22, 2026	Triton Inference Server with vLLM on AMD GPUs January 08, 2025
Jorge Parada	2	From Build to Benchmark: ONNX Model Serving with Triton Inference Server on AMD GPUs May 22, 2026	Scale LLM Inference with Multi-Node Infrastructure May 30, 2025
Chuan Li	2	ATOM: Unlocking Extreme AMD Instinct Inference with Software-Hardware Co-Optimization June 15, 2026	vLLM-ATOM: Unlocking Native AMD Performance in the vLLM Ecosystem May 07, 2026
Ethan Yang	2	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026	GEAK HIP: Expanding GEAK for HIP Code Optimization December 23, 2025
Kevin Joseph	2	From Vector Search to Agentic RAG: Building an Enterprise Research Analyst with hipVS July 15, 2026	Accelerating Vector Search: hipVS and hipRAFT on AMD November 13, 2025
Grant Pinkert	2	From Vector Search to Agentic RAG: Building an Enterprise Research Analyst with hipVS July 15, 2026	Plug-and-Play CuPy on ROCm: Data Analytics Acceleration Made Simple November 14, 2025
Sukriti Choudhary	2	From Vector Search to Agentic RAG: Building an Enterprise Research Analyst with hipVS July 15, 2026	Accelerating Vector Search: hipVS and hipRAFT on AMD November 13, 2025
Sujin Philip	2	From Vector Search to Agentic RAG: Building an Enterprise Research Analyst with hipVS July 15, 2026	Accelerating Vector Search: hipVS and hipRAFT on AMD November 13, 2025
Lalith Narasimhan	2	From Vector Search to Agentic RAG: Building an Enterprise Research Analyst with hipVS July 15, 2026	Accelerating Vector Search: hipVS and hipRAFT on AMD November 13, 2025
Zhen Huang	2	Dropless MoE Training in JAX with Primus-Turbo June 10, 2026	MoE Training Best Practices on AMD GPUs December 16, 2025
Xiaobing Zhang	2	Scaling MiniMax-M3 Inference with Distributed Serving and Operator Co-Design on AMD Instinct MI355X GPUs July 21, 2026	Accelerating LLM Inference on AMD GPUs with Low-Latency GEMMs June 29, 2026
Kaiping Lu	2	RDC and RocProfiler Compared to DCGM for Commonly Used Metrics July 07, 2026	Optimizing MI300X Inter-Chiplet Communication via the RCCL Tuner API June 30, 2026
Kerwin Tsai	2	Optimizing MI300X Inter-Chiplet Communication via the RCCL Tuner API June 30, 2026	Digital Twins on AMD: Building Robotic Simulations Using Edge AI PCs February 09, 2026
Shijie Feng	2	Getting Started with FlyDSL Nightly Wheels on ROCm April 20, 2026	FlyDSL: Expert GPU Kernel Development with the Ease of MLIR Python Native DSL on AMD GPUs February 20, 2026
Dewei Wang	2	Getting Started with FlyDSL Nightly Wheels on ROCm April 20, 2026	FlyDSL: Expert GPU Kernel Development with the Ease of MLIR Python Native DSL on AMD GPUs February 20, 2026
Jun Kang Chow	2	Accelerating Multimodal Inference in vLLM: The One-Line Optimization for Large Multimodal Models January 02, 2026	The vLLM MoE Playbook: A Practical Guide to TP, DP, PP and Expert Parallelism November 24, 2025
Ye Hur Cheong	2	Accelerating Multimodal Inference in vLLM: The One-Line Optimization for Large Multimodal Models January 02, 2026	The vLLM MoE Playbook: A Practical Guide to TP, DP, PP and Expert Parallelism November 24, 2025
Jinze Li	2	FLy: A New Paradigm for Speculative Decoding — Accepting Semantically Correct Drafts Beyond Exact Match April 20, 2026	Gumiho: A New Paradigm for Speculative Decoding — Earlier Tokens in a Draft Sequence Matter More October 14, 2025
Jiangyong Ren	2	Advanced MXFP4 Quantization: Combining Fine-Tuned Rotations with SmoothQuant for Near-Lossless Compression February 17, 2026	QuickReduce: Up to 3x Faster All-reduce for vLLM and SGLang August 26, 2025
Zhaofeng Zhang	2	Advanced MXFP4 Quantization: Combining Fine-Tuned Rotations with SmoothQuant for Near-Lossless Compression February 17, 2026	High-Accuracy MXFP4, MXFP6, and Mixed-Precision Models on AMD GPUs October 29, 2025
Lei Zhang antiagainst	2	Attention Decode on AMD MI450 GPUs: A Gluon Kernel Optimization Guide July 27, 2026	Unleash Full GPU Potential: Overlap Communication and Computation with Triton-Distributed May 06, 2025
Fan Wu	2	Kimi-K2-Instruct: Enhanced Out-of-the-Box Performance on AMD Instinct MI355 Series GPUs October 16, 2025	Unleash Full GPU Potential: Overlap Communication and Computation with Triton-Distributed May 06, 2025
Devang Patel	2	Primus Tuning Agent: Closing the Configuration-Search Loop July 06, 2026	Primus Projection: Estimate Memory and Performance Before You Train April 24, 2026
Peyman Razaghi	2	Primus Tuning Agent: Closing the Configuration-Search Loop July 06, 2026	Primus Projection: Estimate Memory and Performance Before You Train April 24, 2026
William Anzen	2	Leveraging AMD AI Workbench and Autoscaling to Scale LLM Inference for Optimal Resource Utilization March 31, 2026	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025
Yan Sun	2	Spur: Modern GPU Job Scheduling for HPC and AI Workloads July 22, 2026	Reimagining GPU Allocation in Kubernetes: Introducing the AMD GPU DRA Driver January 13, 2026
Juergen Frick	2	Spur: Modern GPU Job Scheduling for HPC and AI Workloads July 22, 2026	Introducing ROCm™ AMD Infinity Context: A Purpose-Built KV Cache Tier for Distributed Inference July 22, 2026
Ben Sander	2	Measuring Max-Achievable FLOPs – Part 2 February 28, 2025	Understanding Peak, Max-Achievable & Delivered FLOPs, Part 1 February 14, 2025
Henry Ho	2	Customizing Kernels with hipBLASLt TensileLite GEMM Tuning - Advanced User Guide April 06, 2026	Measuring Max-Achievable FLOPs – Part 2 February 28, 2025
Christian Gilli	2	Porting High-Performance HIP Kernels to FlyDSL July 09, 2026	Deep Dive Into 4-Wave Interleave FP8 GEMM May 27, 2026
Maria Ruiz Varela	2	Application portability with HIP April 26, 2024	AMD Instinct™ MI200 GPU memory space overview March 09, 2023
Mohit Deopujari	2	Elevate Your LLM Inference: Autoscaling with Ray, ROCm 7.0.0, and SkyPilot February 13, 2026	LLM Inference Optimization Using AMD GPU Partitioning January 22, 2026
Damon McDougall	2	GPU-aware MPI with ROCm June 08, 2023	AMD matrix cores November 14, 2022
Noel Chalmers	2	GPU-aware MPI with ROCm June 08, 2023	AMD matrix cores November 14, 2022
Dan Li	2	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026	GEAK Agent-Driven Optimization of the DeepSeekV4 MLA Kernel July 13, 2026
Arthur Huang	2	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026	GEAK Agent-Driven Optimization of the DeepSeekV4 MLA Kernel July 13, 2026
Adeem Jassani	2	TraceLens: Democratizing AI Performance Analysis April 27, 2026	A Step-by-Step Walkthrough of Decentralized LLM Training on AMD GPUs December 18, 2025
Deval Shah	2	Accelerating Mixture-of-Experts Execution with FarSkip-Collective Models May 05, 2026	TraceLens: Democratizing AI Performance Analysis April 27, 2026
Qiang Han	2	Dropless MoE Training in JAX with Primus-Turbo June 10, 2026	MaxText-Slurm: Production-Grade LLM Training with Built-In Observability March 02, 2026
Huasha Zhao	2	Dropless MoE Training in JAX with Primus-Turbo June 10, 2026	MaxText-Slurm: Production-Grade LLM Training with Built-In Observability March 02, 2026
Yuhua Zhu	2	Scaling MiniMax-M3 Inference with Distributed Serving and Operator Co-Design on AMD Instinct MI355X GPUs July 21, 2026	SGLang-ATOM: Bring ROCm-Native Acceleration to SGLang Serving July 08, 2026
Mehdi Rezagholizadeh	2	AgentKernelArena: Benchmarking AI Coding Agents for GPU Kernel Optimization on AMD Instinct GPUs July 03, 2026	AMD-HybridLM: Towards Extremely Efficient Hybrid Language Models September 17, 2025
Akshay Viswanathan	2	Deploy an Imaging AMD Solution Blueprint on AMD Radeon™ GPUs July 22, 2026	Getting Started with AMD Resource Manager: Efficient Sharing of AMD Instinct™ GPUs for R&D Teams and AI Practitioners February 24, 2026
Ammar Elwazir	2	Introducing ROCprofiler SDK - The Latest Toolkit for Performance Profiling March 25, 2025	Introducing ROCprofiler SDK - The Latest Toolkit for Performance Profiling. March 25, 2025
Mittul Singh	2	Installing AMD HIP-Enabled GROMACS on HPC Systems: A LUMI Supercomputer Case Study January 12, 2026	3D Scene Reconstruction from the Inside: Explore the Mathematics Behind gsplat December 16, 2025
Claire Lee	2	MaxText-Slurm: Production-Grade LLM Training with Built-In Observability March 02, 2026	Streamlining Recommendation Model Training on AMD Instinct™ GPUs March 02, 2026
Juho Vainio	2	Getting Started with AMD AI Workbench: Deploying and Managing AI Workloads December 19, 2025	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025
Kenny Roche	2	ROCm Becomes a First-Class Platform in the vLLM Ecosystem January 21, 2026	AMD Integrates llm-d on AMD Instinct MI300X Cluster For Distributed LLM Serving May 20, 2025
Doug Lehr	2	ROCm Becomes a First-Class Platform in the vLLM Ecosystem January 21, 2026	QuickReduce: Up to 3x Faster All-reduce for vLLM and SGLang August 26, 2025
Zejun Chen	2	Scaling MiniMax-M3 Inference with Distributed Serving and Operator Co-Design on AMD Instinct MI355X GPUs July 21, 2026	vLLM-ATOM: Unlocking Native AMD Performance in the vLLM Ecosystem May 07, 2026
Lixun Zhang	2	Attention Decode on AMD MI450 GPUs: A Gluon Kernel Optimization Guide July 27, 2026	From Naive to Near-Peak: Building High-Performance GEMM Kernels with Gluon May 22, 2026
Jason Furmanek	2	OpenXLA and JAX - ROCm Support and the State of CI June 29, 2026	From Naive to Near-Peak: Building High-Performance GEMM Kernels with Gluon May 22, 2026
Haocong Wang	2	Avoiding LDS Bank Conflicts on AMD GPUs Using CK-Tile Framework July 25, 2025	From Theory to Kernel: Implement FlashAttention-v2 with CK-Tile May 21, 2025
Bill He	2	Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X November 12, 2025	Power Up Qwen 3 with AMD Instinct: A Developer’s Day 0 Quickstart April 28, 2025
Arsalan Farooq	2	Hyperloom - Autonomous Agentic Inference Optimization for AMD GPUs July 23, 2026	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026
Marc Dillon	2	Efficient GPU Utilization With Workload Pre-Emption in AMD Resource Manager June 26, 2026	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025
George Markomanolis	2	Affinity part 2 - System topology and controlling affinity April 16, 2024	Affinity part 1 - Affinity, placement, and order April 16, 2024
Muhammad Osama	2	Deep dive into the MI300 compute and memory partition modes February 09, 2025	Graph analytics on AMD GPUs using Gunrock July 29, 2024
Ryan Swann	2	Accelerating LLM Inference: Up to 3x Speedup on MI300X with Speculative Decoding March 27, 2025	Deep dive into the MI300 compute and memory partition modes February 09, 2025
Karthik Sangaiah	2	Accelerating LLM Inference: Up to 3x Speedup on MI300X with Speculative Decoding March 27, 2025	Deep dive into the MI300 compute and memory partition modes February 09, 2025
Sonali Singh	2	Accelerating LLM Inference: Up to 3x Speedup on MI300X with Speculative Decoding March 27, 2025	Deep dive into the MI300 compute and memory partition modes February 09, 2025
Ganesh Dasika	2	Accelerating LLM Inference: Up to 3x Speedup on MI300X with Speculative Decoding March 27, 2025	Deep dive into the MI300 compute and memory partition modes February 09, 2025
Nhat Vo	2	ORBIT-2 based Weather and Climate Downscaling and Downscaled Global Forecasts on AMD Instinct June 08, 2026	Utilizing AMD Instinct GPU Accelerators for Weather and Precipitation Forecasting with NeuralGCM March 19, 2026
Sarunas Kalade	2	Building Robotics Applications with Ryzen AI and ROS 2 February 09, 2026	Fine-tuning Robotics Vision Language Action Models with AMD ROCm and LeRobot July 14, 2025
Graham Schelle	2	Building Robotics Applications with Ryzen AI and ROS 2 February 09, 2026	Fine-tuning Robotics Vision Language Action Models with AMD ROCm and LeRobot July 14, 2025
Aditya Kumar Singh	2	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026	Instella-VL-1B: First AMD Vision Language Model March 07, 2025
Diptorup Deb	2	FlashInfer on ROCm: High‑Throughput Prefill Attention via AITER April 06, 2026	Enabling FlashInfer on ROCm for Accelerated LLM Serving October 01, 2025
Joaquin Rives Gambin	2	Medical Imaging on MI300X: SwinUNETR Inference Optimization December 10, 2025	Medical Imaging on MI300X: Optimized SwinUNETR for Tumor Detection October 07, 2025
Vasumathi Neralla	2	Medical Imaging on MI300X: Optimized SwinUNETR for Tumor Detection October 07, 2025	Optimizing Drug Discovery Tools on AMD MI300X Part 2: 3D Molecular Generation with SemlaFlow October 03, 2025
Sudhir Kylasa	2	Technical Dive into AMD's MLPerf Training v6.0 Submission June 16, 2026	Reproducing AMD MLPerf Training v6.0 Submission Result June 16, 2026
Umang Pandey	2	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026	GEAK-Triton v2 Family of AI Agents: Kernel Optimization for AMD Instinct GPUs December 23, 2025
HongTao Meng	2	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026	Streamlining Recommendation Model Training on AMD Instinct™ GPUs March 02, 2026
Balazs Toth	2	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026	Wan2.2 Fine-Tuning: Tailoring an Advanced Video Generation Model on a Single GPU August 19, 2025
Bryan Varble	2	AMD Instinct™ Network Traffic, Congestion Trends, and Harmonics in Scale-Out Networks for AI Training Clusters July 09, 2026	Comparative Analysis of Scale-Out RoCE Network Traffic Patterns and Loads in Training Large Language Models June 18, 2026
Juho Kerttula	2	Fine-Tune LLMs for Proteins with AMD Enterprise AI Suite November 27, 2025	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025
Levent Guner	2	VLM Fine-Tuning for Robotics on AMD Enterprise AI Suite November 28, 2025	Fine-Tune LLMs for Proteins with AMD Enterprise AI Suite November 27, 2025
James E. T. Smith	2	DGL in Depth: SE(3)-Transformer on ROCm 7 December 05, 2025	DGL in the Real World: Running GNNs on Real Use Cases August 20, 2025
Takashi Isobe	2	Triton-Based Optimization of Video Sparse Attention on ROCm July 13, 2026	AMD Hummingbird Image to Video: A Lightweight Feedback-Driven Model for Efficient Image-to-Video Generation August 03, 2025
Chaitanya Manem	2	Building a State-of-the-Art 32 Billion Reasoning Model with Only Synthetic Data on AMD GPUs December 06, 2025	Introducing Instella: New State-of-the-art Fully Open 3B Language Models March 05, 2025
Nick Romero	2	PyTorch Offline Tuning with TunableOp February 24, 2026	Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring October 21, 2025
Hai Xiao	2	Supercharge DeepSeek-R1 Inference on AMD Instinct MI300X March 21, 2025	SGLang: Fast Serving Framework for Large Language and Vision-Language Models on AMD Instinct GPUs November 13, 2024
Daniel Warna	2	Training AI Weather Forecasting Models on AMD Instinct November 10, 2025	Running SOTA AI-based Weather Forecasting models on AMD Instinct September 18, 2025
Zhenhua Liu	2	Athena-PRM: Enhancing Multimodal Reasoning with Data-Efficient Process Reward Models January 12, 2026	Introducing AMD EVLM: Efficient Vision-Language Models with Parameter-Space Visual Conditioning August 22, 2025
Chengjia Huang	2	Local Image and Video Generation on AMD Ryzen™ AI Max+ Processor (Windows) July 14, 2026	AI Inference on AMD Ryzen™ AI Max Processor May 25, 2026
Hao Chen	2	Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs September 25, 2025	Instella-T2I: Open-Source Text-to-Image with 1D Tokenizer and 32× Token Reduction on AMD GPUs July 15, 2025
Noah Monti	2	Utilizing AMD Schola and UnrealRoboticsLab with AMD ROCm™ Software to Train a Robotic Arm June 17, 2026	Training a Robotic Arm Using MuJoCo and JAX on AMD Hardware with ROCm™ March 31, 2026
Michael Zhang	2	SGLang: Fast Serving Framework for Large Language and Vision-Language Models on AMD Instinct GPUs November 13, 2024	CTranslate2: Efficient Inference with Transformer Models on AMD GPUs October 24, 2024
Alexander Finn	2	AMD Enterprise AI Suite: Open Infrastructure for Production AI November 17, 2025	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025
Gulsum Gudukbay Akbulut	2	ROCm Fork of MaxText: Structure and Strategy January 06, 2026	ROCm MaxText Testing — Decoupled (Offline) and Cloud-Integrated Modes January 06, 2026
Kai Hakala	2	Enabling Language-specific Reasoning in Multilingual Models with Reinforcement Learning July 24, 2026	Continued Pretraining: A Practical Playbook for Language-Specific LLM Adaptation June 18, 2025
Jouni Luoma	2	Enabling Language-specific Reasoning in Multilingual Models with Reinforcement Learning July 24, 2026	Continued Pretraining: A Practical Playbook for Language-Specific LLM Adaptation June 18, 2025
Elaine Zosa	2	Enabling Language-specific Reasoning in Multilingual Models with Reinforcement Learning July 24, 2026	Continued Pretraining: A Practical Playbook for Language-Specific LLM Adaptation June 18, 2025
Mika Koistinen	2	Enabling Language-specific Reasoning in Multilingual Models with Reinforcement Learning July 24, 2026	Continued Pretraining: A Practical Playbook for Language-Specific LLM Adaptation June 18, 2025
Jonathan Burdge	2	Enabling Language-specific Reasoning in Multilingual Models with Reinforcement Learning July 24, 2026	Continued Pretraining: A Practical Playbook for Language-Specific LLM Adaptation June 18, 2025
Paul Bauer	2	GROMACS Performance on AMD Instinct MI355X March 13, 2026	Installing AMD HIP-Enabled GROMACS on HPC Systems: A LUMI Supercomputer Case Study January 12, 2026
Subhajit Dutta Chowdhury	2	Enabling Speculative Speculative Decoding on MI300X May 29, 2026	Optimizing LLM Workloads: AMD Instinct MI355X GPUs Drive Competitive Performance December 02, 2025
Alexander Aurell	2	Solution Blueprints: Accelerating AI Deployment with AMD Enterprise AI February 11, 2026	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025
Saroosh Shabbir	2	Solution Blueprints: Accelerating AI Deployment with AMD Enterprise AI February 11, 2026	Retrieval Augmented Generation (RAG) with vLLM, LangChain and Chroma November 04, 2025
Yamini Preethi Kamisetty	2	AMD Instinct™ GPUs MLPerf Inference v6.0 Submission April 01, 2026	Reproducing the AMD MLPerf Inference v6.0 Submission Result April 01, 2026
Jesus Carabano Bravo	2	AMD Instinct™ GPUs MLPerf Inference v6.0 Submission April 01, 2026	Reproducing the AMD MLPerf Inference v6.0 Submission Result April 01, 2026
Mikko Lauri	2	AMD Instinct™ GPUs MLPerf Inference v6.0 Submission April 01, 2026	Reproducing the AMD MLPerf Inference v6.0 Submission Result April 01, 2026
Steve Reinhardt	2	Streamlining Recommendation Model Training on AMD Instinct™ GPUs March 02, 2026	Breaking the Accuracy-Speed Barrier: How MXFP4/6 Quantization Revolutionizes Image and Video Generation January 07, 2026
Yonatan Dukler	2	Introducing Instella-MoE: A State-of-the-Art Fully Open Mixture-of-Experts Language Model July 24, 2026	Accelerating Mixture-of-Experts Execution with FarSkip-Collective Models May 05, 2026
Guihong Li	2	Accelerating Mixture-of-Experts Execution with FarSkip-Collective Models May 05, 2026	AMD-HybridLM: Towards Extremely Efficient Hybrid Language Models September 17, 2025
Han Lin	2	hipBLASLt Online GEMM Tuning March 19, 2026	Day 0 Developer Guide: hipBLASLt Offline GEMM Tuning Script November 05, 2025
Zhaodong Bing	2	Out-of-the-Box ROLL Support on AMD GPUs: Accelerating Reinforcement Learning at Scale June 01, 2026	Accelerating Autonomous Driving Model Training on AMD ROCm™ Software December 08, 2025
Teemu Karkkainen	2	VLM Fine-Tuning for Robotics on AMD Enterprise AI Suite November 28, 2025	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025
Justin Chu	2	Building Robotics Applications with Ryzen AI and ROS 2 February 09, 2026	STX-B0T: Real-time AI Robot Assistant Powered by RyzenAI and ROCm October 23, 2025
Vivian Cheng	2	Building Robotics Applications with Ryzen AI and ROS 2 February 09, 2026	STX-B0T: Real-time AI Robot Assistant Powered by RyzenAI and ROCm October 23, 2025
Danny Guan	2	ROCm 7.0: An AI-Ready Powerhouse for Performance, Efficiency, and Productivity September 16, 2025	ROCm Gets Modular: Meet the Instinct Datacenter GPU Driver April 11, 2025
Aditya Bhattacharji	2	ROCm 7.0: An AI-Ready Powerhouse for Performance, Efficiency, and Productivity September 16, 2025	ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software April 11, 2025
Layla Frischman	2	ROCm 7.14: TheRock Goes Production and Expands AMD's AI Software Platform July 15, 2026	ROCm 7.13: Expanding Hardware, Tools, and Reach May 20, 2026
Pratik Mishra	2	Elevate Your LLM Inference: Autoscaling with Ray, ROCm 7.0.0, and SkyPilot February 13, 2026	Democratizing AI Compute with AMD Using SkyPilot November 13, 2025
Deepan Sekar	2	Accelerating llama.cpp on AMD Instinct MI300X December 11, 2025	Llama.cpp Meets Instinct: A New Era of Open-Source AI Acceleration September 09, 2025
Pei Zhang	2	Accelerating llama.cpp on AMD Instinct MI300X December 11, 2025	Llama.cpp Meets Instinct: A New Era of Open-Source AI Acceleration September 09, 2025
Matthias Reso	1	Chain-of-Thought Guided Visual Reasoning Using Llama 3.2 on a Single AMD Instinct MI300X GPU July 21, 2025	Chain-of-Thought Guided Visual Reasoning Using Llama 3.2 on a Single AMD Instinct MI300X GPU July 21, 2025
Zhu Shan	1	Fine-Tuning LLMs with GRPO on AMD MI300X: Scalable RLHF with Hugging Face TRL and ROCm June 18, 2025	Fine-Tuning LLMs with GRPO on AMD MI300X: Scalable RLHF with Hugging Face TRL and ROCm June 18, 2025
Lihuan Zhang	1	MoE Training Best Practices on AMD GPUs December 16, 2025	MoE Training Best Practices on AMD GPUs December 16, 2025
Ruibin Zhang	1	MoE Training Best Practices on AMD GPUs December 16, 2025	MoE Training Best Practices on AMD GPUs December 16, 2025
Kyle Zhao	1	MoE Training Best Practices on AMD GPUs December 16, 2025	MoE Training Best Practices on AMD GPUs December 16, 2025
Vinay Joshi	1	GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks August 01, 2025	GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks August 01, 2025
Ruturaj Kiran Vaidya	1	JAX-AITER: Bringing AMD’s Optimized AI Kernels to JAX on ROCm™ February 24, 2026	JAX-AITER: Bringing AMD’s Optimized AI Kernels to JAX on ROCm™ February 24, 2026
Nicholas Curtis	1	Register pressure in AMD CDNA™2 GPUs May 17, 2023	Register pressure in AMD CDNA™2 GPUs May 17, 2023
Garrett Byrd	1	Installing ROCm from source with Spack April 14, 2025	Installing ROCm from source with Spack April 14, 2025
Joseph Schoonover	1	Installing ROCm from source with Spack April 14, 2025	Installing ROCm from source with Spack April 14, 2025
Yutao Xu	1	Accelerating LLM Inference on AMD GPUs with Low-Latency GEMMs June 29, 2026	Accelerating LLM Inference on AMD GPUs with Low-Latency GEMMs June 29, 2026
Kiran Thumma	1	Getting Started with FlyDSL Nightly Wheels on ROCm April 20, 2026	Getting Started with FlyDSL Nightly Wheels on ROCm April 20, 2026
Satya Ramji Ainapurapu	1	Getting Started with FlyDSL Nightly Wheels on ROCm April 20, 2026	Getting Started with FlyDSL Nightly Wheels on ROCm April 20, 2026
Jiahui Cao	1	FP8 GEMM Optimization on AMD CDNA™4 Architecture March 10, 2026	FP8 GEMM Optimization on AMD CDNA™4 Architecture March 10, 2026
Corbin Robeck	1	Reading AMD GPU ISA May 13, 2024	Reading AMD GPU ISA May 13, 2024
Sundara Murthy Gurunathan	1	Introducing the AMD Network Operator v1.0.0: Simplifying High-Performance Networking for AMD Platforms January 08, 2026	Introducing the AMD Network Operator v1.0.0: Simplifying High-Performance Networking for AMD Platforms January 08, 2026
Yuvarani Shankar	1	Introducing the AMD Network Operator v1.0.0: Simplifying High-Performance Networking for AMD Platforms January 08, 2026	Introducing the AMD Network Operator v1.0.0: Simplifying High-Performance Networking for AMD Platforms January 08, 2026
Kyle Wang	1	Unleash Full GPU Potential: Overlap Communication and Computation with Triton-Distributed May 06, 2025	Unleash Full GPU Potential: Overlap Communication and Computation with Triton-Distributed May 06, 2025
Nowy Condro	1	Leveraging AMD AI Workbench and Autoscaling to Scale LLM Inference for Optimal Resource Utilization March 31, 2026	Leveraging AMD AI Workbench and Autoscaling to Scale LLM Inference for Optimal Resource Utilization March 31, 2026
Janet Tseng	1	ROCm 7.9 Technology Preview: ROCm Core SDK and TheRock Build System October 20, 2025	ROCm 7.9 Technology Preview: ROCm Core SDK and TheRock Build System October 20, 2025
Scott Todd	1	ROCm 7.9 Technology Preview: ROCm Core SDK and TheRock Build System October 20, 2025	ROCm 7.9 Technology Preview: ROCm Core SDK and TheRock Build System October 20, 2025
Chris Sosa	1	ROCm 7.9 Technology Preview: ROCm Core SDK and TheRock Build System October 20, 2025	ROCm 7.9 Technology Preview: ROCm Core SDK and TheRock Build System October 20, 2025
Shiv Tyagi	1	Spur: Modern GPU Job Scheduling for HPC and AI Workloads July 22, 2026	Spur: Modern GPU Job Scheduling for HPC and AI Workloads July 22, 2026
Sudheendra Gopinath	1	Spur: Modern GPU Job Scheduling for HPC and AI Workloads July 22, 2026	Spur: Modern GPU Job Scheduling for HPC and AI Workloads July 22, 2026
Tomas Saaristola	1	Adapting AIM LLMs For Specific Use Cases Through Fine-Tuning in AMD AI Workbench June 03, 2026	Adapting AIM LLMs For Specific Use Cases Through Fine-Tuning in AMD AI Workbench June 03, 2026
Aku Rouhe	1	Adapting AIM LLMs For Specific Use Cases Through Fine-Tuning in AMD AI Workbench June 03, 2026	Adapting AIM LLMs For Specific Use Cases Through Fine-Tuning in AMD AI Workbench June 03, 2026
David Doscher	1	AMD ROCm™ installation January 26, 2023	AMD ROCm™ installation January 26, 2023
Evan Masters	1	Measuring Max-Achievable FLOPs – Part 2 February 28, 2025	Measuring Max-Achievable FLOPs – Part 2 February 28, 2025
Babak Poursartip	1	Measuring Max-Achievable FLOPs – Part 2 February 28, 2025	Measuring Max-Achievable FLOPs – Part 2 February 28, 2025
Lihuan Zhang	1	Primus-Pipeline: A More Flexible and Scalable Pipeline Parallelism Implementation February 23, 2026	Primus-Pipeline: A More Flexible and Scalable Pipeline Parallelism Implementation February 23, 2026
Yuechao Guo, Yajie Zhang, Lirong Zhang, Zhen Wan, Jiaoliang Yu, Zufa Yu, Hattie Wu, Lingpeng Jin, Carlus Huang, Andy Luo, Peng Sun, Barsoum Emad	1	ATOMesh: Unlocking AMD Hardware for Scalable LLM Serving June 16, 2026	ATOMesh: Unlocking AMD Hardware for Scalable LLM Serving June 16, 2026
Pedram Alizadeh	1	Understanding RCCL Bandwidth and xGMI Performance on AMD Instinct™ MI300X March 02, 2025	Understanding RCCL Bandwidth and xGMI Performance on AMD Instinct™ MI300X March 02, 2025
Gilbert Lee	1	Understanding RCCL Bandwidth and xGMI Performance on AMD Instinct™ MI300X March 02, 2025	Understanding RCCL Bandwidth and xGMI Performance on AMD Instinct™ MI300X March 02, 2025
Abhishek Patil	1	Unlocking GPU-Accelerated Containers with the AMD Container Toolkit July 03, 2025	Unlocking GPU-Accelerated Containers with the AMD Container Toolkit July 03, 2025
Aleksa Arsic	1	Understanding Attention Algorithms and Their Backends for Image and Video Generation July 20, 2026	Understanding Attention Algorithms and Their Backends for Image and Video Generation July 20, 2026
Filip Jankovic	1	Understanding Attention Algorithms and Their Backends for Image and Video Generation July 20, 2026	Understanding Attention Algorithms and Their Backends for Image and Video Generation July 20, 2026
Liang Shen	1	Understanding Attention Algorithms and Their Backends for Image and Video Generation July 20, 2026	Understanding Attention Algorithms and Their Backends for Image and Video Generation July 20, 2026
Hosang Yoon	1	Understanding Attention Algorithms and Their Backends for Image and Video Generation July 20, 2026	Understanding Attention Algorithms and Their Backends for Image and Video Generation July 20, 2026
Warren Eng	1	Running ComfyUI in Windows with ROCm on WSL August 07, 2025	Running ComfyUI in Windows with ROCm on WSL August 07, 2025
Alireza Sariaslani	1	GPU Partitioning Made Easy: Pack More AI Workloads Using AMD GPU Operator October 01, 2025	GPU Partitioning Made Easy: Pack More AI Workloads Using AMD GPU Operator October 01, 2025
David Silverstone	1	LLM Inference Optimization Using AMD GPU Partitioning January 22, 2026	LLM Inference Optimization Using AMD GPU Partitioning January 22, 2026
Bill Ku	1	Building and Deploying Custom hipBLASLt Libraries on AMD Instinct GPUs June 18, 2026	Building and Deploying Custom hipBLASLt Libraries on AMD Instinct GPUs June 18, 2026
Cheng Ling	1	SmoothQuant model inference on AMD Instinct MI300X using Composable Kernel May 31, 2024	SmoothQuant model inference on AMD Instinct MI300X using Composable Kernel May 31, 2024
Rene Van Oostrum	1	AMD matrix cores November 14, 2022	AMD matrix cores November 14, 2022
Nicholas Malaya	1	AMD matrix cores November 14, 2022	AMD matrix cores November 14, 2022
Daniel Velicka	1	AMD matrix cores November 14, 2022	AMD matrix cores November 14, 2022
Nitish Bhat	1	Reimagining GPU Allocation in Kubernetes: Introducing the AMD GPU DRA Driver January 13, 2026	Reimagining GPU Allocation in Kubernetes: Introducing the AMD GPU DRA Driver January 13, 2026
Brian Chang	1	RDC and RocProfiler Compared to DCGM for Commonly Used Metrics July 07, 2026	RDC and RocProfiler Compared to DCGM for Commonly Used Metrics July 07, 2026
Dmitrii Galantsev	1	RDC and RocProfiler Compared to DCGM for Commonly Used Metrics July 07, 2026	RDC and RocProfiler Compared to DCGM for Commonly Used Metrics July 07, 2026
Giovanni Lenzi Baraldi	1	RDC and RocProfiler Compared to DCGM for Commonly Used Metrics July 07, 2026	RDC and RocProfiler Compared to DCGM for Commonly Used Metrics July 07, 2026
Benjamin Welton	1	RDC and RocProfiler Compared to DCGM for Commonly Used Metrics July 07, 2026	RDC and RocProfiler Compared to DCGM for Commonly Used Metrics July 07, 2026
Gabriel Weisz	1	TraceLens: Democratizing AI Performance Analysis April 27, 2026	TraceLens: Democratizing AI Performance Analysis April 27, 2026
Spandan More	1	TraceLens: Democratizing AI Performance Analysis April 27, 2026	TraceLens: Democratizing AI Performance Analysis April 27, 2026
Steven K. Reinhardt	1	TraceLens: Democratizing AI Performance Analysis April 27, 2026	TraceLens: Democratizing AI Performance Analysis April 27, 2026
Hyukjoon Lee	1	vLLM V1 Meets AMD Instinct GPUs: A New Era for LLM Inference Performance July 07, 2025	vLLM V1 Meets AMD Instinct GPUs: A New Era for LLM Inference Performance July 07, 2025
Yuechao Guo	1	SGLang-ATOM: Bring ROCm-Native Acceleration to SGLang Serving July 08, 2026	SGLang-ATOM: Bring ROCm-Native Acceleration to SGLang Serving July 08, 2026
Qianyun Chu	1	SGLang-ATOM: Bring ROCm-Native Acceleration to SGLang Serving July 08, 2026	SGLang-ATOM: Bring ROCm-Native Acceleration to SGLang Serving July 08, 2026
Zhiwei Yan	1	SGLang-ATOM: Bring ROCm-Native Acceleration to SGLang Serving July 08, 2026	SGLang-ATOM: Bring ROCm-Native Acceleration to SGLang Serving July 08, 2026
Zhen Wan	1	SGLang-ATOM: Bring ROCm-Native Acceleration to SGLang Serving July 08, 2026	SGLang-ATOM: Bring ROCm-Native Acceleration to SGLang Serving July 08, 2026
Ling Zhang	1	SGLang-ATOM: Bring ROCm-Native Acceleration to SGLang Serving July 08, 2026	SGLang-ATOM: Bring ROCm-Native Acceleration to SGLang Serving July 08, 2026
Wen Xie, Yao Fu	1	Primus: A Lightweight, Unified Training Framework for Large Models on AMD GPUs August 22, 2025	Primus: A Lightweight, Unified Training Framework for Large Models on AMD GPUs August 22, 2025
Huidong Ji	1	Serving NVFP4 Models on AMD Instinct™ MI355 Accelerators July 13, 2026	Serving NVFP4 Models on AMD Instinct™ MI355 Accelerators July 13, 2026
Jiaxin Wang	1	Serving NVFP4 Models on AMD Instinct™ MI355 Accelerators July 13, 2026	Serving NVFP4 Models on AMD Instinct™ MI355 Accelerators July 13, 2026
Sharareh Younesian	1	AgentKernelArena: Benchmarking AI Coding Agents for GPU Kernel Optimization on AMD Instinct GPUs July 03, 2026	AgentKernelArena: Benchmarking AI Coding Agents for GPU Kernel Optimization on AMD Instinct GPUs July 03, 2026
Wenwen Ouyang	1	AgentKernelArena: Benchmarking AI Coding Agents for GPU Kernel Optimization on AMD Instinct GPUs July 03, 2026	AgentKernelArena: Benchmarking AI Coding Agents for GPU Kernel Optimization on AMD Instinct GPUs July 03, 2026
Sina Rafati	1	AgentKernelArena: Benchmarking AI Coding Agents for GPU Kernel Optimization on AMD Instinct GPUs July 03, 2026	AgentKernelArena: Benchmarking AI Coding Agents for GPU Kernel Optimization on AMD Instinct GPUs July 03, 2026
Sharon Zhou	1	AgentKernelArena: Benchmarking AI Coding Agents for GPU Kernel Optimization on AMD Instinct GPUs July 03, 2026	AgentKernelArena: Benchmarking AI Coding Agents for GPU Kernel Optimization on AMD Instinct GPUs July 03, 2026
Emad Barsoum	1	AgentKernelArena: Benchmarking AI Coding Agents for GPU Kernel Optimization on AMD Instinct GPUs July 03, 2026	AgentKernelArena: Benchmarking AI Coding Agents for GPU Kernel Optimization on AMD Instinct GPUs July 03, 2026
David Prescott	1	Getting Started with AMD Resource Manager: Efficient Sharing of AMD Instinct™ GPUs for R&D Teams and AI Practitioners February 24, 2026	Getting Started with AMD Resource Manager: Efficient Sharing of AMD Instinct™ GPUs for R&D Teams and AI Practitioners February 24, 2026
Anton Smirnov	1	Programming AMD GPUs with Julia April 16, 2024	Programming AMD GPUs with Julia April 16, 2024
Xi Zhao	1	MaxText-Slurm: Production-Grade LLM Training with Built-In Observability March 02, 2026	MaxText-Slurm: Production-Grade LLM Training with Built-In Observability March 02, 2026
Frank Wang	1	MaxText-Slurm: Production-Grade LLM Training with Built-In Observability March 02, 2026	MaxText-Slurm: Production-Grade LLM Training with Built-In Observability March 02, 2026
Yaoming Mu	1	MaxText-Slurm: Production-Grade LLM Training with Built-In Observability March 02, 2026	MaxText-Slurm: Production-Grade LLM Training with Built-In Observability March 02, 2026
Angela Wang	1	MaxText-Slurm: Production-Grade LLM Training with Built-In Observability March 02, 2026	MaxText-Slurm: Production-Grade LLM Training with Built-In Observability March 02, 2026
Alex Voicu	1	C++17 parallel algorithms and HIPSTDPAR April 18, 2024	C++17 parallel algorithms and HIPSTDPAR April 18, 2024
Mark Wevers	1	Deploy an Imaging AMD Solution Blueprint on AMD Radeon™ GPUs July 22, 2026	Deploy an Imaging AMD Solution Blueprint on AMD Radeon™ GPUs July 22, 2026
Praveen Kumar Shanmugam	1	Deploy an Imaging AMD Solution Blueprint on AMD Radeon™ GPUs July 22, 2026	Deploy an Imaging AMD Solution Blueprint on AMD Radeon™ GPUs July 22, 2026
Jarkko Lehtiranta	1	Multi-Node Distributed Inference for Diffusion Models with xDiT March 18, 2026	Multi-Node Distributed Inference for Diffusion Models with xDiT March 18, 2026
Tero Kemppi	1	Multi-Node Distributed Inference for Diffusion Models with xDiT March 18, 2026	Multi-Node Distributed Inference for Diffusion Models with xDiT March 18, 2026
Rony Leppanen	1	Multi-Node Distributed Inference for Diffusion Models with xDiT March 18, 2026	Multi-Node Distributed Inference for Diffusion Models with xDiT March 18, 2026
Johanna Malinen	1	Multi-Node Distributed Inference for Diffusion Models with xDiT March 18, 2026	Multi-Node Distributed Inference for Diffusion Models with xDiT March 18, 2026
Chia Hung	1	GEMM Tuning within hipBLASLt – Part 2 October 09, 2025	GEMM Tuning within hipBLASLt – Part 2 October 09, 2025
Charles Boyd	1	Introducing hipThreads: A C++ - Style Concurrency Library for AMD GPUs February 19, 2026	Introducing hipThreads: A C++ - Style Concurrency Library for AMD GPUs February 19, 2026
Kelvin Lui	1	Introducing hipThreads: A C++ - Style Concurrency Library for AMD GPUs February 19, 2026	Introducing hipThreads: A C++ - Style Concurrency Library for AMD GPUs February 19, 2026
Daniel Mcintosh	1	Introducing hipThreads: A C++ - Style Concurrency Library for AMD GPUs February 19, 2026	Introducing hipThreads: A C++ - Style Concurrency Library for AMD GPUs February 19, 2026
Marko Savic	1	Introducing hipThreads: A C++ - Style Concurrency Library for AMD GPUs February 19, 2026	Introducing hipThreads: A C++ - Style Concurrency Library for AMD GPUs February 19, 2026
Gregory Shtrasberg	1	ROCm Becomes a First-Class Platform in the vLLM Ecosystem January 21, 2026	ROCm Becomes a First-Class Platform in the vLLM Ecosystem January 21, 2026
Simon Mo	1	ROCm Becomes a First-Class Platform in the vLLM Ecosystem January 21, 2026	ROCm Becomes a First-Class Platform in the vLLM Ecosystem January 21, 2026
Alexandru Voicu	1	SPIR-V on ROCm: A Portable IR for AMD GPUs July 20, 2026	SPIR-V on ROCm: A Portable IR for AMD GPUs July 20, 2026
Mark Searles	1	SPIR-V on ROCm: A Portable IR for AMD GPUs July 20, 2026	SPIR-V on ROCm: A Portable IR for AMD GPUs July 20, 2026
Lakhinder Walia	1	SPIR-V on ROCm: A Portable IR for AMD GPUs July 20, 2026	SPIR-V on ROCm: A Portable IR for AMD GPUs July 20, 2026
James Brodman	1	SPIR-V on ROCm: A Portable IR for AMD GPUs July 20, 2026	SPIR-V on ROCm: A Portable IR for AMD GPUs July 20, 2026
Yajie Zhang	1	Scaling MiniMax-M3 Inference with Distributed Serving and Operator Co-Design on AMD Instinct MI355X GPUs July 21, 2026	Scaling MiniMax-M3 Inference with Distributed Serving and Operator Co-Design on AMD Instinct MI355X GPUs July 21, 2026
Yi Gan	1	Scaling MiniMax-M3 Inference with Distributed Serving and Operator Co-Design on AMD Instinct MI355X GPUs July 21, 2026	Scaling MiniMax-M3 Inference with Distributed Serving and Operator Co-Design on AMD Instinct MI355X GPUs July 21, 2026
Kevin Chang	1	From Theory to Kernel: Implement FlashAttention-v2 with CK-Tile May 21, 2025	From Theory to Kernel: Implement FlashAttention-v2 with CK-Tile May 21, 2025
Noah Wolfe	1	Introduction to profiling tools for AMD hardware April 10, 2026	Introduction to profiling tools for AMD hardware April 10, 2026
Pengzhan Zhao	1	Attention Decode on AMD MI450 GPUs: A Gluon Kernel Optimization Guide July 27, 2026	Attention Decode on AMD MI450 GPUs: A Gluon Kernel Optimization Guide July 27, 2026
Jeffrey Byrnes	1	Attention Decode on AMD MI450 GPUs: A Gluon Kernel Optimization Guide July 27, 2026	Attention Decode on AMD MI450 GPUs: A Gluon Kernel Optimization Guide July 27, 2026
Austin Kerbow	1	Attention Decode on AMD MI450 GPUs: A Gluon Kernel Optimization Guide July 27, 2026	Attention Decode on AMD MI450 GPUs: A Gluon Kernel Optimization Guide July 27, 2026
Gilbert Lei	1	Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X November 12, 2025	Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X November 12, 2025
Duyi Wang	1	Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X November 12, 2025	Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X November 12, 2025
Mingzhi Liu	1	Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X November 12, 2025	Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X November 12, 2025
Di Tian	1	Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X November 12, 2025	Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X November 12, 2025
Jun Chen	1	Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X November 12, 2025	Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X November 12, 2025
Yutong Wu	1	Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X November 12, 2025	Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X November 12, 2025
Jiahao Zhou	1	Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X November 12, 2025	Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X November 12, 2025
Niko Ma	1	Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X November 12, 2025	Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X November 12, 2025
Rasmus Larson	1	Onboard and Deploy Custom Models in AMD AI Workbench July 23, 2026	Onboard and Deploy Custom Models in AMD AI Workbench July 23, 2026
Kevin Wong	1	Onboard and Deploy Custom Models in AMD AI Workbench July 23, 2026	Onboard and Deploy Custom Models in AMD AI Workbench July 23, 2026
Clyde Hoang	1	Onboard and Deploy Custom Models in AMD AI Workbench July 23, 2026	Onboard and Deploy Custom Models in AMD AI Workbench July 23, 2026
Yao Fehlis	1	Creating a PyTorch/TensorFlow code environment on AMD GPUs September 11, 2023	Creating a PyTorch/TensorFlow code environment on AMD GPUs September 11, 2023
Lirong Zhang	1	DP Attention and TBO for DeepSeek-V4 on MI355X June 24, 2026	DP Attention and TBO for DeepSeek-V4 on MI355X June 24, 2026
Tanya Roosta	1	Hyperloom - Autonomous Agentic Inference Optimization for AMD GPUs July 23, 2026	Hyperloom - Autonomous Agentic Inference Optimization for AMD GPUs July 23, 2026
Marilyn Basanta	1	Hyperloom - Autonomous Agentic Inference Optimization for AMD GPUs July 23, 2026	Hyperloom - Autonomous Agentic Inference Optimization for AMD GPUs July 23, 2026
Eugene Katsov	1	Efficient GPU Utilization With Workload Pre-Emption in AMD Resource Manager June 26, 2026	Efficient GPU Utilization With Workload Pre-Emption in AMD Resource Manager June 26, 2026
Bill He, Andy Luo	1	Unleashing AMD Instinct™ MI300X GPUs for LLM Serving: Disaggregating Prefill & Decode with SGLang August 28, 2025	Unleashing AMD Instinct™ MI300X GPUs for LLM Serving: Disaggregating Prefill & Decode with SGLang August 28, 2025
Douglas Hamilton	1	ROCm Runfile Installer Is Here! May 22, 2025	ROCm Runfile Installer Is Here! May 22, 2025
Alexandros Theodoridis	1	OpenXLA and JAX - ROCm Support and the State of CI June 29, 2026	OpenXLA and JAX - ROCm Support and the State of CI June 29, 2026
Charles Hofer	1	OpenXLA and JAX - ROCm Support and the State of CI June 29, 2026	OpenXLA and JAX - ROCm Support and the State of CI June 29, 2026
Leonid Drozdov	1	OpenXLA and JAX - ROCm Support and the State of CI June 29, 2026	OpenXLA and JAX - ROCm Support and the State of CI June 29, 2026
Harsha Havanur Shamsundara	1	OpenXLA and JAX - ROCm Support and the State of CI June 29, 2026	OpenXLA and JAX - ROCm Support and the State of CI June 29, 2026
Chao Chen	1	OpenXLA and JAX - ROCm Support and the State of CI June 29, 2026	OpenXLA and JAX - ROCm Support and the State of CI June 29, 2026
Omkar Kakarparthi	1	OpenXLA and JAX - ROCm Support and the State of CI June 29, 2026	OpenXLA and JAX - ROCm Support and the State of CI June 29, 2026
Indira Vats	1	OpenXLA and JAX - ROCm Support and the State of CI June 29, 2026	OpenXLA and JAX - ROCm Support and the State of CI June 29, 2026
Henning Becker	1	OpenXLA and JAX - ROCm Support and the State of CI June 29, 2026	OpenXLA and JAX - ROCm Support and the State of CI June 29, 2026
Kuy Mainwaring	1	OpenXLA and JAX - ROCm Support and the State of CI June 29, 2026	OpenXLA and JAX - ROCm Support and the State of CI June 29, 2026
Michael Hudgins	1	OpenXLA and JAX - ROCm Support and the State of CI June 29, 2026	OpenXLA and JAX - ROCm Support and the State of CI June 29, 2026
Peter Hawkins	1	OpenXLA and JAX - ROCm Support and the State of CI June 29, 2026	OpenXLA and JAX - ROCm Support and the State of CI June 29, 2026
Rajneesh Bhardwaj	1	Deep dive into the MI300 compute and memory partition modes February 09, 2025	Deep dive into the MI300 compute and memory partition modes February 09, 2025
Onil Gunawardana	1	Introducing AMD ROCm™ Infera: Scaling Goodput for Agentic AI with Distributed Inference Orchestration July 23, 2026	Introducing AMD ROCm™ Infera: Scaling Goodput for Agentic AI with Distributed Inference Orchestration July 23, 2026
Jiejing Zhang	1	Introducing AMD ROCm™ Infera: Scaling Goodput for Agentic AI with Distributed Inference Orchestration July 23, 2026	Introducing AMD ROCm™ Infera: Scaling Goodput for Agentic AI with Distributed Inference Orchestration July 23, 2026
Yingxin Hou	1	Introducing AMD ROCm™ Infera: Scaling Goodput for Agentic AI with Distributed Inference Orchestration July 23, 2026	Introducing AMD ROCm™ Infera: Scaling Goodput for Agentic AI with Distributed Inference Orchestration July 23, 2026
Mahdieh Ghazimirsaeed	1	GPU-aware MPI with ROCm June 08, 2023	GPU-aware MPI with ROCm June 08, 2023
Stephen Bates	1	Introducing ROCm™ AMD Infinity Context: A Purpose-Built KV Cache Tier for Distributed Inference July 22, 2026	Introducing ROCm™ AMD Infinity Context: A Purpose-Built KV Cache Tier for Distributed Inference July 22, 2026
Mohammad Mahdi Kamani	1	Beyond Text: Accelerating Multimodal AI Inference with Speculative Decoding on AMD Instinct™ MI300X GPUs April 28, 2025	Beyond Text: Accelerating Multimodal AI Inference with Speculative Decoding on AMD Instinct™ MI300X GPUs April 28, 2025
Parsa Fashi	1	Beyond Text: Accelerating Multimodal AI Inference with Speculative Decoding on AMD Instinct™ MI300X GPUs April 28, 2025	Beyond Text: Accelerating Multimodal AI Inference with Speculative Decoding on AMD Instinct™ MI300X GPUs April 28, 2025
Brian Pickrell	1	Triton Inference Server with vLLM on AMD GPUs January 08, 2025	Triton Inference Server with vLLM on AMD GPUs January 08, 2025
Abby O'Neill	1	Fine-tuning Robotics Vision Language Action Models with AMD ROCm and LeRobot July 14, 2025	Fine-tuning Robotics Vision Language Action Models with AMD ROCm and LeRobot July 14, 2025
Ken O'Brien	1	Fine-tuning Robotics Vision Language Action Models with AMD ROCm and LeRobot July 14, 2025	Fine-tuning Robotics Vision Language Action Models with AMD ROCm and LeRobot July 14, 2025
Hang Yang	1	Programming Tensor Descriptors in Composable Kernel (CK) March 25, 2026	Programming Tensor Descriptors in Composable Kernel (CK) March 25, 2026
Shreyas Atre	1	MXFP6 and MXFP4 Mixed Precision for Accelerating Dense LLMs on AMD Instinct MI355X June 26, 2026	MXFP6 and MXFP4 Mixed Precision for Accelerating Dense LLMs on AMD Instinct MI355X June 26, 2026
Mohammad Abdul Basit	1	A Step-by-Step Walkthrough of Decentralized LLM Training on AMD GPUs December 18, 2025	A Step-by-Step Walkthrough of Decentralized LLM Training on AMD GPUs December 18, 2025
Subrahmanya Pavankumar Dubagunta	1	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026
Puyuan Yang	1	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026
Zheng Yao	1	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026
Manem Chaitanya	1	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026
Johanna Xuyue Yang	1	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026
Yashvardhan Agarwal	1	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026	GEAK V3: Agent-Driven, Repository-Level GPU Kernel Optimization across HIP, Triton, and FlyDSL on AMD GPUs July 20, 2026
Nithya Soundar	1	AMD Instinct™ Network Traffic, Congestion Trends, and Harmonics in Scale-Out Networks for AI Training Clusters July 09, 2026	AMD Instinct™ Network Traffic, Congestion Trends, and Harmonics in Scale-Out Networks for AI Training Clusters July 09, 2026
Rathnakara Malatesha	1	Deploying Serverless AI Inference on AMD GPU Clusters February 25, 2025	Deploying Serverless AI Inference on AMD GPU Clusters February 25, 2025
Bruce Xue	1	Accelerate DeepSeek-R1 Inference: Integrate AITER into SGLang May 16, 2025	Accelerate DeepSeek-R1 Inference: Integrate AITER into SGLang May 16, 2025
Geoffrey C. Martin-Noble	1	DGL in Depth: SE(3)-Transformer on ROCm 7 December 05, 2025	DGL in Depth: SE(3)-Transformer on ROCm 7 December 05, 2025
Tres Popp	1	DGL in Depth: SE(3)-Transformer on ROCm 7 December 05, 2025	DGL in Depth: SE(3)-Transformer on ROCm 7 December 05, 2025
Ivan Tikhonov	1	Building a GPU-Resident YOLO26 Object Detection Pipeline on the AMD Radeon™ AI PRO R9700 GPU July 03, 2026	Building a GPU-Resident YOLO26 Object Detection Pipeline on the AMD Radeon™ AI PRO R9700 GPU July 03, 2026
Aleksandr Suslov	1	Building a GPU-Resident YOLO26 Object Detection Pipeline on the AMD Radeon™ AI PRO R9700 GPU July 03, 2026	Building a GPU-Resident YOLO26 Object Detection Pipeline on the AMD Radeon™ AI PRO R9700 GPU July 03, 2026
Xiao Yu	1	Accelerating Diffusers and xDiT Image Generation with MXFP4 using AMD Quark on AMD Instinct™ MI350 GPUs July 06, 2026	Accelerating Diffusers and xDiT Image Generation with MXFP4 using AMD Quark on AMD Instinct™ MI350 GPUs July 06, 2026
Dong Zhou	1	AMD Hummingbird Image to Video: A Lightweight Feedback-Driven Model for Efficient Image-to-Video Generation August 03, 2025	AMD Hummingbird Image to Video: A Lightweight Feedback-Driven Model for Efficient Image-to-Video Generation August 03, 2025
He Cui, Mengmeng Ge, Dong Li, Emad Barsoum	1	AMD Hummingbird Image to Video: A Lightweight Feedback-Driven Model for Efficient Image-to-Video Generation August 03, 2025	AMD Hummingbird Image to Video: A Lightweight Feedback-Driven Model for Efficient Image-to-Video Generation August 03, 2025
Jithun Nair	1	Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring October 21, 2025	Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring October 21, 2025
Pruthvi Madugundu	1	Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring October 21, 2025	Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring October 21, 2025
Jagadish Krishnamoorthy	1	Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring October 21, 2025	Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring October 21, 2025
Srinivasan Subramanian	1	Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring October 21, 2025	Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring October 21, 2025
Eli Uriegas	1	Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring October 21, 2025	Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring October 21, 2025
Vin Huang	1	Unlocking Sparse Acceleration on AMD GPUs with hipSPARSELt February 17, 2026	Unlocking Sparse Acceleration on AMD GPUs with hipSPARSELt February 17, 2026
Luise Chen	1	Inferencing with Grok-1 on AMD GPUs August 09, 2024	Inferencing with Grok-1 on AMD GPUs August 09, 2024
Lei Shao	1	Inferencing with Grok-1 on AMD GPUs August 09, 2024	Inferencing with Grok-1 on AMD GPUs August 09, 2024
Akash Haridas	1	Nitro-T: Training a Text-to-Image Diffusion Model from Scratch in 1 Day July 09, 2025	Nitro-T: Training a Text-to-Image Diffusion Model from Scratch in 1 Day July 09, 2025
Aleksei Rechinskii	1	Towards Feature Complete Triton Support in JAX-Triton July 08, 2026	Towards Feature Complete Triton Support in JAX-Triton July 08, 2026
Manoj Rao	1	GEAK-Triton v2 Family of AI Agents: Kernel Optimization for AMD Instinct GPUs December 23, 2025	GEAK-Triton v2 Family of AI Agents: Kernel Optimization for AMD Instinct GPUs December 23, 2025
Yuzhen Zhou	1	Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs September 25, 2025	Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs September 25, 2025
Jin Pan	1	Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs September 25, 2025	Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs September 25, 2025
Shizhe Ding	1	Nitro-AR: A Compact AR Transformer for High-Quality Image Generation January 22, 2026	Nitro-AR: A Compact AR Transformer for High-Quality Image Generation January 22, 2026
Chandan Sharma	1	Announcing MONAI 1.0.0 for AMD ROCm: Breakthrough AI Acceleration for Medical Imaging Models on AMD Instinct™ GPUs October 07, 2025	Announcing MONAI 1.0.0 for AMD ROCm: Breakthrough AI Acceleration for Medical Imaging Models on AMD Instinct™ GPUs October 07, 2025
Eliecer Diaz	1	Deploy and Customize AMD Solution Blueprints April 02, 2026	Deploy and Customize AMD Solution Blueprints April 02, 2026
Alex Cann	1	Utilizing AMD Schola and UnrealRoboticsLab with AMD ROCm™ Software to Train a Robotic Arm June 17, 2026	Utilizing AMD Schola and UnrealRoboticsLab with AMD ROCm™ Software to Train a Robotic Arm June 17, 2026
Joe Shajrawi	1	AMD Integrates llm-d on AMD Instinct MI300X Cluster For Distributed LLM Serving May 20, 2025	AMD Integrates llm-d on AMD Instinct MI300X Cluster For Distributed LLM Serving May 20, 2025
Eduardo Alvarez	1	Analyzing the Impact of Tensor Parallelism Configurations on LLM Inference Performance March 14, 2025	Analyzing the Impact of Tensor Parallelism Configurations on LLM Inference Performance March 14, 2025
Nicola Tan	1	AMD Enterprise AI Suite: Open Infrastructure for Production AI November 17, 2025	AMD Enterprise AI Suite: Open Infrastructure for Production AI November 17, 2025
Sebastian Andersson	1	AMD Enterprise AI Suite: Open Infrastructure for Production AI November 17, 2025	AMD Enterprise AI Suite: Open Infrastructure for Production AI November 17, 2025
Brayden Mahdavi	1	AMD Enterprise AI Suite: Open Infrastructure for Production AI November 17, 2025	AMD Enterprise AI Suite: Open Infrastructure for Production AI November 17, 2025
Mathias Lehtinen	1	AMD Enterprise AI Suite: Open Infrastructure for Production AI November 17, 2025	AMD Enterprise AI Suite: Open Infrastructure for Production AI November 17, 2025
Zeping Li	1	Týr-the-Pruner: Search-based Global Structural Pruning for LLMs December 03, 2025	Týr-the-Pruner: Search-based Global Structural Pruning for LLMs December 03, 2025
Ji Liu	1	Týr-the-Pruner: Search-based Global Structural Pruning for LLMs December 03, 2025	Týr-the-Pruner: Search-based Global Structural Pruning for LLMs December 03, 2025
Daniel Zautner	1	Enabling Language-specific Reasoning in Multilingual Models with Reinforcement Learning July 24, 2026	Enabling Language-specific Reasoning in Multilingual Models with Reinforcement Learning July 24, 2026
Adam Hrin	1	Enabling Language-specific Reasoning in Multilingual Models with Reinforcement Learning July 24, 2026	Enabling Language-specific Reasoning in Multilingual Models with Reinforcement Learning July 24, 2026
Maria Barrett	1	Enabling Language-specific Reasoning in Multilingual Models with Reinforcement Learning July 24, 2026	Enabling Language-specific Reasoning in Multilingual Models with Reinforcement Learning July 24, 2026
Sarthak Tandon	1	PyTorch Offline Tuning with TunableOp February 24, 2026	PyTorch Offline Tuning with TunableOp February 24, 2026
Jin Zhou	1	PyTorch Offline Tuning with TunableOp February 24, 2026	PyTorch Offline Tuning with TunableOp February 24, 2026
Takashi Isobe, He Cui, Mengmeng Ge, Dong Zhou, Dong Li, KuanTing Lin, Chandra Yang, Wickey Wang, Emad Barsoum	1	Bridging the Last Mile: Deploying Hummingbird-XT for Efficient Video Generation on AMD Consumer-Grade Platforms January 08, 2026	Bridging the Last Mile: Deploying Hummingbird-XT for Efficient Video Generation on AMD Consumer-Grade Platforms January 08, 2026
Giuseppe Franco	1	Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.0 Submission April 02, 2025	Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.0 Submission April 02, 2025
AMD Quark Team	1	Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.0 Submission April 02, 2025	Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.0 Submission April 02, 2025
Yu Geng	1	Micro-World: First AMD Open-Source World Models for Interactive Video Generation February 05, 2026	Micro-World: First AMD Open-Source World Models for Interactive Video Generation February 05, 2026
Wensong Chan	1	Micro-World: First AMD Open-Source World Models for Interactive Video Generation February 05, 2026	Micro-World: First AMD Open-Source World Models for Interactive Video Generation February 05, 2026
Larry Li	1	Accelerating Large-Scale LLM Inference on AMD Instinct MI350X/MI355X with Eagle3 and AMD Quark July 03, 2026	Accelerating Large-Scale LLM Inference on AMD Instinct MI350X/MI355X with Eagle3 and AMD Quark July 03, 2026
Xikai Meng	1	Accelerating Large-Scale LLM Inference on AMD Instinct MI350X/MI355X with Eagle3 and AMD Quark July 03, 2026	Accelerating Large-Scale LLM Inference on AMD Instinct MI350X/MI355X with Eagle3 and AMD Quark July 03, 2026
Haichen Zhang	1	Accelerating Large-Scale LLM Inference on AMD Instinct MI350X/MI355X with Eagle3 and AMD Quark July 03, 2026	Accelerating Large-Scale LLM Inference on AMD Instinct MI350X/MI355X with Eagle3 and AMD Quark July 03, 2026
Inesh Chakrabarti	1	Productionizing TurboQuant on AMD GPUs for KV-Cache-Bound LLM Inference June 11, 2026	Productionizing TurboQuant on AMD GPUs for KV-Cache-Bound LLM Inference June 11, 2026
David Limpus	1	Productionizing TurboQuant on AMD GPUs for KV-Cache-Bound LLM Inference June 11, 2026	Productionizing TurboQuant on AMD GPUs for KV-Cache-Bound LLM Inference June 11, 2026
Aditi Ghai Rana	1	Productionizing TurboQuant on AMD GPUs for KV-Cache-Bound LLM Inference June 11, 2026	Productionizing TurboQuant on AMD GPUs for KV-Cache-Bound LLM Inference June 11, 2026
Thiago Crepaldi	1	Productionizing TurboQuant on AMD GPUs for KV-Cache-Bound LLM Inference June 11, 2026	Productionizing TurboQuant on AMD GPUs for KV-Cache-Bound LLM Inference June 11, 2026
Lovisa Borthas	1	Solution Blueprints: Accelerating AI Deployment with AMD Enterprise AI February 11, 2026	Solution Blueprints: Accelerating AI Deployment with AMD Enterprise AI February 11, 2026
Rebecca Lee	1	AMD Instinct™ GPUs MLPerf Inference v6.0 Submission April 01, 2026	AMD Instinct™ GPUs MLPerf Inference v6.0 Submission April 01, 2026
Jouni Hartikainen	1	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025
Aarne Talman	1	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025
Reima Karhila	1	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025
Teemu Virolainen	1	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025
Mikko Tukiainen	1	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025
Bishwo Adhikari	1	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025
Xavier Aguilar Fruto	1	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025
Stig-Arne Gronroos	1	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025
Markus Hartikainen	1	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025
Mustafa Khalid Masood	1	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025
Olga Miroshnichenko	1	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025
Tres Popp	1	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025
Tuukka Sarvi	1	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025
Olha Shkaravska	1	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025
Jin Tao	1	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025
Matti Varjokallio	1	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025
Jaakko Vainio	1	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025
Faisal Azhar	1	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025	Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs December 08, 2025
Lei Zhang	1	Resilient Large-Scale Training: Integrating TorchFT with TorchTitan on AMD GPUs February 08, 2026	Resilient Large-Scale Training: Integrating TorchFT with TorchTitan on AMD GPUs February 08, 2026
Xiaofeng Zheng	1	Resilient Large-Scale Training: Integrating TorchFT with TorchTitan on AMD GPUs February 08, 2026	Resilient Large-Scale Training: Integrating TorchFT with TorchTitan on AMD GPUs February 08, 2026
Haishuo Kong	1	Resilient Large-Scale Training: Integrating TorchFT with TorchTitan on AMD GPUs February 08, 2026	Resilient Large-Scale Training: Integrating TorchFT with TorchTitan on AMD GPUs February 08, 2026
Jingxian Wang	1	Resilient Large-Scale Training: Integrating TorchFT with TorchTitan on AMD GPUs February 08, 2026	Resilient Large-Scale Training: Integrating TorchFT with TorchTitan on AMD GPUs February 08, 2026
Sampo Immonen	1	Efficient Hyperparameter Optimization for Autonomous Driving Models with AMD Instinct GPU Partitioning July 08, 2026	Efficient Hyperparameter Optimization for Autonomous Driving Models with AMD Instinct GPU Partitioning July 08, 2026
Muhammad Zain Khawaja	1	Efficient Hyperparameter Optimization for Autonomous Driving Models with AMD Instinct GPU Partitioning July 08, 2026	Efficient Hyperparameter Optimization for Autonomous Driving Models with AMD Instinct GPU Partitioning July 08, 2026
Atanasko Mitrev	1	Efficient Hyperparameter Optimization for Autonomous Driving Models with AMD Instinct GPU Partitioning July 08, 2026	Efficient Hyperparameter Optimization for Autonomous Driving Models with AMD Instinct GPU Partitioning July 08, 2026
Haohui Mai	1	Optimizing FP4 Mixed-Precision Inference with Petit on AMD Instinct MI250 and MI300 GPUs: A Developer’s Perspective October 06, 2025	Optimizing FP4 Mixed-Precision Inference with Petit on AMD Instinct MI250 and MI300 GPUs: A Developer’s Perspective October 06, 2025
AMD Quark Team	1	AMD Instinct™ MI325X GPUs Produce Strong Performance in MLPerf Inference v5.0 April 02, 2025	AMD Instinct™ MI325X GPUs Produce Strong Performance in MLPerf Inference v5.0 April 02, 2025
AMD Brevitas Team	1	AMD Instinct™ MI325X GPUs Produce Strong Performance in MLPerf Inference v5.0 April 02, 2025	AMD Instinct™ MI325X GPUs Produce Strong Performance in MLPerf Inference v5.0 April 02, 2025
AMD Shark Team	1	AMD Instinct™ MI325X GPUs Produce Strong Performance in MLPerf Inference v5.0 April 02, 2025	AMD Instinct™ MI325X GPUs Produce Strong Performance in MLPerf Inference v5.0 April 02, 2025
Tharun Adithya Srikrishnan	1	Streamlining Recommendation Model Training on AMD Instinct™ GPUs March 02, 2026	Streamlining Recommendation Model Training on AMD Instinct™ GPUs March 02, 2026
Mingyu Yang	1	AMD-HybridLM: Towards Extremely Efficient Hybrid Language Models September 17, 2025	AMD-HybridLM: Towards Extremely Efficient Hybrid Language Models September 17, 2025
Yu Zhou	1	hipBLASLt Online GEMM Tuning March 19, 2026	hipBLASLt Online GEMM Tuning March 19, 2026
Zhou Yu	1	hipBLASLt Online GEMM Tuning March 19, 2026	hipBLASLt Online GEMM Tuning March 19, 2026
Junyan Yang	1	Day 0 Developer Guide: hipBLASLt Offline GEMM Tuning Script November 05, 2025	Day 0 Developer Guide: hipBLASLt Offline GEMM Tuning Script November 05, 2025
He Cui	1	Triton-Based Optimization of Video Sparse Attention on ROCm July 13, 2026	Triton-Based Optimization of Video Sparse Attention on ROCm July 13, 2026
Mengmeng Ge	1	Triton-Based Optimization of Video Sparse Attention on ROCm July 13, 2026	Triton-Based Optimization of Video Sparse Attention on ROCm July 13, 2026
Hisham Chowdhury	1	A Practical Guide to Running LLMs on AMD Radeon™ GPUs June 19, 2026	A Practical Guide to Running LLMs on AMD Radeon™ GPUs June 19, 2026
Owen Zhang	1	A Practical Guide to Running LLMs on AMD Radeon™ GPUs June 19, 2026	A Practical Guide to Running LLMs on AMD Radeon™ GPUs June 19, 2026
Mark Granroth-Wilding	1	AMD-Powered 3D Gaussian Splatting for Autonomous Driving Scenes May 07, 2026	AMD-Powered 3D Gaussian Splatting for Autonomous Driving Scenes May 07, 2026
Yayuan Wang	1	Getting Started with ComfyUI on AMD Radeon™ RX 9000 Series GPUs March 09, 2026	Getting Started with ComfyUI on AMD Radeon™ RX 9000 Series GPUs March 09, 2026
Shenrun Zhang	1	Accelerating LLM Inference: Up to 3x Speedup on MI300X with Speculative Decoding March 27, 2025	Accelerating LLM Inference: Up to 3x Speedup on MI300X with Speculative Decoding March 27, 2025
Ju Huang	1	Out-of-the-Box ROLL Support on AMD GPUs: Accelerating Reinforcement Learning at Scale June 01, 2026	Out-of-the-Box ROLL Support on AMD GPUs: Accelerating Reinforcement Learning at Scale June 01, 2026
Rebecca Li	1	Reproducing the AMD MLPerf Inference v6.0 Submission Result April 01, 2026	Reproducing the AMD MLPerf Inference v6.0 Submission Result April 01, 2026
Zhiquan Chen	1	Engineering Qwen-VL for Production: Vision Module Architecture and Optimization Practices March 24, 2026	Engineering Qwen-VL for Production: Vision Module Architecture and Optimization Practices March 24, 2026
Zhao An	1	Engineering Qwen-VL for Production: Vision Module Architecture and Optimization Practices March 24, 2026	Engineering Qwen-VL for Production: Vision Module Architecture and Optimization Practices March 24, 2026
Emelie Wahlstrom	1	Retrieval Augmented Generation (RAG) with vLLM, LangChain and Chroma November 04, 2025	Retrieval Augmented Generation (RAG) with vLLM, LangChain and Chroma November 04, 2025
Georgy Krivoruchko	1	Building a High-Performance Video Inference Pipeline with ROCm Libraries Using C/C++ July 21, 2026	Building a High-Performance Video Inference Pipeline with ROCm Libraries Using C/C++ July 21, 2026
Huanxuan Liao	1	SparK: Query-Aware Unstructured Sparsity with Recoverable KV Cache Channel Pruning January 02, 2026	SparK: Query-Aware Unstructured Sparsity with Recoverable KV Cache Channel Pruning January 02, 2026
Shizhu He	1	SparK: Query-Aware Unstructured Sparsity with Recoverable KV Cache Channel Pruning January 02, 2026	SparK: Query-Aware Unstructured Sparsity with Recoverable KV Cache Channel Pruning January 02, 2026
Jun Zhao	1	SparK: Query-Aware Unstructured Sparsity with Recoverable KV Cache Channel Pruning January 02, 2026	SparK: Query-Aware Unstructured Sparsity with Recoverable KV Cache Channel Pruning January 02, 2026
Kang Liu	1	SparK: Query-Aware Unstructured Sparsity with Recoverable KV Cache Channel Pruning January 02, 2026	SparK: Query-Aware Unstructured Sparsity with Recoverable KV Cache Channel Pruning January 02, 2026
Andy Allred	1	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025
Aravind Kumar Rao Bappanadu	1	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025
Thomas Bergstrom	1	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025
Sander Bijl de Vroe	1	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025
Stanislau Fink	1	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025
Mark van Heeswijk	1	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025
Andrey Ivannikov	1	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025
Shashank Kashyap	1	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025
Miikael Leskinen	1	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025
Hari Nair	1	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025
Mika Ranta	1	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025
Mario Reiser	1	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025
Alex Saliniemi	1	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025
Harry Souris	1	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025
Antti-Ville Suni	1	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025
Robert Talling	1	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025
Mikko Vilenius	1	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025
Bo Zhang	1	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025	AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs November 17, 2025
Mingjie Lu	1	Accelerating Autonomous Driving Model Training on AMD ROCm™ Software December 08, 2025	Accelerating Autonomous Driving Model Training on AMD ROCm™ Software December 08, 2025
Treemann Zheng	1	Accelerating Autonomous Driving Model Training on AMD ROCm™ Software December 08, 2025	Accelerating Autonomous Driving Model Training on AMD ROCm™ Software December 08, 2025
Zicheng Liu	1	LuminaSFT: Generating Synthetic Fine-Tuning Data for Small Language Models February 24, 2026	LuminaSFT: Generating Synthetic Fine-Tuning Data for Small Language Models February 24, 2026
Jassani Adeem	1	Mamba on AMD GPUs with ROCm June 28, 2024	Mamba on AMD GPUs with ROCm June 28, 2024
Arseny Moskvichev	1	Mamba on AMD GPUs with ROCm June 28, 2024	Mamba on AMD GPUs with ROCm June 28, 2024
Hui Liu	1	SGLang: Fast Serving Framework for Large Language and Vision-Language Models on AMD Instinct GPUs November 13, 2024	SGLang: Fast Serving Framework for Large Language and Vision-Language Models on AMD Instinct GPUs November 13, 2024
Yineng Zhang	1	SGLang: Fast Serving Framework for Large Language and Vision-Language Models on AMD Instinct GPUs November 13, 2024	SGLang: Fast Serving Framework for Large Language and Vision-Language Models on AMD Instinct GPUs November 13, 2024
Theresa Shan	1	Edge-to-Cloud Robotics with AMD ROCm: From Data Collection to Real-Time Inference March 23, 2026	Edge-to-Cloud Robotics with AMD ROCm: From Data Collection to Real-Time Inference March 23, 2026
Eda Zhou	1	Edge-to-Cloud Robotics with AMD ROCm: From Data Collection to Real-Time Inference March 23, 2026	Edge-to-Cloud Robotics with AMD ROCm: From Data Collection to Real-Time Inference March 23, 2026
Antti Virtanen	1	Continued Pretraining: A Practical Playbook for Language-Specific LLM Adaptation June 18, 2025	Continued Pretraining: A Practical Playbook for Language-Specific LLM Adaptation June 18, 2025
Jeremy Arnold	1	Benchmarking Machine Learning using ROCm and AMD GPUs: Reproducing Our MLPerf Inference Submission August 28, 2024	Benchmarking Machine Learning using ROCm and AMD GPUs: Reproducing Our MLPerf Inference Submission August 28, 2024
Zhao Lin	1	Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025	Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025
Niels Zhang	1	Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025	Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025
Vinayak Gokhale	1	Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025	Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025
Ethan Lin	1	Customizing Kernels with hipBLASLt TensileLite GEMM Tuning - Advanced User Guide April 06, 2026	Customizing Kernels with hipBLASLt TensileLite GEMM Tuning - Advanced User Guide April 06, 2026
Yosi Hatekar	1	STX-B0T: Real-time AI Robot Assistant Powered by RyzenAI and ROCm October 23, 2025	STX-B0T: Real-time AI Robot Assistant Powered by RyzenAI and ROCm October 23, 2025
Hyunji Kim	1	STX-B0T: Real-time AI Robot Assistant Powered by RyzenAI and ROCm October 23, 2025	STX-B0T: Real-time AI Robot Assistant Powered by RyzenAI and ROCm October 23, 2025
Alex Bogdan	1	STX-B0T: Real-time AI Robot Assistant Powered by RyzenAI and ROCm October 23, 2025	STX-B0T: Real-time AI Robot Assistant Powered by RyzenAI and ROCm October 23, 2025
Haani Ahmed	1	Semantic Fencing of Video Streams Using Embedding Splits from Vision Foundation Models May 15, 2026	Semantic Fencing of Video Streams Using Embedding Splits from Vision Foundation Models May 15, 2026
Matthieu Chan Chee	1	Semantic Fencing of Video Streams Using Embedding Splits from Vision Foundation Models May 15, 2026	Semantic Fencing of Video Streams Using Embedding Splits from Vision Foundation Models May 15, 2026
Max Kiehn	1	Semantic Fencing of Video Streams Using Embedding Splits from Vision Foundation Models May 15, 2026	Semantic Fencing of Video Streams Using Embedding Splits from Vision Foundation Models May 15, 2026
Dai Yan	1	Accelerating Kimi-K2.5 on AMD Instinct™ MI300X: Optimizing Fused MoE with FlyDSL March 24, 2026	Accelerating Kimi-K2.5 on AMD Instinct™ MI300X: Optimizing Fused MoE with FlyDSL March 24, 2026
Yamini Kamisetty	1	Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission September 09, 2025	Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission September 09, 2025
Chelsea Iluno	1	Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission September 09, 2025	Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission September 09, 2025
Joyce Zhang	1	Optimizing LLM Workloads: AMD Instinct MI355X GPUs Drive Competitive Performance December 02, 2025	Optimizing LLM Workloads: AMD Instinct MI355X GPUs Drive Competitive Performance December 02, 2025
Jared Bowden	1	Optimizing LLM Workloads: AMD Instinct MI355X GPUs Drive Competitive Performance December 02, 2025	Optimizing LLM Workloads: AMD Instinct MI355X GPUs Drive Competitive Performance December 02, 2025
Shubin Zhao	1	Optimizing LLM Workloads: AMD Instinct MI355X GPUs Drive Competitive Performance December 02, 2025	Optimizing LLM Workloads: AMD Instinct MI355X GPUs Drive Competitive Performance December 02, 2025
Keith Anderson	1	Optimizing LLM Workloads: AMD Instinct MI355X GPUs Drive Competitive Performance December 02, 2025	Optimizing LLM Workloads: AMD Instinct MI355X GPUs Drive Competitive Performance December 02, 2025
Kajsa Arnold	1	Optimizing LLM Workloads: AMD Instinct MI355X GPUs Drive Competitive Performance December 02, 2025	Optimizing LLM Workloads: AMD Instinct MI355X GPUs Drive Competitive Performance December 02, 2025
Andrew Ma	1	Optimizing LLM Workloads: AMD Instinct MI355X GPUs Drive Competitive Performance December 02, 2025	Optimizing LLM Workloads: AMD Instinct MI355X GPUs Drive Competitive Performance December 02, 2025
Sriranjani Ramasubramanian	1	Optimizing LLM Workloads: AMD Instinct MI355X GPUs Drive Competitive Performance December 02, 2025	Optimizing LLM Workloads: AMD Instinct MI355X GPUs Drive Competitive Performance December 02, 2025
Deval Shah	1	Optimizing LLM Workloads: AMD Instinct MI355X GPUs Drive Competitive Performance December 02, 2025	Optimizing LLM Workloads: AMD Instinct MI355X GPUs Drive Competitive Performance December 02, 2025
Lorri Rao	1	Optimizing LLM Workloads: AMD Instinct MI355X GPUs Drive Competitive Performance December 02, 2025	Optimizing LLM Workloads: AMD Instinct MI355X GPUs Drive Competitive Performance December 02, 2025
Benran Hu	1	Instella-T2I: Open-Source Text-to-Image with 1D Tokenizer and 32× Token Reduction on AMD GPUs July 15, 2025	Instella-T2I: Open-Source Text-to-Image with 1D Tokenizer and 32× Token Reduction on AMD GPUs July 15, 2025
Hongyi Yao	1	Breaking the Accuracy-Speed Barrier: How MXFP4/6 Quantization Revolutionizes Image and Video Generation January 07, 2026	Breaking the Accuracy-Speed Barrier: How MXFP4/6 Quantization Revolutionizes Image and Video Generation January 07, 2026
Han Wang	1	Breaking the Accuracy-Speed Barrier: How MXFP4/6 Quantization Revolutionizes Image and Video Generation January 07, 2026	Breaking the Accuracy-Speed Barrier: How MXFP4/6 Quantization Revolutionizes Image and Video Generation January 07, 2026
Ephrem Wu	1	Breaking the Accuracy-Speed Barrier: How MXFP4/6 Quantization Revolutionizes Image and Video Generation January 07, 2026	Breaking the Accuracy-Speed Barrier: How MXFP4/6 Quantization Revolutionizes Image and Video Generation January 07, 2026
Gerardo del Muro Gonzalez	1	GROMACS on AMD Instinct GPUs: A Complete Build Guide March 24, 2026	GROMACS on AMD Instinct GPUs: A Complete Build Guide March 24, 2026
Sonya Yang	1	Digital Twins on AMD: Building Robotic Simulations Using Edge AI PCs February 09, 2026	Digital Twins on AMD: Building Robotic Simulations Using Edge AI PCs February 09, 2026
Yuhang Song	1	Digital Twins on AMD: Building Robotic Simulations Using Edge AI PCs February 09, 2026	Digital Twins on AMD: Building Robotic Simulations Using Edge AI PCs February 09, 2026
Yuzhou Lu	1	Digital Twins on AMD: Building Robotic Simulations Using Edge AI PCs February 09, 2026	Digital Twins on AMD: Building Robotic Simulations Using Edge AI PCs February 09, 2026
Joshua Lu	1	Digital Twins on AMD: Building Robotic Simulations Using Edge AI PCs February 09, 2026	Digital Twins on AMD: Building Robotic Simulations Using Edge AI PCs February 09, 2026
Marilyn Basanta	1	ROCm 7.0: An AI-Ready Powerhouse for Performance, Efficiency, and Productivity September 16, 2025	ROCm 7.0: An AI-Ready Powerhouse for Performance, Efficiency, and Productivity September 16, 2025
Brian Cornille	1	Introducing AMD's Next-Gen Fortran Compiler November 13, 2024	Introducing AMD's Next-Gen Fortran Compiler November 13, 2024
Michael Klemm	1	Introducing AMD's Next-Gen Fortran Compiler November 13, 2024	Introducing AMD's Next-Gen Fortran Compiler November 13, 2024
Johanna Potyka	1	Introducing AMD's Next-Gen Fortran Compiler November 13, 2024	Introducing AMD's Next-Gen Fortran Compiler November 13, 2024
Ronnie Chatterjee	1	ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software April 11, 2025	ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software April 11, 2025
Amy Wiebe	1	ROCm 7.14: TheRock Goes Production and Expands AMD's AI Software Platform July 15, 2026	ROCm 7.14: TheRock Goes Production and Expands AMD's AI Software Platform July 15, 2026
Martin Huarte	1	Boosting Computational Fluid Dynamics Performance with AMD Instinct™ MI300X January 14, 2025	Boosting Computational Fluid Dynamics Performance with AMD Instinct™ MI300X January 14, 2025
Carmine Zaccagnino	1	Styled Text Image Generation with Eruku on AMD April 24, 2026	Styled Text Image Generation with Eruku on AMD April 24, 2026
Fabio Quattrini	1	Styled Text Image Generation with Eruku on AMD April 24, 2026	Styled Text Image Generation with Eruku on AMD April 24, 2026
Vittorio Pippi	1	Styled Text Image Generation with Eruku on AMD April 24, 2026	Styled Text Image Generation with Eruku on AMD April 24, 2026
Silvia Cascianelli	1	Styled Text Image Generation with Eruku on AMD April 24, 2026	Styled Text Image Generation with Eruku on AMD April 24, 2026
Alessio Tonioni	1	Styled Text Image Generation with Eruku on AMD April 24, 2026	Styled Text Image Generation with Eruku on AMD April 24, 2026
Yazhini Rajesh	1	Using Gradient Boosting Libraries on MI300X for Financial Risk Prediction January 08, 2026	Using Gradient Boosting Libraries on MI300X for Financial Risk Prediction January 08, 2026
Niles Burbank	1	Day 0 Developer Guide: Running the Latest Open Models from OpenAI on AMD AI Hardware August 05, 2025	Day 0 Developer Guide: Running the Latest Open Models from OpenAI on AMD AI Hardware August 05, 2025
Kailash Gogineni, Xun Wang	1	Day 0 Developer Guide: Running the Latest Open Models from OpenAI on AMD AI Hardware August 05, 2025	Day 0 Developer Guide: Running the Latest Open Models from OpenAI on AMD AI Hardware August 05, 2025
Christophe Paquot	1	HIP 7.0 Is Coming: What You Need to Know to Stay Ahead May 28, 2025	HIP 7.0 Is Coming: What You Need to Know to Stay Ahead May 28, 2025
Julia Jiang	1	HIP 7.0 Is Coming: What You Need to Know to Stay Ahead May 28, 2025	HIP 7.0 Is Coming: What You Need to Know to Stay Ahead May 28, 2025
Denny Iriawan	1	HIP 7.0 Is Coming: What You Need to Know to Stay Ahead May 28, 2025	HIP 7.0 Is Coming: What You Need to Know to Stay Ahead May 28, 2025
Paul Hartke	1	Democratizing AI Compute with AMD Using SkyPilot November 13, 2025	Democratizing AI Compute with AMD Using SkyPilot November 13, 2025
Romil Bhardwaj	1	Democratizing AI Compute with AMD Using SkyPilot November 13, 2025	Democratizing AI Compute with AMD Using SkyPilot November 13, 2025
Zongheng Yang	1	Democratizing AI Compute with AMD Using SkyPilot November 13, 2025	Democratizing AI Compute with AMD Using SkyPilot November 13, 2025
Zhanghao Wu	1	Democratizing AI Compute with AMD Using SkyPilot November 13, 2025	Democratizing AI Compute with AMD Using SkyPilot November 13, 2025
Quentin Anthony	1	Training Transformers and Hybrid models on AMD Instinct MI300X Accelerators December 10, 2024	Training Transformers and Hybrid models on AMD Instinct MI300X Accelerators December 10, 2024
Paul Mullowney	1	Sparse matrix vector multiplication - part 1 November 03, 2023	Sparse matrix vector multiplication - part 1 November 03, 2023
