| Eliot Li | 17 | From Ingestion to Inference: RAG Pipelines on AMD GPUs October 02, 2025 | Scale AI applications with Ray April 01, 2024 |
| Fabricio Flores | 16 | From Ingestion to Inference: RAG Pipelines on AMD GPUs October 02, 2025 | Building semantic search with SentenceTransformers on AMD April 04, 2024 |
| Clint Greene | 16 | Enabling FlashInfer on ROCm for Accelerated LLM Serving October 01, 2025 | Accelerating XGBoost with Dask using multiple AMD GPUs January 26, 2024 |
| George Wang | 15 | Kimi-K2-Instruct: Enhanced Out-of-the-Box Performance on AMD Instinct MI355 Series GPUs October 16, 2025 | GEMM Kernel Optimization For AMD GPUs February 06, 2025 |
| Sean Song | 15 | LLM Quantization with Quark on AMD GPUs: Accuracy and Performance Evaluation June 09, 2025 | Fine-tune Llama model with LoRA: Customizing a large language model for question-answering February 01, 2024 |
| Douglas Jia | 14 | Multinode Fine-Tuning of Stable Diffusion XL on AMD GPUs with Hugging Face Accelerate and OCI's Kubernetes Engine (OKE) October 15, 2024 | Efficient image generation with Stable Diffusion models and AITemplate using AMD GPUs January 24, 2024 |
| Andy Luo | 14 | Stability at Scale: AMD’s Full‑Stack Platform for Large‑Model Training November 04, 2025 | Best practices for competitive inference optimization on AMD Instinct™ MI300X GPUs January 29, 2025 |
| Emad Barsoum | 13 | Nitro-E: A 304M Diffusion Transformer Model for High Quality Image Generation October 24, 2025 | Enhancing AI Training with AMD ROCm Software January 31, 2025 |
| Vish Vadlamani | 13 | Announcing MONAI 1.0.0 for AMD ROCm: Breakthrough AI Acceleration for Medical Imaging Models on AMD Instinct™ GPUs October 07, 2025 | Triton Inference Server with vLLM on AMD GPUs January 08, 2025 |
| Phillip Dang | 13 | DBRX Instruct on AMD GPUs July 11, 2024 | Simplifying deep learning: A guide to PyTorch Lightning February 08, 2024 |
| Anshul Gupta | 12 | Continuing the Momentum: Refining ROCm For The Next Wave Of AI and HPC November 05, 2025 | GEMM Kernel Optimization For AMD GPUs February 06, 2025 |
| Phani Vaddadi | 12 | Announcing MONAI 1.0.0 for AMD ROCm: Breakthrough AI Acceleration for Medical Imaging Models on AMD Instinct™ GPUs October 07, 2025 | Efficient MoE training on AMD ROCm: How-to use MegaBlocks on AMD GPUs March 23, 2025 |
| Saad Rahim | 12 | Continuing the Momentum: Refining ROCm For The Next Wave Of AI and HPC November 05, 2025 | ROCm Gets Modular: Meet the Instinct Datacenter GPU Driver April 11, 2025 |
| Vara Lakshmi Bayanagari | 11 | Distributed fine-tuning of MPT-30B using Composer on AMD GPUs January 28, 2025 | Pre-training BERT using Hugging Face & PyTorch on an AMD GPU January 26, 2024 |
| Yao Liu | 11 | From Ingestion to Inference: RAG Pipelines on AMD GPUs October 02, 2025 | Triton Inference Server with vLLM on AMD GPUs January 08, 2025 |
| Gina Sitaraman | 11 | Performance Profiling on AMD GPUs - Part 3: Advanced Usage October 23, 2025 | AMD matrix cores November 14, 2022 |
| Justin Chang | 10 | MI300A - Exploring the APU advantage February 09, 2025 | Finite difference method - Laplacian part 1 November 14, 2022 |
| Meena Arunachalam | 9 | Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025 | Benchmarking Machine Learning using ROCm and AMD GPUs: Reproducing Our MLPerf Inference Submission August 28, 2024 |
| Miro Hodak | 9 | Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025 | Benchmarking Machine Learning using ROCm and AMD GPUs: Reproducing Our MLPerf Inference Submission August 28, 2024 |
| Liz Li | 9 | Stability at Scale: AMD’s Full‑Stack Platform for Large‑Model Training November 04, 2025 | AITER: AI Tensor Engine For ROCm March 21, 2025 |
| Thomas Gibson | 9 | Performance Profiling on AMD GPUs - Part 3: Advanced Usage October 23, 2025 | Finite difference method - Laplacian part 1 November 14, 2022 |
| Zicheng Liu | 8 | Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs September 25, 2025 | Introducing Instella: New State-of-the-art Fully Open 3B Language Models March 05, 2025 |
| Yusheng Su | 7 | Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs September 25, 2025 | Introducing Instella: New State-of-the-art Fully Open 3B Language Models March 05, 2025 |
| Seungrok Jung | 7 | vLLM V1 Meets AMD Instinct GPUs: A New Era for LLM Inference Performance July 07, 2025 | Large language model inference optimizations on AMD GPUs March 15, 2024 |
| Dong Li | 7 | Nitro-E: A 304M Diffusion Transformer Model for High Quality Image Generation October 24, 2025 | GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks August 01, 2025 |
| Ossian O'Reilly | 7 | Seismic stencil codes - part 1 August 29, 2024 | Finite difference method - Laplacian part 1 November 14, 2022 |
| Ximeng Sun | 6 | Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs September 25, 2025 | Introducing Instella: New State-of-the-art Fully Open 3B Language Models March 05, 2025 |
| Ze Wang | 6 | Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs September 25, 2025 | Introducing Instella: New State-of-the-art Fully Open 3B Language Models March 05, 2025 |
| Jiang Liu | 6 | Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs September 25, 2025 | Introducing Instella: New State-of-the-art Fully Open 3B Language Models March 05, 2025 |
| Jialian Wu | 6 | Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs September 25, 2025 | Introducing Instella: New State-of-the-art Fully Open 3B Language Models March 05, 2025 |
| Xiaodong Yu | 6 | Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs September 25, 2025 | Introducing Instella: New State-of-the-art Fully Open 3B Language Models March 05, 2025 |
| Karan Verma | 6 | Slim Down Your Llama: Pruning & Fine-Tuning for Maximum Performance September 09, 2025 | Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.0 Submission April 02, 2025 |
| Marco Grond | 6 | Elevating 3D Scene Rendering with GSplat October 03, 2025 | ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software April 11, 2025 |
| Liam Berry | 6 | Continuing the Momentum: Refining ROCm For The Next Wave Of AI and HPC November 05, 2025 | ROCm Runfile Installer Is Here! May 22, 2025 |
| Gowtham Ramesh | 5 | Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs September 25, 2025 | Introducing Instella: New State-of-the-art Fully Open 3B Language Models March 05, 2025 |
| Peng Sun | 5 | Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring October 21, 2025 | Supercharge DeepSeek-R1 Inference on AMD Instinct MI300X March 21, 2025 |
| Shekhar Pandey | 5 | Day 0 Developer Guide: Running the Latest Open Models from OpenAI on AMD AI Hardware August 05, 2025 | Deploying Google’s Gemma 3 Model with vLLM on AMD Instinct™ MI300X GPUs: A Step-by-Step Guide March 14, 2025 |
| Xuanwu Yin | 5 | Gumiho: A New Paradigm for Speculative Decoding — Earlier Tokens in a Draft Sequence Matter More October 14, 2025 | Introducing AMD EVLM: Efficient Vision-Language Models with Parameter-Space Visual Conditioning August 22, 2025 |
| Sean Miller | 5 | Finite difference method - Laplacian part 4 July 18, 2023 | Finite difference method - Laplacian part 1 November 14, 2022 |
| Rajat Arora | 5 | Jacobi Solver with HIP and OpenMP offloading September 15, 2023 | Finite difference method - Laplacian part 1 November 14, 2022 |
| Matt Elliott | 5 | How to Build a vLLM Container for Inference and Benchmarking February 21, 2025 | Presenting and demonstrating the use of the ROCm Offline Installer Creator, a tool enabling simple deployment of ROCm in disconnected environments in high-security environments and air-gapped networks. September 10, 2024 |
| Alessandro Fanfarillo | 5 | Performance Profiling on AMD GPUs - Part 3: Advanced Usage October 23, 2025 | Register pressure in AMD CDNA™2 GPUs May 17, 2023 |
| Asitav Mishra | 5 | Performance Profiling on AMD GPUs - Part 3: Advanced Usage October 23, 2025 | Jacobi Solver with HIP and OpenMP offloading September 15, 2023 |
| Prakamya Mishra | 4 | Introducing Instella-Math: Fully Open Language Model with Reasoning Capability August 09, 2025 | Introducing Instella: New State-of-the-art Fully Open 3B Language Models March 05, 2025 |
| Sudhanshu Ranjan | 4 | Introducing Instella-Math: Fully Open Language Model with Reasoning Capability August 09, 2025 | Introducing Instella: New State-of-the-art Fully Open 3B Language Models March 05, 2025 |
| Wei-Ting Liao | 4 | Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025 | Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.0 Submission April 02, 2025 |
| Poovaiah Palangappa | 4 | Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025 | AMD Instinct™ MI325X GPUs Produce Strong Performance in MLPerf Inference v5.0 April 02, 2025 |
| Vikas C Sajjan | 4 | Announcing MONAI 1.0.0 for AMD ROCm: Breakthrough AI Acceleration for Medical Imaging Models on AMD Instinct™ GPUs October 07, 2025 | Announcing hipCIM: A Cutting-Edge Solution for Accelerated Multidimensional Image Processing July 18, 2025 |
| Logan Grado | 4 | Accelerating models on ROCm using PyTorch TunableOp July 03, 2024 | Automatic mixed precision in PyTorch using AMD GPUs March 29, 2024 |
| Wei Luo | 4 | Day 0 Developer Guide: hipBLASLt Offline GEMM Tuning Script November 05, 2025 | QuickReduce: Up to 3x Faster All-reduce for vLLM and SGLang August 26, 2025 |
| Spandan Tiwari | 4 | Day 0 Developer Guide: hipBLASLt Offline GEMM Tuning Script November 05, 2025 | QuickReduce: Up to 3x Faster All-reduce for vLLM and SGLang August 26, 2025 |
| Yixing Xu | 4 | Gumiho: A New Paradigm for Speculative Decoding — Earlier Tokens in a Draft Sequence Matter More October 14, 2025 | Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission September 09, 2025 |
| Mohammed Faraaz Mustafa | 4 | ROCm 7.0: An AI-Ready Powerhouse for Performance, Efficiency, and Productivity September 16, 2025 | ROCm Revisited: Getting Started with HIP June 06, 2025 |
| Jayacharan Kolla | 4 | ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software April 11, 2025 | Understanding RCCL Bandwidth and xGMI Performance on AMD Instinct™ MI300X March 02, 2025 |
| Yao Fu | 4 | Stability at Scale: AMD’s Full‑Stack Platform for Large‑Model Training November 04, 2025 | Optimized ROCm Docker for Distributed AI Training March 13, 2025 |
| Ning Zhang | 3 | Step-3 Deployment Simplified: A Day 0 Developer’s Guide on AMD Instinct™ GPUs September 04, 2025 | GEMM Kernel Optimization For AMD GPUs February 06, 2025 |
| Pratik Prabhanjan Brahma | 3 | GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks August 01, 2025 | Introducing Instella: New State-of-the-art Fully Open 3B Language Models March 05, 2025 |
| Ravi Dwivedula | 3 | Reproduce AMD's MLPerf Training v5.0 Submission Result with Instinct™ GPUs June 04, 2025 | High-Throughput BERT-L Pre-Training on AMD Instinct™ GPUs: A Practical Guide June 03, 2025 |
| Su Ann Chong | 3 | Reproduce AMD's MLPerf Training v5.0 Submission Result with Instinct™ GPUs June 04, 2025 | High-Throughput BERT-L Pre-Training on AMD Instinct™ GPUs: A Practical Guide June 03, 2025 |
| Sarthak Arora | 3 | Reproduce AMD's MLPerf Training v5.0 Submission Result with Instinct™ GPUs June 04, 2025 | High-Throughput BERT-L Pre-Training on AMD Instinct™ GPUs: A Practical Guide June 03, 2025 |
| Sathish Sanjeevi | 3 | Reproduce AMD's MLPerf Training v5.0 Submission Result with Instinct™ GPUs June 04, 2025 | High-Throughput BERT-L Pre-Training on AMD Instinct™ GPUs: A Practical Guide June 03, 2025 |
| Victor Robles | 3 | AI Inference Orchestration with Kubernetes on Instinct MI300X, Part 3 March 13, 2025 | AI Inference Orchestration with Kubernetes on Instinct MI300X, Part 1 February 07, 2025 |
| Mahdi Ghodsi | 3 | Day 0 Developer Guide: Running the Latest Open Models from OpenAI on AMD AI Hardware August 05, 2025 | Power Up Qwen 3 with AMD Instinct: A Developer’s Day 0 Quickstart April 28, 2025 |
| Vicky Tsang | 3 | Exploring Use Cases for Scalable AI: Implementing Ray with ROCm Support for Efficient ML Workflows September 10, 2025 | Scale AI applications with Ray April 01, 2024 |
| Alex He | 3 | A Step-by-Step Guide On How To Deploy Llama Stack on AMD Instinct™ GPU April 22, 2025 | Navigating vLLM Inference with ROCm and Kubernetes February 13, 2025 |
| Charles Yang | 3 | Optimizing FP4 Mixed-Precision Inference with Petit on AMD Instinct MI250 and MI300 GPUs: A Developer’s Perspective October 06, 2025 | Vibe Coding Pac-Man Inspired Game with DeepSeek-R1 and AMD Instinct MI300X July 17, 2025 |
| Soumitra Chatterjee | 3 | Announcing MONAI 1.0.0 for AMD ROCm: Breakthrough AI Acceleration for Medical Imaging Models on AMD Instinct™ GPUs October 07, 2025 | Announcing hipCIM: A Cutting-Edge Solution for Accelerated Multidimensional Image Processing July 18, 2025 |
| Anik Chaudhuri | 3 | Announcing MONAI 1.0.0 for AMD ROCm: Breakthrough AI Acceleration for Medical Imaging Models on AMD Instinct™ GPUs October 07, 2025 | Announcing hipCIM: A Cutting-Edge Solution for Accelerated Multidimensional Image Processing July 18, 2025 |
| Dominic Widdows | 3 | ROCm 7.9 Technology Preview: ROCm Core SDK and TheRock Build System October 20, 2025 | Benchmarking Reasoning Models: From Tokens to Answers July 24, 2025 |
| Rui Sampaio | 3 | Medical Imaging on MI300X: Optimized SwinUNETR for Tumor Detection October 07, 2025 | Optimizing Drug Discovery Tools on AMD MI300X Part 1: Molecular Design with REINVENT September 19, 2025 |
| Xinjun Niu | 3 | Day 0 Developer Guide: hipBLASLt Offline GEMM Tuning Script November 05, 2025 | QuickReduce: Up to 3x Faster All-reduce for vLLM and SGLang August 26, 2025 |
| Ashish Sirasao | 3 | Day 0 Developer Guide: hipBLASLt Offline GEMM Tuning Script November 05, 2025 | QuickReduce: Up to 3x Faster All-reduce for vLLM and SGLang August 26, 2025 |
| Mukhil Azhagan Mallaiyan Sathiaseelan | 3 | Enabling FlashInfer on ROCm for Accelerated LLM Serving October 01, 2025 | Graph Neural Networks at Scale: DGL with ROCm on AMD Hardware July 31, 2025 |
| Anuya Welling | 3 | From Ingestion to Inference: RAG Pipelines on AMD GPUs October 02, 2025 | Graph Neural Networks at Scale: DGL with ROCm on AMD Hardware July 31, 2025 |
| Hongxia Yang | 3 | Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring October 21, 2025 | Accelerated LLM Inference on AMD Instinct™ GPUs with vLLM 0.9.x and ROCm June 28, 2025 |
| Ean Garvey | 3 | Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025 | Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.0 Submission April 02, 2025 |
| Kumar Deepak | 3 | Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025 | Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.0 Submission April 02, 2025 |
| Fulu Li | 3 | Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025 | Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission September 09, 2025 |
| Zhe Li | 3 | Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025 | Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission September 09, 2025 |
| Guanchen Li | 3 | Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025 | Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission September 09, 2025 |
| Carson Liao | 3 | Day 0 Developer Guide: hipBLASLt Offline GEMM Tuning Script November 05, 2025 | GEMM Tuning within hipBLASLt - Part 1 September 05, 2025 |
| Farshad Ghodsian | 3 | ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software April 11, 2025 | Announcing the AMD GPU Operator and Metrics Exporter January 29, 2025 |
| Zhenyu Gu | 3 | Stability at Scale: AMD’s Full‑Stack Platform for Large‑Model Training November 04, 2025 | Day 0 Developer Guide: Running the Latest Open Models from OpenAI on AMD AI Hardware August 05, 2025 |
| Suyash Tandon | 3 | MI300A - Exploring the APU advantage February 09, 2025 | Introduction to profiling tools for AMD hardware April 12, 2023 |
| Bob Robey | 3 | Application portability with HIP April 26, 2024 | Affinity part 2 - System topology and controlling affinity April 16, 2024 |
| David Li | 3 | Avoiding LDS Bank Conflicts on AMD GPUs Using CK-Tile Framework July 25, 2025 | Hands-On with CK-Tile: Develop and Run Optimized GEMM on AMD GPUs April 15, 2025 |
| Karthik Kashyap Thatipamula | 3 | Elevating 3D Scene Rendering with GSplat October 03, 2025 | Announcing hipCIM: A Cutting-Edge Solution for Accelerated Multidimensional Image Processing July 18, 2025 |
| Deeksha Goplani | 3 | Elevating 3D Scene Rendering with GSplat October 03, 2025 | Announcing hipCIM: A Cutting-Edge Solution for Accelerated Multidimensional Image Processing July 18, 2025 |
| Ish Kool | 3 | Elevating 3D Scene Rendering with GSplat October 03, 2025 | Announcing hipCIM: A Cutting-Edge Solution for Accelerated Multidimensional Image Processing July 18, 2025 |
| Luka Stanisic | 3 | Performance Profiling on AMD GPUs - Part 3: Advanced Usage October 23, 2025 | Performance Profiling on AMD GPUs – Part 1: Foundations June 26, 2025 |
| Giacomo Capodaglio | 3 | Performance Profiling on AMD GPUs - Part 3: Advanced Usage October 23, 2025 | Performance Profiling on AMD GPUs – Part 1: Foundations June 26, 2025 |
| Vikram Appia | 2 | AMD-HybridLM: Towards Extremely Efficient Hybrid Language Models September 17, 2025 | Beyond Text: Accelerating Multimodal AI Inference with Speculative Decoding on AMD Instinct™ MI300X GPUs April 28, 2025 |
| Hai Xiao | 2 | Supercharge DeepSeek-R1 Inference on AMD Instinct MI300X March 21, 2025 | SGLang: Fast Serving Framework for Large Language and Vision-Language Models on AMD Instinct GPUs November 13, 2024 |
| Sonali Singh | 2 | Accelerating LLM Inference: Up to 3x Speedup on MI300X with Speculative Decoding March 27, 2025 | Deep dive into the MI300 compute and memory partition modes February 09, 2025 |
| Karthik Sangaiah | 2 | Accelerating LLM Inference: Up to 3x Speedup on MI300X with Speculative Decoding March 27, 2025 | Deep dive into the MI300 compute and memory partition modes February 09, 2025 |
| Ryan Swann | 2 | Accelerating LLM Inference: Up to 3x Speedup on MI300X with Speculative Decoding March 27, 2025 | Deep dive into the MI300 compute and memory partition modes February 09, 2025 |
| Ganesh Dasika | 2 | Accelerating LLM Inference: Up to 3x Speedup on MI300X with Speculative Decoding March 27, 2025 | Deep dive into the MI300 compute and memory partition modes February 09, 2025 |
| Tiffany Mintz | 2 | Accelerating Parallel Programming in Python with Taichi Lang on AMD GPUs July 31, 2025 | Triton Inference Server with vLLM on AMD GPUs January 08, 2025 |
| David Björelind | 2 | Medical Imaging on MI300X: Optimized SwinUNETR for Tumor Detection October 07, 2025 | Optimizing Drug Discovery Tools on AMD MI300X Part 1: Molecular Design with REINVENT September 19, 2025 |
| Lin Sun | 2 | From Ingestion to Inference: RAG Pipelines on AMD GPUs October 02, 2025 | Coding Agents on AMD GPUs: Fast LLM Pipelines for Developers September 30, 2025 |
| Johanna Yang | 2 | Accelerating Audio-Driven Video Generation: WAN2.2-S2V on AMD ROCm September 24, 2025 | All-in-One Video Editing with VACE on AMD Instinct GPUs August 19, 2025 |
| Kristoffer Peyron | 2 | Accelerating Audio-Driven Video Generation: WAN2.2-S2V on AMD ROCm September 24, 2025 | All-in-One Video Editing with VACE on AMD Instinct GPUs August 19, 2025 |
| Wei Cai | 2 | Kimi-K2-Instruct: Enhanced Out-of-the-Box Performance on AMD Instinct MI355 Series GPUs October 16, 2025 | Step-Video-T2V Inference with xDiT on AMD Instinct MI300X GPUs May 15, 2025 |
| Fan Wu | 2 | Kimi-K2-Instruct: Enhanced Out-of-the-Box Performance on AMD Instinct MI355 Series GPUs October 16, 2025 | Unleash Full GPU Potential: Overlap Communication and Computation with Triton-Distributed May 06, 2025 |
| Rishi Madduri | 2 | Enabling FlashInfer on ROCm for Accelerated LLM Serving October 01, 2025 | Efficient MoE training on AMD ROCm: How-to use MegaBlocks on AMD GPUs March 23, 2025 |
| Michael Zhang | 2 | SGLang: Fast Serving Framework for Large Language and Vision-Language Models on AMD Instinct GPUs November 13, 2024 | CTranslate2: Efficient Inference with Transformer Models on AMD GPUs October 24, 2024 |
| Haoyang Li | 2 | High-Accuracy MXFP4, MXFP6, and Mixed-Precision Models on AMD GPUs October 29, 2025 | QuickReduce: Up to 3x Faster All-reduce for vLLM and SGLang August 26, 2025 |
| Ke Wang | 2 | High-Accuracy MXFP4, MXFP6, and Mixed-Precision Models on AMD GPUs October 29, 2025 | QuickReduce: Up to 3x Faster All-reduce for vLLM and SGLang August 26, 2025 |
| Hao Chen | 2 | Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs September 25, 2025 | Instella-T2I: Open-Source Text-to-Image with 1D Tokenizer and 32× Token Reduction on AMD GPUs July 15, 2025 |
| Tong Shen | 2 | Nitro-E: A 304M Diffusion Transformer Model for High Quality Image Generation October 24, 2025 | Nitro-T: Training a Text-to-Image Diffusion Model from Scratch in 1 Day July 09, 2025 |
| Jingai Yu | 2 | Nitro-E: A 304M Diffusion Transformer Model for High Quality Image Generation October 24, 2025 | Nitro-T: Training a Text-to-Image Diffusion Model from Scratch in 1 Day July 09, 2025 |
| Sopiko Kurdadze | 2 | A Simple Design for Serving Video Generation Models with Distributed Inference September 24, 2025 | Accelerating FastVideo on AMD GPUs with TeaCache August 19, 2025 |
| Vasumathi Neralla | 2 | Medical Imaging on MI300X: Optimized SwinUNETR for Tumor Detection October 07, 2025 | Optimizing Drug Discovery Tools on AMD MI300X Part 2: 3D Molecular Generation with SemlaFlow October 03, 2025 |
| Albin Toft | 2 | A Simple Design for Serving Video Generation Models with Distributed Inference September 24, 2025 | Running ComfyUI on AMD Instinct August 19, 2025 |
| Uma Kannikanti | 2 | Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025 | Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission September 09, 2025 |
| Neha Mathews | 2 | Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025 | Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission September 09, 2025 |
| Bowen Bao | 2 | High-Accuracy MXFP4, MXFP6, and Mixed-Precision Models on AMD GPUs October 29, 2025 | Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025 |
| Danny Guan | 2 | ROCm 7.0: An AI-Ready Powerhouse for Performance, Efficiency, and Productivity September 16, 2025 | ROCm Gets Modular: Meet the Instinct Datacenter GPU Driver April 11, 2025 |
| Aditya Bhattacharji | 2 | ROCm 7.0: An AI-Ready Powerhouse for Performance, Efficiency, and Productivity September 16, 2025 | ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software April 11, 2025 |
| Maria Ruiz Varela | 2 | Application portability with HIP April 26, 2024 | AMD Instinct™ MI200 GPU memory space overview March 09, 2023 |
| Xiaobo Chen | 2 | An Introduction to Primus-Turbo: A Library for Accelerating Transformer Models on AMD GPUs September 19, 2025 | Primus: A Lightweight, Unified Training Framework for Large Models on AMD GPUs August 22, 2025 |
| Muhammad Osama | 2 | Deep dive into the MI300 compute and memory partition modes February 09, 2025 | Graph analytics on AMD GPUs using Gunrock July 29, 2024 |
| Damon McDougall | 2 | GPU-aware MPI with ROCm June 08, 2023 | AMD matrix cores November 14, 2022 |
| Noel Chalmers | 2 | GPU-aware MPI with ROCm June 08, 2023 | AMD matrix cores November 14, 2022 |
| Ammar Elwazir | 2 | Introducing ROCprofiler SDK - The Latest Toolkit for Performance Profiling March 25, 2025 | Introducing ROCprofiler SDK - The Latest Toolkit for Performance Profiling. March 25, 2025 |
| Haocong Wang | 2 | Avoiding LDS Bank Conflicts on AMD GPUs Using CK-Tile Framework July 25, 2025 | From Theory to Kernel: Implement FlashAttention-v2 with CK-Tile May 21, 2025 |
| Ben Sander | 2 | Measuring Max-Achievable FLOPs – Part 2 February 28, 2025 | Understanding Peak, Max-Achievable & Delivered FLOPs, Part 1 February 14, 2025 |
| Carlus Huang | 2 | Matrix Core Programming on AMD CDNA™3 and CDNA™4 architecture September 30, 2025 | AITER: AI Tensor Engine For ROCm March 21, 2025 |
| YangWen Huang | 2 | GEMM Tuning within hipBLASLt – Part 2 October 09, 2025 | GEMM Tuning within hipBLASLt - Part 1 September 05, 2025 |
| George Markomanolis | 2 | Affinity part 1 - Affinity, placement, and order April 16, 2024 | Affinity part 2 - System topology and controlling affinity April 16, 2024 |
| Chang Liu | 2 | Efficient LLM Serving with MTP: DeepSeek V3 and SGLang on AMD Instinct GPUs September 11, 2025 | Speculative Decoding - Deep Dive March 24, 2025 |
| Justin Chu | 1 | STX-B0T: Real-time AI Robot Assistant Powered by RyzenAI and ROCm October 23, 2025 | STX-B0T: Real-time AI Robot Assistant Powered by RyzenAI and ROCm October 23, 2025 |
| Yosi Hatekar | 1 | STX-B0T: Real-time AI Robot Assistant Powered by RyzenAI and ROCm October 23, 2025 | STX-B0T: Real-time AI Robot Assistant Powered by RyzenAI and ROCm October 23, 2025 |
| Vivian Cheng | 1 | STX-B0T: Real-time AI Robot Assistant Powered by RyzenAI and ROCm October 23, 2025 | STX-B0T: Real-time AI Robot Assistant Powered by RyzenAI and ROCm October 23, 2025 |
| Hyunji Kim | 1 | STX-B0T: Real-time AI Robot Assistant Powered by RyzenAI and ROCm October 23, 2025 | STX-B0T: Real-time AI Robot Assistant Powered by RyzenAI and ROCm October 23, 2025 |
| Alex Bogdan | 1 | STX-B0T: Real-time AI Robot Assistant Powered by RyzenAI and ROCm October 23, 2025 | STX-B0T: Real-time AI Robot Assistant Powered by RyzenAI and ROCm October 23, 2025 |
| Aditya Kumar Singh | 1 | Instella-VL-1B: First AMD Vision Language Model March 07, 2025 | Instella-VL-1B: First AMD Vision Language Model March 07, 2025 |
| Mehdi Rezagholizadeh | 1 | AMD-HybridLM: Towards Extremely Efficient Hybrid Language Models September 17, 2025 | AMD-HybridLM: Towards Extremely Efficient Hybrid Language Models September 17, 2025 |
| Mingyu Yang | 1 | AMD-HybridLM: Towards Extremely Efficient Hybrid Language Models September 17, 2025 | AMD-HybridLM: Towards Extremely Efficient Hybrid Language Models September 17, 2025 |
| Guihong Li | 1 | AMD-HybridLM: Towards Extremely Efficient Hybrid Language Models September 17, 2025 | AMD-HybridLM: Towards Extremely Efficient Hybrid Language Models September 17, 2025 |
| Shenrun Zhang | 1 | Accelerating LLM Inference: Up to 3x Speedup on MI300X with Speculative Decoding March 27, 2025 | Accelerating LLM Inference: Up to 3x Speedup on MI300X with Speculative Decoding March 27, 2025 |
| Bill He | 1 | Power Up Qwen 3 with AMD Instinct: A Developer’s Day 0 Quickstart April 28, 2025 | Power Up Qwen 3 with AMD Instinct: A Developer’s Day 0 Quickstart April 28, 2025 |
| Kenny Roche | 1 | AMD Integrates llm-d on AMD Instinct MI300X Cluster For Distributed LLM Serving May 20, 2025 | AMD Integrates llm-d on AMD Instinct MI300X Cluster For Distributed LLM Serving May 20, 2025 |
| Joe Shajrawi | 1 | AMD Integrates llm-d on AMD Instinct MI300X Cluster For Distributed LLM Serving May 20, 2025 | AMD Integrates llm-d on AMD Instinct MI300X Cluster For Distributed LLM Serving May 20, 2025 |
| AMD Quark Team | 1 | AMD Instinct™ MI325X GPUs Produce Strong Performance in MLPerf Inference v5.0 April 02, 2025 | AMD Instinct™ MI325X GPUs Produce Strong Performance in MLPerf Inference v5.0 April 02, 2025 |
| AMD Brevitas Team | 1 | AMD Instinct™ MI325X GPUs Produce Strong Performance in MLPerf Inference v5.0 April 02, 2025 | AMD Instinct™ MI325X GPUs Produce Strong Performance in MLPerf Inference v5.0 April 02, 2025 |
| AMD Shark Team | 1 | AMD Instinct™ MI325X GPUs Produce Strong Performance in MLPerf Inference v5.0 April 02, 2025 | AMD Instinct™ MI325X GPUs Produce Strong Performance in MLPerf Inference v5.0 April 02, 2025 |
| Haohui Mai | 1 | Optimizing FP4 Mixed-Precision Inference with Petit on AMD Instinct MI250 and MI300 GPUs: A Developer’s Perspective October 06, 2025 | Optimizing FP4 Mixed-Precision Inference with Petit on AMD Instinct MI250 and MI300 GPUs: A Developer’s Perspective October 06, 2025 |
| Chandan Sharma | 1 | Announcing MONAI 1.0.0 for AMD ROCm: Breakthrough AI Acceleration for Medical Imaging Models on AMD Instinct™ GPUs October 07, 2025 | Announcing MONAI 1.0.0 for AMD ROCm: Breakthrough AI Acceleration for Medical Imaging Models on AMD Instinct™ GPUs October 07, 2025 |
| Hui Liu | 1 | SGLang: Fast Serving Framework for Large Language and Vision-Language Models on AMD Instinct GPUs November 13, 2024 | SGLang: Fast Serving Framework for Large Language and Vision-Language Models on AMD Instinct GPUs November 13, 2024 |
| Yineng Zhang | 1 | SGLang: Fast Serving Framework for Large Language and Vision-Language Models on AMD Instinct GPUs November 13, 2024 | SGLang: Fast Serving Framework for Large Language and Vision-Language Models on AMD Instinct GPUs November 13, 2024 |
| Jiangyong Ren | 1 | QuickReduce: Up to 3x Faster All-reduce for vLLM and SGLang August 26, 2025 | QuickReduce: Up to 3x Faster All-reduce for vLLM and SGLang August 26, 2025 |
| Doug Lehr | 1 | QuickReduce: Up to 3x Faster All-reduce for vLLM and SGLang August 26, 2025 | QuickReduce: Up to 3x Faster All-reduce for vLLM and SGLang August 26, 2025 |
| Luise Chen | 1 | Inferencing with Grok-1 on AMD GPUs August 09, 2024 | Inferencing with Grok-1 on AMD GPUs August 09, 2024 |
| Lei Shao | 1 | Inferencing with Grok-1 on AMD GPUs August 09, 2024 | Inferencing with Grok-1 on AMD GPUs August 09, 2024 |
| Yuzhen Zhou | 1 | Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs September 25, 2025 | Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs September 25, 2025 |
| Jin Pan | 1 | Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs September 25, 2025 | Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs September 25, 2025 |
| Dong Zhou | 1 | Nitro-E: A 304M Diffusion Transformer Model for High Quality Image Generation October 24, 2025 | Nitro-E: A 304M Diffusion Transformer Model for High Quality Image Generation October 24, 2025 |
| James E. T. Smith | 1 | DGL in the Real World: Running GNNs on Real Use Cases August 20, 2025 | DGL in the Real World: Running GNNs on Real Use Cases August 20, 2025 |
| Rasmus Larsson | 1 | Retrieval Augmented Generation (RAG) with vLLM, LangChain and Chroma November 04, 2025 | Retrieval Augmented Generation (RAG) with vLLM, LangChain and Chroma November 04, 2025 |
| Emelie Wahlstrom | 1 | Retrieval Augmented Generation (RAG) with vLLM, LangChain and Chroma November 04, 2025 | Retrieval Augmented Generation (RAG) with vLLM, LangChain and Chroma November 04, 2025 |
| Saroosh Shabbir | 1 | Retrieval Augmented Generation (RAG) with vLLM, LangChain and Chroma November 04, 2025 | Retrieval Augmented Generation (RAG) with vLLM, LangChain and Chroma November 04, 2025 |
| Elaine Zosa | 1 | Continued Pretraining: A Practical Playbook for Language-Specific LLM Adaptation June 18, 2025 | Continued Pretraining: A Practical Playbook for Language-Specific LLM Adaptation June 18, 2025 |
| Jouni Luoma | 1 | Continued Pretraining: A Practical Playbook for Language-Specific LLM Adaptation June 18, 2025 | Continued Pretraining: A Practical Playbook for Language-Specific LLM Adaptation June 18, 2025 |
| Kai Hakala | 1 | Continued Pretraining: A Practical Playbook for Language-Specific LLM Adaptation June 18, 2025 | Continued Pretraining: A Practical Playbook for Language-Specific LLM Adaptation June 18, 2025 |
| Antti Virtanen | 1 | Continued Pretraining: A Practical Playbook for Language-Specific LLM Adaptation June 18, 2025 | Continued Pretraining: A Practical Playbook for Language-Specific LLM Adaptation June 18, 2025 |
| Mika Koistinen | 1 | Continued Pretraining: A Practical Playbook for Language-Specific LLM Adaptation June 18, 2025 | Continued Pretraining: A Practical Playbook for Language-Specific LLM Adaptation June 18, 2025 |
| Jonathan Burdge | 1 | Continued Pretraining: A Practical Playbook for Language-Specific LLM Adaptation June 18, 2025 | Continued Pretraining: A Practical Playbook for Language-Specific LLM Adaptation June 18, 2025 |
| Takashi Isobe | 1 | AMD Hummingbird Image to Video: A Lightweight Feedback-Driven Model for Efficient Image-to-Video Generation August 03, 2025 | AMD Hummingbird Image to Video: A Lightweight Feedback-Driven Model for Efficient Image-to-Video Generation August 03, 2025 |
| Dong Zhou | 1 | AMD Hummingbird Image to Video: A Lightweight Feedback-Driven Model for Efficient Image-to-Video Generation August 03, 2025 | AMD Hummingbird Image to Video: A Lightweight Feedback-Driven Model for Efficient Image-to-Video Generation August 03, 2025 |
| He Cui, Mengmeng Ge, Dong Li, Emad Barsoum | 1 | AMD Hummingbird Image to Video: A Lightweight Feedback-Driven Model for Efficient Image-to-Video Generation August 03, 2025 | AMD Hummingbird Image to Video: A Lightweight Feedback-Driven Model for Efficient Image-to-Video Generation August 03, 2025 |
| Jorge Parada | 1 | Scale LLM Inference with Multi-Node Infrastructure May 30, 2025 | Scale LLM Inference with Multi-Node Infrastructure May 30, 2025 |
| Jeremy Arnold | 1 | Benchmarking Machine Learning using ROCm and AMD GPUs: Reproducing Our MLPerf Inference Submission August 28, 2024 | Benchmarking Machine Learning using ROCm and AMD GPUs: Reproducing Our MLPerf Inference Submission August 28, 2024 |
| Jassani Adeem | 1 | Mamba on AMD GPUs with ROCm June 28, 2024 | Mamba on AMD GPUs with ROCm June 28, 2024 |
| Moskvichev Arseny | 1 | Mamba on AMD GPUs with ROCm June 28, 2024 | Mamba on AMD GPUs with ROCm June 28, 2024 |
| Akash Haridas | 1 | Nitro-T: Training a Text-to-Image Diffusion Model from Scratch in 1 Day July 09, 2025 | Nitro-T: Training a Text-to-Image Diffusion Model from Scratch in 1 Day July 09, 2025 |
| Nick Romero | 1 | Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring October 21, 2025 | Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring October 21, 2025 |
| Jeff Daily | 1 | Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring October 21, 2025 | Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring October 21, 2025 |
| Jithun Nair | 1 | Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring October 21, 2025 | Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring October 21, 2025 |
| Pruthvi Madugundu | 1 | Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring October 21, 2025 | Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring October 21, 2025 |
| Jagadish Krishnamoorthy | 1 | Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring October 21, 2025 | Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring October 21, 2025 |
| Srinivasan Subramanian | 1 | Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring October 21, 2025 | Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring October 21, 2025 |
| Eli Uriegas | 1 | Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring October 21, 2025 | Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring October 21, 2025 |
| Giuseppe Franco | 1 | Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.0 Submission April 02, 2025 | Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.0 Submission April 02, 2025 |
| AMD Quark team | 1 | Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.0 Submission April 02, 2025 | Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.0 Submission April 02, 2025 |
| Arttu Niemela | 1 | Wan2.2 Fine-Tuning: Tailoring an Advanced Video Generation Model on a Single GPU August 19, 2025 | Wan2.2 Fine-Tuning: Tailoring an Advanced Video Generation Model on a Single GPU August 19, 2025 |
| Balazs Toth | 1 | Wan2.2 Fine-Tuning: Tailoring an Advanced Video Generation Model on a Single GPU August 19, 2025 | Wan2.2 Fine-Tuning: Tailoring an Advanced Video Generation Model on a Single GPU August 19, 2025 |
| Joaquin Rives Gambin | 1 | Medical Imaging on MI300X: Optimized SwinUNETR for Tumor Detection October 07, 2025 | Medical Imaging on MI300X: Optimized SwinUNETR for Tumor Detection October 07, 2025 |
| Rajesh Poornachandran | 1 | Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025 | Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025 |
| Zhao Lin | 1 | Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025 | Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025 |
| Niels Zhang | 1 | Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025 | Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025 |
| Vinayak Gokhale | 1 | Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025 | Technical Dive into AMD's MLPerf Inference v5.1 Submission September 09, 2025 |
| Han Lin | 1 | Day 0 Developer Guide: hipBLASLt Offline GEMM Tuning Script November 05, 2025 | Day 0 Developer Guide: hipBLASLt Offline GEMM Tuning Script November 05, 2025 |
| Chao Li | 1 | Day 0 Developer Guide: hipBLASLt Offline GEMM Tuning Script November 05, 2025 | Day 0 Developer Guide: hipBLASLt Offline GEMM Tuning Script November 05, 2025 |
| Junyan Yang | 1 | Day 0 Developer Guide: hipBLASLt Offline GEMM Tuning Script November 05, 2025 | Day 0 Developer Guide: hipBLASLt Offline GEMM Tuning Script November 05, 2025 |
| Chunhung Wang | 1 | Day 0 Developer Guide: hipBLASLt Offline GEMM Tuning Script November 05, 2025 | Day 0 Developer Guide: hipBLASLt Offline GEMM Tuning Script November 05, 2025 |
| Zhenhua Liu | 1 | Introducing AMD EVLM: Efficient Vision-Language Models with Parameter-Space Visual Conditioning August 22, 2025 | Introducing AMD EVLM: Efficient Vision-Language Models with Parameter-Space Visual Conditioning August 22, 2025 |
| Bruce Xue | 1 | Accelerate DeepSeek-R1 Inference: Integrate AITER into SGLang May 16, 2025 | Accelerate DeepSeek-R1 Inference: Integrate AITER into SGLang May 16, 2025 |
| Rathnakara Malatesha | 1 | Deploying Serverless AI Inference on AMD GPU Clusters February 25, 2025 | Deploying Serverless AI Inference on AMD GPU Clusters February 25, 2025 |
| Abby O'Neill | 1 | Fine-tuning Robotics Vision Language Action Models with AMD ROCm and LeRobot July 14, 2025 | Fine-tuning Robotics Vision Language Action Models with AMD ROCm and LeRobot July 14, 2025 |
| Sarunas Kalade | 1 | Fine-tuning Robotics Vision Language Action Models with AMD ROCm and LeRobot July 14, 2025 | Fine-tuning Robotics Vision Language Action Models with AMD ROCm and LeRobot July 14, 2025 |
| Ken O'Brien | 1 | Fine-tuning Robotics Vision Language Action Models with AMD ROCm and LeRobot July 14, 2025 | Fine-tuning Robotics Vision Language Action Models with AMD ROCm and LeRobot July 14, 2025 |
| Graham Schelle | 1 | Fine-tuning Robotics Vision Language Action Models with AMD ROCm and LeRobot July 14, 2025 | Fine-tuning Robotics Vision Language Action Models with AMD ROCm and LeRobot July 14, 2025 |
| Chaitanya Manem | 1 | Introducing Instella: New State-of-the-art Fully Open 3B Language Models March 05, 2025 | Introducing Instella: New State-of-the-art Fully Open 3B Language Models March 05, 2025 |
| Luka Tsabadze | 1 | Running SOTA AI-based Weather Forecasting models on AMD Instinct September 18, 2025 | Running SOTA AI-based Weather Forecasting models on AMD Instinct September 18, 2025 |
| Rahul Biswas | 1 | Running SOTA AI-based Weather Forecasting models on AMD Instinct September 18, 2025 | Running SOTA AI-based Weather Forecasting models on AMD Instinct September 18, 2025 |
| Pauli Pihajoki | 1 | Running SOTA AI-based Weather Forecasting models on AMD Instinct September 18, 2025 | Running SOTA AI-based Weather Forecasting models on AMD Instinct September 18, 2025 |
| Daniel Warna | 1 | Running SOTA AI-based Weather Forecasting models on AMD Instinct September 18, 2025 | Running SOTA AI-based Weather Forecasting models on AMD Instinct September 18, 2025 |
| Baiqiang Xia | 1 | Running SOTA AI-based Weather Forecasting models on AMD Instinct September 18, 2025 | Running SOTA AI-based Weather Forecasting models on AMD Instinct September 18, 2025 |
| Ted Themistokleous | 1 | Triton Inference Server with vLLM on AMD GPUs January 08, 2025 | Triton Inference Server with vLLM on AMD GPUs January 08, 2025 |
| Brian Pickrell | 1 | Triton Inference Server with vLLM on AMD GPUs January 08, 2025 | Triton Inference Server with vLLM on AMD GPUs January 08, 2025 |
| Diptorup Deb | 1 | Enabling FlashInfer on ROCm for Accelerated LLM Serving October 01, 2025 | Enabling FlashInfer on ROCm for Accelerated LLM Serving October 01, 2025 |
| Debasis Mandal | 1 | Enabling FlashInfer on ROCm for Accelerated LLM Serving October 01, 2025 | Enabling FlashInfer on ROCm for Accelerated LLM Serving October 01, 2025 |
| Eduardo Alvarez | 1 | Analyzing the Impact of Tensor Parallelism Configurations on LLM Inference Performance March 14, 2025 | Analyzing the Impact of Tensor Parallelism Configurations on LLM Inference Performance March 14, 2025 |
| Yu Wang | 1 | AMD Advances Enterprise AI Through OPEA Integration March 12, 2025 | AMD Advances Enterprise AI Through OPEA Integration March 12, 2025 |
| Yamini Kamisetty | 1 | Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission September 09, 2025 | Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission September 09, 2025 |
| Chelsea Iluno | 1 | Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission September 09, 2025 | Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission September 09, 2025 |
| Benran Hu | 1 | Instella-T2I: Open-Source Text-to-Image with 1D Tokenizer and 32× Token Reduction on AMD GPUs July 15, 2025 | Instella-T2I: Open-Source Text-to-Image with 1D Tokenizer and 32× Token Reduction on AMD GPUs July 15, 2025 |
| Marilyn Basanta | 1 | ROCm 7.0: An AI-Ready Powerhouse for Performance, Efficiency, and Productivity September 16, 2025 | ROCm 7.0: An AI-Ready Powerhouse for Performance, Efficiency, and Productivity September 16, 2025 |
| Ronnie Chatterjee | 1 | ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software April 11, 2025 | ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software April 11, 2025 |
| Christophe Paquot | 1 | HIP 7.0 Is Coming: What You Need to Know to Stay Ahead May 28, 2025 | HIP 7.0 Is Coming: What You Need to Know to Stay Ahead May 28, 2025 |
| Julia Jiang | 1 | HIP 7.0 Is Coming: What You Need to Know to Stay Ahead May 28, 2025 | HIP 7.0 Is Coming: What You Need to Know to Stay Ahead May 28, 2025 |
| Denny Iriawan | 1 | HIP 7.0 Is Coming: What You Need to Know to Stay Ahead May 28, 2025 | HIP 7.0 Is Coming: What You Need to Know to Stay Ahead May 28, 2025 |
| Brian Cornille | 1 | Introducing AMD's Next-Gen Fortran Compiler November 13, 2024 | Introducing AMD's Next-Gen Fortran Compiler November 13, 2024 |
| Michael Klemm | 1 | Introducing AMD's Next-Gen Fortran Compiler November 13, 2024 | Introducing AMD's Next-Gen Fortran Compiler November 13, 2024 |
| Johanna Potyka | 1 | Introducing AMD's Next-Gen Fortran Compiler November 13, 2024 | Introducing AMD's Next-Gen Fortran Compiler November 13, 2024 |
| Martin Huarte | 1 | Boosting Computational Fluid Dynamics Performance with AMD Instinct™ MI300X January 14, 2025 | Boosting Computational Fluid Dynamics Performance with AMD Instinct™ MI300X January 14, 2025 |
| Quentin Anthony | 1 | Training Transformers and Hybrid models on AMD Instinct MI300X Accelerators December 10, 2024 | Training Transformers and Hybrid models on AMD Instinct MI300X Accelerators December 10, 2024 |
| Niles Burbank | 1 | Day 0 Developer Guide: Running the Latest Open Models from OpenAI on AMD AI Hardware August 05, 2025 | Day 0 Developer Guide: Running the Latest Open Models from OpenAI on AMD AI Hardware August 05, 2025 |
| Kailash Gogineni, Xun Wang | 1 | Day 0 Developer Guide: Running the Latest Open Models from OpenAI on AMD AI Hardware August 05, 2025 | Day 0 Developer Guide: Running the Latest Open Models from OpenAI on AMD AI Hardware August 05, 2025 |
| Yanyuan Qin | 1 | Day 0 Developer Guide: Running the Latest Open Models from OpenAI on AMD AI Hardware August 05, 2025 | Day 0 Developer Guide: Running the Latest Open Models from OpenAI on AMD AI Hardware August 05, 2025 |
| Deepan Sekar | 1 | Llama.cpp Meets Instinct: A New Era of Open-Source AI Acceleration September 09, 2025 | Llama.cpp Meets Instinct: A New Era of Open-Source AI Acceleration September 09, 2025 |
| Pei Zhang | 1 | Llama.cpp Meets Instinct: A New Era of Open-Source AI Acceleration September 09, 2025 | Llama.cpp Meets Instinct: A New Era of Open-Source AI Acceleration September 09, 2025 |
| Hyukjoon Lee | 1 | vLLM V1 Meets AMD Instinct GPUs: A New Era for LLM Inference Performance July 07, 2025 | vLLM V1 Meets AMD Instinct GPUs: A New Era for LLM Inference Performance July 07, 2025 |
| Janet Tseng | 1 | ROCm 7.9 Technology Preview: ROCm Core SDK and TheRock Build System October 20, 2025 | ROCm 7.9 Technology Preview: ROCm Core SDK and TheRock Build System October 20, 2025 |
| Scott Todd | 1 | ROCm 7.9 Technology Preview: ROCm Core SDK and TheRock Build System October 20, 2025 | ROCm 7.9 Technology Preview: ROCm Core SDK and TheRock Build System October 20, 2025 |
| Chris Sosa | 1 | ROCm 7.9 Technology Preview: ROCm Core SDK and TheRock Build System October 20, 2025 | ROCm 7.9 Technology Preview: ROCm Core SDK and TheRock Build System October 20, 2025 |
| Wen Xie, Yao Fu | 1 | Primus: A Lightweight, Unified Training Framework for Large Models on AMD GPUs August 22, 2025 | Primus: A Lightweight, Unified Training Framework for Large Models on AMD GPUs August 22, 2025 |
| Xiaoming Peng | 1 | Primus: A Lightweight, Unified Training Framework for Large Models on AMD GPUs August 22, 2025 | Primus: A Lightweight, Unified Training Framework for Large Models on AMD GPUs August 22, 2025 |
| Vidushi Goyal | 1 | Primus: A Lightweight, Unified Training Framework for Large Models on AMD GPUs August 22, 2025 | Primus: A Lightweight, Unified Training Framework for Large Models on AMD GPUs August 22, 2025 |
| Nicholas Curtis | 1 | Register pressure in AMD CDNA™2 GPUs May 17, 2023 | Register pressure in AMD CDNA™2 GPUs May 17, 2023 |
| Lin Zhao | 1 | High-Accuracy MXFP4, MXFP6, and Mixed-Precision Models on AMD GPUs October 29, 2025 | High-Accuracy MXFP4, MXFP6, and Mixed-Precision Models on AMD GPUs October 29, 2025 |
| Felix Marty | 1 | High-Accuracy MXFP4, MXFP6, and Mixed-Precision Models on AMD GPUs October 29, 2025 | High-Accuracy MXFP4, MXFP6, and Mixed-Precision Models on AMD GPUs October 29, 2025 |
| Zhaofeng Zhang | 1 | High-Accuracy MXFP4, MXFP6, and Mixed-Precision Models on AMD GPUs October 29, 2025 | High-Accuracy MXFP4, MXFP6, and Mixed-Precision Models on AMD GPUs October 29, 2025 |
| Rajneesh Bhardwaj | 1 | Deep dive into the MI300 compute and memory partition modes February 09, 2025 | Deep dive into the MI300 compute and memory partition modes February 09, 2025 |
| Anton Smirnov | 1 | Programming AMD GPUs with Julia April 16, 2024 | Programming AMD GPUs with Julia April 16, 2024 |
| Matthias Reso | 1 | Chain-of-Thought Guided Visual Reasoning Using Llama 3.2 on a Single AMD Instinct MI300X GPU July 21, 2025 | Chain-of-Thought Guided Visual Reasoning Using Llama 3.2 on a Single AMD Instinct MI300X GPU July 21, 2025 |
| Mohammad Mahdi Kamani | 1 | Beyond Text: Accelerating Multimodal AI Inference with Speculative Decoding on AMD Instinct™ MI300X GPUs April 28, 2025 | Beyond Text: Accelerating Multimodal AI Inference with Speculative Decoding on AMD Instinct™ MI300X GPUs April 28, 2025 |
| Parsa Fashi | 1 | Beyond Text: Accelerating Multimodal AI Inference with Speculative Decoding on AMD Instinct™ MI300X GPUs April 28, 2025 | Beyond Text: Accelerating Multimodal AI Inference with Speculative Decoding on AMD Instinct™ MI300X GPUs April 28, 2025 |
| David Doscher | 1 | AMD ROCm™ installation January 26, 2023 | AMD ROCm™ installation January 26, 2023 |
| Rene Van Oostrum | 1 | AMD matrix cores November 14, 2022 | AMD matrix cores November 14, 2022 |
| Nicholas Malaya | 1 | AMD matrix cores November 14, 2022 | AMD matrix cores November 14, 2022 |
| Daniel Huang | 1 | AITER-Enabled MLA Layer Inference on AMD Instinct MI300X GPUs August 25, 2025 | AITER-Enabled MLA Layer Inference on AMD Instinct MI300X GPUs August 25, 2025 |
| Yao Fehlis | 1 | Creating a PyTorch/TensorFlow code environment on AMD GPUs September 11, 2023 | Creating a PyTorch/TensorFlow code environment on AMD GPUs September 11, 2023 |
| Warren Eng | 1 | Running ComfyUI in Windows with ROCm on WSL August 07, 2025 | Running ComfyUI in Windows with ROCm on WSL August 07, 2025 |
| Wen Xie | 1 | An Introduction to Primus-Turbo: A Library for Accelerating Transformer Models on AMD GPUs September 19, 2025 | An Introduction to Primus-Turbo: A Library for Accelerating Transformer Models on AMD GPUs September 19, 2025 |
| Clement Lin | 1 | Avoiding LDS Bank Conflicts on AMD GPUs Using CK-Tile Framework July 25, 2025 | Avoiding LDS Bank Conflicts on AMD GPUs Using CK-Tile Framework July 25, 2025 |
| Meng-Hsuan Yang | 1 | Avoiding LDS Bank Conflicts on AMD GPUs Using CK-Tile Framework July 25, 2025 | Avoiding LDS Bank Conflicts on AMD GPUs Using CK-Tile Framework July 25, 2025 |
| Yu-Chen Lin | 1 | Avoiding LDS Bank Conflicts on AMD GPUs Using CK-Tile Framework July 25, 2025 | Avoiding LDS Bank Conflicts on AMD GPUs Using CK-Tile Framework July 25, 2025 |
| Bobo Fang | 1 | Avoiding LDS Bank Conflicts on AMD GPUs Using CK-Tile Framework July 25, 2025 | Avoiding LDS Bank Conflicts on AMD GPUs Using CK-Tile Framework July 25, 2025 |
| Chun-Hung Wang | 1 | Avoiding LDS Bank Conflicts on AMD GPUs Using CK-Tile Framework July 25, 2025 | Avoiding LDS Bank Conflicts on AMD GPUs Using CK-Tile Framework July 25, 2025 |
| Noah Wolfe | 1 | Introduction to profiling tools for AMD hardware April 12, 2023 | Introduction to profiling tools for AMD hardware April 12, 2023 |
| Cheng Ling | 1 | SmoothQuant model inference on AMD Instinct MI300X using Composable Kernel May 31, 2024 | SmoothQuant model inference on AMD Instinct MI300X using Composable Kernel May 31, 2024 |
| Pedram Alizadeh | 1 | Understanding RCCL Bandwidth and xGMI Performance on AMD Instinct™ MI300X March 02, 2025 | Understanding RCCL Bandwidth and xGMI Performance on AMD Instinct™ MI300X March 02, 2025 |
| Gilbert Lee | 1 | Understanding RCCL Bandwidth and xGMI Performance on AMD Instinct™ MI300X March 02, 2025 | Understanding RCCL Bandwidth and xGMI Performance on AMD Instinct™ MI300X March 02, 2025 |
| Douglas Hamilton | 1 | ROCm Runfile Installer Is Here! May 22, 2025 | ROCm Runfile Installer Is Here! May 22, 2025 |
| Lei Zhang | 1 | Unleash Full GPU Potential: Overlap Communication and Computation with Triton-Distributed May 06, 2025 | Unleash Full GPU Potential: Overlap Communication and Computation with Triton-Distributed May 06, 2025 |
| Kyle Wang | 1 | Unleash Full GPU Potential: Overlap Communication and Computation with Triton-Distributed May 06, 2025 | Unleash Full GPU Potential: Overlap Communication and Computation with Triton-Distributed May 06, 2025 |
| Lingpeng Jin | 1 | AITER: AI Tensor Engine For ROCm March 21, 2025 | AITER: AI Tensor Engine For ROCm March 21, 2025 |
| Bill He, Andy Luo | 1 | Unleashing AMD Instinct™ MI300X GPUs for LLM Serving: Disaggregating Prefill & Decode with SGLang August 28, 2025 | Unleashing AMD Instinct™ MI300X GPUs for LLM Serving: Disaggregating Prefill & Decode with SGLang August 28, 2025 |
| Chia Hung | 1 | GEMM Tuning within hipBLASLt – Part 2 October 09, 2025 | GEMM Tuning within hipBLASLt – Part 2 October 09, 2025 |
| Kevin Chang | 1 | From Theory to Kernel: Implement FlashAttention-v2 with CK-Tile May 21, 2025 | From Theory to Kernel: Implement FlashAttention-v2 with CK-Tile May 21, 2025 |
| Garrett Byrd | 1 | Installing ROCm from source with Spack April 14, 2025 | Installing ROCm from source with Spack April 14, 2025 |
| Joseph Schoonover | 1 | Installing ROCm from source with Spack April 14, 2025 | Installing ROCm from source with Spack April 14, 2025 |
| Mark Granroth-Wilding | 1 | Elevating 3D Scene Rendering with GSplat October 03, 2025 | Elevating 3D Scene Rendering with GSplat October 03, 2025 |
| Pier Luigi Dovesi | 1 | Elevating 3D Scene Rendering with GSplat October 03, 2025 | Elevating 3D Scene Rendering with GSplat October 03, 2025 |
| Shaghayegh Roohi | 1 | Elevating 3D Scene Rendering with GSplat October 03, 2025 | Elevating 3D Scene Rendering with GSplat October 03, 2025 |
| Amanzhol Salykov | 1 | Matrix Core Programming on AMD CDNA™3 and CDNA™4 architecture September 30, 2025 | Matrix Core Programming on AMD CDNA™3 and CDNA™4 architecture September 30, 2025 |
| Jinze Li | 1 | Gumiho: A New Paradigm for Speculative Decoding — Earlier Tokens in a Draft Sequence Matter More October 14, 2025 | Gumiho: A New Paradigm for Speculative Decoding — Earlier Tokens in a Draft Sequence Matter More October 14, 2025 |
| Abhishek Patil | 1 | Unlocking GPU-Accelerated Containers with the AMD Container Toolkit July 03, 2025 | Unlocking GPU-Accelerated Containers with the AMD Container Toolkit July 03, 2025 |
| Mahdieh Ghazimirsaeed | 1 | GPU-aware MPI with ROCm June 08, 2023 | GPU-aware MPI with ROCm June 08, 2023 |
| Evan Masters | 1 | Measuring Max-Achievable FLOPs – Part 2 February 28, 2025 | Measuring Max-Achievable FLOPs – Part 2 February 28, 2025 |
| Babak Poursartip | 1 | Measuring Max-Achievable FLOPs – Part 2 February 28, 2025 | Measuring Max-Achievable FLOPs – Part 2 February 28, 2025 |
| Henry Ho | 1 | Measuring Max-Achievable FLOPs – Part 2 February 28, 2025 | Measuring Max-Achievable FLOPs – Part 2 February 28, 2025 |
| Jianghui Wang | 1 | GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks August 01, 2025 | GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks August 01, 2025 |
| Vinay Joshi | 1 | GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks August 01, 2025 | GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks August 01, 2025 |
| Saptarshi Majumder | 1 | GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks August 01, 2025 | GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks August 01, 2025 |
| Chao Xu | 1 | GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks August 01, 2025 | GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks August 01, 2025 |
| Bin Ding | 1 | GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks August 01, 2025 | GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks August 01, 2025 |
| Ziqiong Liu | 1 | GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks August 01, 2025 | GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks August 01, 2025 |
| Chaojun Hou | 1 | Stability at Scale: AMD’s Full‑Stack Platform for Large‑Model Training November 04, 2025 | Stability at Scale: AMD’s Full‑Stack Platform for Large‑Model Training November 04, 2025 |
| Lei Wei | 1 | Stability at Scale: AMD’s Full‑Stack Platform for Large‑Model Training November 04, 2025 | Stability at Scale: AMD’s Full‑Stack Platform for Large‑Model Training November 04, 2025 |
| Alireza Sariaslani | 1 | GPU Partitioning Made Easy: Pack More AI Workloads Using AMD GPU Operator October 01, 2025 | GPU Partitioning Made Easy: Pack More AI Workloads Using AMD GPU Operator October 01, 2025 |
| Zhu Shan | 1 | Fine-Tuning LLMs with GRPO on AMD MI300X: Scalable RLHF with Hugging Face TRL and ROCm June 18, 2025 | Fine-Tuning LLMs with GRPO on AMD MI300X: Scalable RLHF with Hugging Face TRL and ROCm June 18, 2025 |
| Corbin Robeck | 1 | Reading AMD GPU ISA May 13, 2024 | Reading AMD GPU ISA May 13, 2024 |
| Tun Jian Tan | 1 | Accelerated LLM Inference on AMD Instinct™ GPUs with vLLM 0.9.x and ROCm June 28, 2025 | Accelerated LLM Inference on AMD Instinct™ GPUs with vLLM 0.9.x and ROCm June 28, 2025 |
| Pin Siang Tan | 1 | Accelerated LLM Inference on AMD Instinct™ GPUs with vLLM 0.9.x and ROCm June 28, 2025 | Accelerated LLM Inference on AMD Instinct™ GPUs with vLLM 0.9.x and ROCm June 28, 2025 |
| Alex Voicu | 1 | C++17 parallel algorithms and HIPSTDPAR April 18, 2024 | C++17 parallel algorithms and HIPSTDPAR April 18, 2024 |
| Paul Mullowney | 1 | Sparse matrix vector multiplication - part 1 November 03, 2023 | Sparse matrix vector multiplication - part 1 November 03, 2023 |