Posts by AMD Brevitas Team
Posts by AMD Quark Team
Posts by Aarne Talman
08 December 2025 - Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs
Posts by Abby O’Neill
Posts by Abhishek Patil
Posts by Adeem Jassani
18 December 2025 - A Step-by-Step Walkthrough of Decentralized LLM Training on AMD GPUs
Posts by Aditya Bhattacharji
16 September 2025 - ROCm 7.0: An AI-Ready Powerhouse for Performance, Efficiency, and Productivity
11 April 2025 - ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software
Posts by Aditya Kumar Singh
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Akash Haridas
Posts by Akhila Yeruva
Posts by Akshay Viswanathan
Posts by Albin Toft
03 December 2025 - HPC Coding Agent - Part 1: Combining GLM-powered Cline and RAG Using MCP
25 November 2025 - Using Reinforcement Learning to Fix Text in AI-Generated Videos
24 September 2025 - A Simple Design for Serving Video Generation Models with Distributed Inference
19 August 2025 - Running ComfyUI on AMD Instinct
Posts by Alessandro Fanfarillo
23 October 2025 - Performance Profiling on AMD GPUs - Part 3: Advanced Usage
13 August 2025 - Performance Profiling on AMD GPUs – Part 2: Basic Usage
26 June 2025 - Performance Profiling on AMD GPUs – Part 1: Foundations
18 April 2024 - C++17 parallel algorithms and HIPSTDPAR
17 May 2023 - Register pressure in AMD CDNA™2 GPUs
Posts by Alex Bogdan
23 October 2025 - STX-B0T: Real-time AI Robot Assistant Powered by RyzenAI and ROCm
Posts by Alex He
12 March 2025 - AMD Advances Enterprise AI Through OPEA Integration
13 February 2025 - Navigating vLLM Inference with ROCm and Kubernetes
Posts by Alex Saliniemi
Posts by Alex Voicu
18 April 2024 - C++17 parallel algorithms and HIPSTDPAR
Posts by Alexander Aurell
11 February 2026 - Solution Blueprints: Accelerating AI Deployment with AMD Enterprise AI
Posts by Alexander Finn
17 November 2025 - AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs
17 November 2025 - AMD Enterprise AI Suite: Open Infrastructure for Production AI
Posts by Alireza Sariaslani
Posts by Amanzhol Salykov
10 March 2026 - FP8 GEMM Optimization on AMD CDNA™4 Architecture
30 September 2025 - Matrix Core Programming on AMD CDNA™3 and CDNA™4 architecture
Posts by Ammar Elwazir
Posts by Andrew Ma
Posts by Andrey Ivannikov
Posts by Andy Allred
Posts by Andy Luo
10 March 2026 - FP8 GEMM Optimization on AMD CDNA™4 Architecture
21 January 2026 - ROCm Becomes a First-Class Platform in the vLLM Ecosystem
02 January 2026 - Accelerating Multimodal Inference in vLLM: The One-Line Optimization for Large Multimodal Models
16 December 2025 - MoE Training Best Practices on AMD GPUs
12 November 2025 - Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X
04 November 2025 - Stability at Scale: AMD’s Full‑Stack Platform for Large‑Model Training
30 September 2025 - Matrix Core Programming on AMD CDNA™3 and CDNA™4 architecture
19 September 2025 - An Introduction to Primus-Turbo: A Library for Accelerating Transformer Models on AMD GPUs
11 September 2025 - Efficient LLM Serving with MTP: DeepSeek V3 and SGLang on AMD Instinct GPUs
28 August 2025 - Unleashing AMD Instinct™ MI300X GPUs for LLM Serving: Disaggregating Prefill & Decode with SGLang
05 August 2025 - Day 0 Developer Guide: Running the Latest Open Models from OpenAI on AMD AI Hardware
21 March 2025 - Supercharge DeepSeek-R1 Inference on AMD Instinct MI300X
21 February 2025 - Unlock DeepSeek-R1 Inference Performance on AMD Instinct™ MI300X GPU
Posts by Andy Ye
Posts by Angela Wang
Posts by Anik Chaudhuri
Posts by Anshu Raina
Posts by Anshul Gupta
22 January 2026 - ROCm 7.2: Smarter, Faster, and More Scalable for Modern AI Workloads
05 November 2025 - Continuing the Momentum: Refining ROCm For The Next Wave Of AI and HPC
11 September 2025 - Efficient LLM Serving with MTP: DeepSeek V3 and SGLang on AMD Instinct GPUs
05 August 2025 - Day 0 Developer Guide: Running the Latest Open Models from OpenAI on AMD AI Hardware
06 May 2025 - Unleash Full GPU Potential: Overlap Communication and Computation with Triton-Distributed
21 March 2025 - AITER: AI Tensor Engine For ROCm
14 March 2025 - Deploying Google’s Gemma 3 Model with vLLM on AMD Instinct™ MI300X GPUs: A Step-by-Step Guide
13 March 2025 - Optimized ROCm Docker for Distributed AI Training
06 February 2025 - GEMM Kernel Optimization For AMD GPUs
Posts by Anton Smirnov
16 April 2024 - Programming AMD GPUs with Julia
Posts by Antti Virtanen
Posts by Antti-Ville Suni
Posts by Anuya Welling
05 December 2025 - DGL in Depth: SE(3)-Transformer on ROCm 7
02 October 2025 - From Ingestion to Inference: RAG Pipelines on AMD GPUs
20 August 2025 - DGL in the Real World: Running GNNs on Real Use Cases
Posts by Aravind Kumar Rao Bappanadu
Posts by Arttu Niemela
06 March 2026 - HPC Coding Agent - Part 3: MCP Tool for Profiling
25 November 2025 - Using Reinforcement Learning to Fix Text in AI-Generated Videos
Posts by Ashish Sirasao
25 March 2026 - Programming Tensor Descriptors in Composable Kernel (CK)
24 March 2026 - Engineering Qwen-VL for Production: Vision Module Architecture and Optimization Practices
19 March 2026 - hipBLASLt Online GEMM Tuning
17 February 2026 - Advanced MXFP4 Quantization: Combining Fine-Tuned Rotations with SmoothQuant for Near-Lossless Compression
05 November 2025 - Day 0 Developer Guide: hipBLASLt Offline GEMM Tuning Script
29 October 2025 - High-Accuracy MXFP4, MXFP6, and Mixed-Precision Models on AMD GPUs
26 August 2025 - QuickReduce: Up to 3x Faster All-reduce for vLLM and SGLang
Posts by Asitav Mishra
23 October 2025 - Performance Profiling on AMD GPUs - Part 3: Advanced Usage
13 August 2025 - Performance Profiling on AMD GPUs – Part 2: Basic Usage
26 June 2025 - Performance Profiling on AMD GPUs – Part 1: Foundations
13 May 2024 - Reading AMD GPU ISA
15 September 2023 - Jacobi Solver with HIP and OpenMP offloading
Posts by Babak Poursartip
28 February 2025 - Measuring Max-Achievable FLOPs – Part 2
Posts by Baiqiang Xia
10 November 2025 - Training AI Weather Forecasting Models on AMD Instinct
18 September 2025 - Running SOTA AI-based Weather Forecasting models on AMD Instinct
Posts by Balazs Toth
Posts by Ben Sander
28 February 2025 - Measuring Max-Achievable FLOPs – Part 2
14 February 2025 - Understanding Peak, Max-Achievable & Delivered FLOPs, Part 1
Posts by Benran Hu
Posts by Bill He
12 November 2025 - Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X
Posts by Bin Ding
23 December 2025 - GEAK HIP: Expanding GEAK for HIP Code Optimization
08 December 2025 - Accelerating Autonomous Driving Model Training on AMD ROCm™ Software
01 August 2025 - GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks
Posts by Bishwo Adhikari
08 December 2025 - Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs
Posts by Bo Zhang
Posts by Bob Robey
26 April 2024 - Application portability with HIP
16 April 2024 - Affinity part 2 - System topology and controlling affinity
16 April 2024 - Affinity part 1 - Affinity, placement, and order
Posts by Bobo Fang
30 January 2026 - Debugging NaN Results in CK Tile GEMM: A rocgdb Detective Story
Posts by Bowen Bao
17 February 2026 - Advanced MXFP4 Quantization: Combining Fine-Tuned Rotations with SmoothQuant for Near-Lossless Compression
29 October 2025 - High-Accuracy MXFP4, MXFP6, and Mixed-Precision Models on AMD GPUs
09 September 2025 - Technical Dive into AMD’s MLPerf Inference v5.1 Submission
Posts by Brayden Mahdavi
17 November 2025 - AMD Enterprise AI Suite: Open Infrastructure for Production AI
Posts by Brian Cornille
13 November 2024 - Introducing AMD’s Next-Gen Fortran Compiler
Posts by Brian Pickrell
08 January 2025 - Triton Inference Server with vLLM on AMD GPUs
Posts by Bruce Xue
Posts by Carlus Huang
20 February 2026 - FlyDSL: Expert GPU Kernel Development with the Ease of MLIR Python Native DSL on AMD GPUs
17 February 2026 - Adaptive Top-K Selection: Eliminating Performance Cliffs Across All K Values on AMD GPUs
12 November 2025 - Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X
30 September 2025 - Matrix Core Programming on AMD CDNA™3 and CDNA™4 architecture
21 March 2025 - AITER: AI Tensor Engine For ROCm
Posts by Carson Liao
17 February 2026 - Unlocking Sparse Acceleration on AMD GPUs with hipSPARSELt
05 November 2025 - Day 0 Developer Guide: hipBLASLt Offline GEMM Tuning Script
09 October 2025 - GEMM Tuning within hipBLASLt – Part 2
05 September 2025 - GEMM Tuning within hipBLASLt - Part 1
Posts by Chaitanya Manem
Posts by Chandan Sharma
Posts by Chandra Yang
Posts by Chang Liu
11 September 2025 - Efficient LLM Serving with MTP: DeepSeek V3 and SGLang on AMD Instinct GPUs
24 March 2025 - Speculative Decoding - Deep Dive
Posts by Chao Li
19 March 2026 - hipBLASLt Online GEMM Tuning
05 November 2025 - Day 0 Developer Guide: hipBLASLt Offline GEMM Tuning Script
Posts by Chao Xu
23 December 2025 - GEAK HIP: Expanding GEAK for HIP Code Optimization
01 August 2025 - GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks
Posts by Chaojun Hou
23 February 2026 - Primus-Pipeline: A More Flexible and Scalable Pipeline Parallelism Implementation
16 December 2025 - MoE Training Best Practices on AMD GPUs
04 November 2025 - Stability at Scale: AMD’s Full‑Stack Platform for Large‑Model Training
Posts by Charles Boyd
19 February 2026 - Introducing hipThreads: A C++ - Style Concurrency Library for AMD GPUs
Posts by Charles Yang
06 October 2025 - Optimizing FP4 Mixed-Precision Inference with Petit on AMD Instinct MI250 and MI300 GPUs: A Developer’s Perspective
Posts by Chelsea Iluno
09 September 2025 - Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission
Posts by Cheng Ling
Posts by Cheng Yao
23 February 2026 - Primus-Pipeline: A More Flexible and Scalable Pipeline Parallelism Implementation
16 December 2025 - MoE Training Best Practices on AMD GPUs
Posts by Chia Hung
09 October 2025 - GEMM Tuning within hipBLASLt – Part 2
Posts by Chris Sosa
20 October 2025 - ROCm 7.9 Technology Preview: ROCm Core SDK and TheRock Build System
Posts by Christophe Paquot
28 May 2025 - HIP 7.0 Is Coming: What You Need to Know to Stay Ahead
Posts by Chun Fang
08 December 2025 - Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs
Posts by Chunhung Wang
17 February 2026 - Adaptive Top-K Selection: Eliminating Performance Cliffs Across All K Values on AMD GPUs
30 January 2026 - Debugging NaN Results in CK Tile GEMM: A rocgdb Detective Story
05 November 2025 - Day 0 Developer Guide: hipBLASLt Offline GEMM Tuning Script
Posts by Claire Lee
02 March 2026 - Streamlining Recommendation Model Training on AMD Instinct™ GPUs
Posts by Clement Lin
17 February 2026 - Adaptive Top-K Selection: Eliminating Performance Cliffs Across All K Values on AMD GPUs
30 January 2026 - Debugging NaN Results in CK Tile GEMM: A rocgdb Detective Story
Posts by Clint Greene
01 October 2025 - Enabling FlashInfer on ROCm for Accelerated LLM Serving
09 September 2025 - Technical Dive into AMD’s MLPerf Inference v5.1 Submission
09 September 2025 - Slim Down Your Llama: Pruning & Fine-Tuning for Maximum Performance
09 September 2025 - Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission
11 July 2025 - Accelerating Video Generation on ROCm with Unified Sequence Parallelism: A Practical Guide
12 June 2025 - Aligning Mixtral 8x7B with TRL on AMD GPUs
09 October 2024 - Supercharging JAX with Triton Kernels on AMD GPUs
23 September 2024 - Fine-tuning Llama 3 with Axolotl using ROCm on AMD GPUs
19 September 2024 - Inferencing and serving with vLLM on AMD GPUs
19 September 2024 - Enhancing vLLM Inference on AMD GPUs
01 May 2024 - Inferencing with Mixtral 8x22B on AMD GPUs
16 April 2024 - Speech-to-Text on an AMD GPU with Whisper
15 April 2024 - Developing Triton Kernels on AMD GPUs
04 April 2024 - Retrieval Augmented Generation (RAG) using LlamaIndex
26 January 2024 - Accelerating XGBoost with Dask using multiple AMD GPUs
Posts by Corbin Robeck
13 May 2024 - Reading AMD GPU ISA
Posts by Dai Yan
Posts by Daniel Gustafsson
02 April 2026 - Deploy and Customize AMD Solution Blueprints
24 February 2026 - Getting Started with AMD Resource Manager: Efficient Sharing of AMD Instinct™ GPUs for R&D Teams and AI Practitioners
19 December 2025 - Getting Started with AMD AI Workbench: Deploying and Managing AI Workloads
Posts by Daniel Huang
09 March 2026 - Getting Started with ComfyUI on AMD Radeon™ RX 9000 Series GPUs
20 January 2026 - Quickly Developing Powerful Flash Attention Using TileLang on AMD Instinct MI300X GPU
25 August 2025 - AITER-Enabled MLA Layer Inference on AMD Instinct MI300X GPUs
Posts by Daniel Mcintosh
19 February 2026 - Introducing hipThreads: A C++ - Style Concurrency Library for AMD GPUs
Posts by Daniel Velicka
14 November 2022 - AMD matrix cores
Posts by Daniel Warna
10 November 2025 - Training AI Weather Forecasting Models on AMD Instinct
18 September 2025 - Running SOTA AI-based Weather Forecasting models on AMD Instinct
Posts by Danny Guan
16 September 2025 - ROCm 7.0: An AI-Ready Powerhouse for Performance, Efficiency, and Productivity
11 April 2025 - ROCm Gets Modular: Meet the Instinct Datacenter GPU Driver
Posts by David Björelind
24 March 2026 - GROMACS on AMD Instinct GPUs: A Complete Build Guide
13 March 2026 - GROMACS Performance on AMD Instinct MI355X
14 January 2026 - Applying Compute Partitioning for Workloads on MI300X GPUs
07 October 2025 - Medical Imaging on MI300X: Optimized SwinUNETR for Tumor Detection
Posts by David Doscher
26 January 2023 - AMD ROCm™ installation
Posts by David Li
Posts by David Prescott
Posts by David Silverstone
22 January 2026 - LLM Inference Optimization Using AMD GPU Partitioning
Posts by Debasis Mandal
01 October 2025 - Enabling FlashInfer on ROCm for Accelerated LLM Serving
Posts by Deeksha Goplani
03 October 2025 - Elevating 3D Scene Rendering with GSplat
Posts by Deepan Sekar
11 December 2025 - Accelerating llama.cpp on AMD Instinct MI300X
09 September 2025 - Llama.cpp Meets Instinct: A New Era of Open-Source AI Acceleration
Posts by Denny Iriawan
28 May 2025 - HIP 7.0 Is Coming: What You Need to Know to Stay Ahead
Posts by Deval Shah
Posts by Dewei Wang
Posts by Di Tian
12 November 2025 - Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X
Posts by Diptorup Deb
01 October 2025 - Enabling FlashInfer on ROCm for Accelerated LLM Serving
Posts by Dominic Widdows
06 February 2026 - Accelerating Graph Layout with AI and ROCm on AMD GPUs
20 October 2025 - ROCm 7.9 Technology Preview: ROCm Core SDK and TheRock Build System
07 October 2025 - Announcing MONAI 1.0.0 for AMD ROCm: Breakthrough AI Acceleration for Medical Imaging Models on AMD Instinct™ GPUs
24 July 2025 - Benchmarking Reasoning Models: From Tokens to Answers
Posts by Dong Li
22 January 2026 - Nitro-AR: A Compact AR Transformer for High-Quality Image Generation
12 January 2026 - Athena-PRM: Enhancing Multimodal Reasoning with Data-Efficient Process Reward Models
08 January 2026 - Bridging the Last Mile: Deploying Hummingbird-XT for Efficient Video Generation on AMD Consumer-Grade Platforms
07 January 2026 - Breaking the Accuracy-Speed Barrier: How MXFP4/6 Quantization Revolutionizes Image and Video Generation
02 January 2026 - SparK: Query-Aware Unstructured Sparsity with Recoverable KV Cache Channel Pruning
23 December 2025 - GEAK HIP: Expanding GEAK for HIP Code Optimization
08 December 2025 - Accelerating Autonomous Driving Model Training on AMD ROCm™ Software
03 December 2025 - Týr-the-Pruner: Search-based Global Structural Pruning for LLMs
14 October 2025 - Gumiho: A New Paradigm for Speculative Decoding — Earlier Tokens in a Draft Sequence Matter More
09 September 2025 - Technical Dive into AMD’s MLPerf Inference v5.1 Submission
09 September 2025 - Slim Down Your Llama: Pruning & Fine-Tuning for Maximum Performance
09 September 2025 - Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission
22 August 2025 - Introducing AMD EVLM: Efficient Vision-Language Models with Parameter-Space Visual Conditioning
03 August 2025 - AMD Hummingbird Image to Video: A Lightweight Feedback-Driven Model for Efficient Image-to-Video Generation
01 August 2025 - GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks
Posts by Dong Zhou
22 January 2026 - Nitro-AR: A Compact AR Transformer for High-Quality Image Generation
Posts by Doug Lehr
21 January 2026 - ROCm Becomes a First-Class Platform in the vLLM Ecosystem
26 August 2025 - QuickReduce: Up to 3x Faster All-reduce for vLLM and SGLang
Posts by Douglas Hamilton
22 May 2025 - ROCm Runfile Installer Is Here!
Posts by Douglas Jia
15 October 2024 - Multinode Fine-Tuning of Stable Diffusion XL on AMD GPUs with Hugging Face Accelerate and OCI’s Kubernetes Engine (OKE)
03 October 2024 - Leaner LLM Inference with INT8 Quantization on AMD GPUs using PyTorch
06 September 2024 - Optimize GPT Training: Enabling Mixed Precision Training in JAX using ROCm on AMD GPUs
22 July 2024 - Using statistical methods to reliably compare algorithm performance in large generative AI models with JAX Profiler on AMD GPUs
02 July 2024 - A Guide to Implementing and Training Generative Pre-trained Transformers (GPT) in JAX on AMD GPUs
17 April 2024 - Inferencing with AI2’s OLMo model on AMD GPU
11 April 2024 - GPU Unleashed: Training Reinforcement Learning Agents with Stable Baselines3 on an AMD GPU in Gymnasium Environment
23 February 2024 - Efficient image generation with Stable Diffusion models and ONNX Runtime using AMD GPUs
25 January 2024 - LLM distributed supervised fine-tuning with JAX
Posts by Duyi Wang
12 November 2025 - Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X
Posts by Ean Garvey
09 September 2025 - Technical Dive into AMD’s MLPerf Inference v5.1 Submission
09 September 2025 - Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission
Posts by Eda Zhou
Posts by Eduardo Alvarez
Posts by Elaine Zosa
Posts by Eli Uriegas
Posts by Eliecer Diaz Diaz
02 April 2026 - Deploy and Customize AMD Solution Blueprints
Posts by Eliot Li
01 April 2026 - Reproducing the AMD MLPerf Inference v6.0 Submission Result
01 April 2026 - AMD Instinct™ GPUs MLPerf Inference v6.0 Submission
13 February 2026 - Elevate Your LLM Inference: Autoscaling with Ray, ROCm 7.0.0, and SkyPilot
11 December 2025 - Accelerating llama.cpp on AMD Instinct MI300X
05 December 2025 - DGL in Depth: SE(3)-Transformer on ROCm 7
14 November 2025 - Plug-and-Play CuPy on ROCm: Data Analytics Acceleration Made Simple
13 November 2025 - Accelerating Vector Search: hipVS and hipRAFT on AMD
12 November 2025 - Technical Dive into AMD MLPerf Training v5.1 Submission
12 November 2025 - Reproducing AMD MLPerf Training v5.1 Submission Result
02 October 2025 - From Ingestion to Inference: RAG Pipelines on AMD GPUs
09 September 2025 - Technical Dive into AMD’s MLPerf Inference v5.1 Submission
09 September 2025 - Slim Down Your Llama: Pruning & Fine-Tuning for Maximum Performance
09 September 2025 - Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission
09 September 2025 - Llama.cpp Meets Instinct: A New Era of Open-Source AI Acceleration
30 May 2025 - Scale LLM Inference with Multi-Node Infrastructure
08 January 2025 - Triton Inference Server with vLLM on AMD GPUs
28 August 2024 - Benchmarking Machine Learning using ROCm and AMD GPUs: Reproducing Our MLPerf Inference Submission
09 August 2024 - Inferencing with Grok-1 on AMD GPUs
04 April 2024 - Image classification using Vision Transformer with AMD GPUs
01 April 2024 - Scale AI applications with Ray
Posts by Emad Barsoum
20 February 2026 - FlyDSL: Expert GPU Kernel Development with the Ease of MLIR Python Native DSL on AMD GPUs
22 January 2026 - Nitro-AR: A Compact AR Transformer for High-Quality Image Generation
12 January 2026 - Athena-PRM: Enhancing Multimodal Reasoning with Data-Efficient Process Reward Models
08 January 2026 - Bridging the Last Mile: Deploying Hummingbird-XT for Efficient Video Generation on AMD Consumer-Grade Platforms
07 January 2026 - Breaking the Accuracy-Speed Barrier: How MXFP4/6 Quantization Revolutionizes Image and Video Generation
02 January 2026 - SparK: Query-Aware Unstructured Sparsity with Recoverable KV Cache Channel Pruning
23 December 2025 - GEAK HIP: Expanding GEAK for HIP Code Optimization
08 December 2025 - Accelerating Autonomous Driving Model Training on AMD ROCm™ Software
06 December 2025 - Building a State-of-the-Art 32 Billion Reasoning Model with Only Synthetic Data on AMD GPUs
03 December 2025 - Týr-the-Pruner: Search-based Global Structural Pruning for LLMs
21 November 2025 - LuminaSFT: Generating Synthetic Fine-Tuning Data for Small Language Models
14 October 2025 - Gumiho: A New Paradigm for Speculative Decoding — Earlier Tokens in a Draft Sequence Matter More
25 September 2025 - Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs
17 September 2025 - AMD-HybridLM: Towards Extremely Efficient Hybrid Language Models
22 August 2025 - Introducing AMD EVLM: Efficient Vision-Language Models with Parameter-Space Visual Conditioning
03 August 2025 - AMD Hummingbird Image to Video: A Lightweight Feedback-Driven Model for Efficient Image-to-Video Generation
01 August 2025 - GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks
15 July 2025 - Instella-T2I: Open-Source Text-to-Image with 1D Tokenizer and 32× Token Reduction on AMD GPUs
28 April 2025 - Beyond Text: Accelerating Multimodal AI Inference with Speculative Decoding on AMD Instinct™ MI300X GPUs
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
31 January 2025 - Enhancing AI Training with AMD ROCm Software
Posts by Emelie Wahlstrom
04 November 2025 - Retrieval Augmented Generation (RAG) with vLLM, LangChain and Chroma
Posts by Ephrem Wu
Posts by Ethan Yang
23 December 2025 - GEAK HIP: Expanding GEAK for HIP Code Optimization
Posts by Evan Masters
28 February 2025 - Measuring Max-Achievable FLOPs – Part 2
Posts by Eveline Chen
Posts by Fabricio Flores
13 November 2025 - Accelerating Vector Search: hipVS and hipRAFT on AMD
02 October 2025 - From Ingestion to Inference: RAG Pipelines on AMD GPUs
06 May 2025 - CuPy and hipDF on AMD: The Basics and Beyond
19 February 2025 - Fine-tuning Phi-3.5-mini LLM at scale: Harnessing Accelerate and Slurm for multinode training
08 January 2025 - Triton Inference Server with vLLM on AMD GPUs
24 October 2024 - Torchtune on AMD GPUs How-To Guide: Fine-tuning and Scaling LLMs with Multi-GPU Power
29 July 2024 - Optimizing RoBERTa: Fine-Tuning with Mixed Precision on AMD
01 May 2024 - Step-by-Step Guide to Use OpenLLM on AMD GPUs
04 April 2024 - Building semantic search with SentenceTransformers on AMD
Posts by Faisal Azhar
08 December 2025 - Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs
Posts by Fan Wang
08 December 2025 - Accelerating Autonomous Driving Model Training on AMD ROCm™ Software
Posts by Fan Wu
Posts by Farshad Ghodsian
11 April 2025 - ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software
28 March 2025 - What’s New in the AMD GPU Operator v1.2.0 Release
29 January 2025 - Announcing the AMD GPU Operator and Metrics Exporter
Posts by Felix Li
Posts by Felix Marty
25 March 2026 - Programming Tensor Descriptors in Composable Kernel (CK)
17 February 2026 - Advanced MXFP4 Quantization: Combining Fine-Tuned Rotations with SmoothQuant for Near-Lossless Compression
29 October 2025 - High-Accuracy MXFP4, MXFP6, and Mixed-Precision Models on AMD GPUs
Posts by Frank Wang
Posts by Fulu Li
01 April 2026 - AMD Instinct™ GPUs MLPerf Inference v6.0 Submission
09 September 2025 - Technical Dive into AMD’s MLPerf Inference v5.1 Submission
09 September 2025 - Slim Down Your Llama: Pruning & Fine-Tuning for Maximum Performance
09 September 2025 - Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission
Posts by Fuwei Yang
08 December 2025 - Accelerating Autonomous Driving Model Training on AMD ROCm™ Software
Posts by Ganesh Dasika
09 February 2025 - Deep dive into the MI300 compute and memory partition modes
Posts by Garrett Byrd
14 April 2025 - Installing ROCm from source with Spack
Posts by Gene Su
02 March 2026 - Streamlining Recommendation Model Training on AMD Instinct™ GPUs
Posts by Geoffrey C. Martin-Noble
05 December 2025 - DGL in Depth: SE(3)-Transformer on ROCm 7
Posts by George Markomanolis
16 April 2024 - Affinity part 2 - System topology and controlling affinity
16 April 2024 - Affinity part 1 - Affinity, placement, and order
Posts by George Wang
09 March 2026 - Getting Started with ComfyUI on AMD Radeon™ RX 9000 Series GPUs
20 January 2026 - Quickly Developing Powerful Flash Attention Using TileLang on AMD Instinct MI300X GPU
16 October 2025 - Kimi-K2-Instruct: Enhanced Out-of-the-Box Performance on AMD Instinct MI355 Series GPUs
06 October 2025 - Optimizing FP4 Mixed-Precision Inference with Petit on AMD Instinct MI250 and MI300 GPUs: A Developer’s Perspective
04 September 2025 - Step-3 Deployment Simplified: A Day 0 Developer’s Guide on AMD Instinct™ GPUs
25 August 2025 - AITER-Enabled MLA Layer Inference on AMD Instinct MI300X GPUs
05 August 2025 - Day 0 Developer Guide: Running the Latest Open Models from OpenAI on AMD AI Hardware
18 June 2025 - Fine-Tuning LLMs with GRPO on AMD MI300X: Scalable RLHF with Hugging Face TRL and ROCm
06 May 2025 - Unleash Full GPU Potential: Overlap Communication and Computation with Triton-Distributed
06 February 2025 - GEMM Kernel Optimization For AMD GPUs
Posts by Gerardo del Muro Gonzalez
24 March 2026 - GROMACS on AMD Instinct GPUs: A Complete Build Guide
Posts by Giacomo Capodaglio
23 October 2025 - Performance Profiling on AMD GPUs - Part 3: Advanced Usage
13 August 2025 - Performance Profiling on AMD GPUs – Part 2: Basic Usage
26 June 2025 - Performance Profiling on AMD GPUs – Part 1: Foundations
Posts by Gilbert Lee
Posts by Gilbert Lei
12 November 2025 - Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X
Posts by Gina Sitaraman
23 October 2025 - Performance Profiling on AMD GPUs - Part 3: Advanced Usage
13 August 2025 - Performance Profiling on AMD GPUs – Part 2: Basic Usage
26 June 2025 - Performance Profiling on AMD GPUs – Part 1: Foundations
26 April 2024 - Application portability with HIP
16 April 2024 - Affinity part 2 - System topology and controlling affinity
16 April 2024 - Affinity part 1 - Affinity, placement, and order
12 April 2023 - Introduction to profiling tools for AMD hardware
09 March 2023 - AMD Instinct™ MI200 GPU memory space overview
14 November 2022 - AMD matrix cores
Posts by Giuseppe Franco
Posts by Gowtham Ramesh
21 November 2025 - LuminaSFT: Generating Synthetic Fine-Tuning Data for Small Language Models
25 September 2025 - Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Graham Schelle
09 February 2026 - Building Robotics Applications with Ryzen AI and ROS 2
Posts by Grant Pinkert
14 November 2025 - Plug-and-Play CuPy on ROCm: Data Analytics Acceleration Made Simple
Posts by Gregory Shtrasberg
21 January 2026 - ROCm Becomes a First-Class Platform in the vLLM Ecosystem
Posts by Guanchen Li
02 January 2026 - SparK: Query-Aware Unstructured Sparsity with Recoverable KV Cache Channel Pruning
03 December 2025 - Týr-the-Pruner: Search-based Global Structural Pruning for LLMs
09 September 2025 - Technical Dive into AMD’s MLPerf Inference v5.1 Submission
09 September 2025 - Slim Down Your Llama: Pruning & Fine-Tuning for Maximum Performance
09 September 2025 - Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission
Posts by Guihong Li
17 September 2025 - AMD-HybridLM: Towards Extremely Efficient Hybrid Language Models
Posts by Gulsum Gudukbay Akbulut
06 January 2026 - ROCm MaxText Testing — Decoupled (Offline) and Cloud-Integrated Modes
06 January 2026 - ROCm Fork of MaxText: Structure and Strategy
Posts by Hai Xiao
21 March 2025 - Supercharge DeepSeek-R1 Inference on AMD Instinct MI300X
Posts by Haishuo Kong
Posts by Han Lin
19 March 2026 - hipBLASLt Online GEMM Tuning
05 November 2025 - Day 0 Developer Guide: hipBLASLt Offline GEMM Tuning Script
Posts by Han Wang
Posts by Hang Yang
25 March 2026 - Programming Tensor Descriptors in Composable Kernel (CK)
Posts by Hao Chen
25 September 2025 - Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs
Posts by Haocong Wang
Posts by Haohui Mai
Posts by Haoyang Li
29 October 2025 - High-Accuracy MXFP4, MXFP6, and Mixed-Precision Models on AMD GPUs
26 August 2025 - QuickReduce: Up to 3x Faster All-reduce for vLLM and SGLang
Posts by Hari Nair
Posts by Harry Souris
Posts by He Cui
Posts by Henry Ho
28 February 2025 - Measuring Max-Achievable FLOPs – Part 2
Posts by HongTao Meng
02 March 2026 - Streamlining Recommendation Model Training on AMD Instinct™ GPUs
Posts by Hongxia Yang
24 February 2026 - PyTorch Offline Tuning with TunableOp
20 February 2026 - FlyDSL: Expert GPU Kernel Development with the Ease of MLIR Python Native DSL on AMD GPUs
21 January 2026 - ROCm Becomes a First-Class Platform in the vLLM Ecosystem
02 January 2026 - Accelerating Multimodal Inference in vLLM: The One-Line Optimization for Large Multimodal Models
21 October 2025 - Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring
Posts by Hongyi Yao
Posts by Huanxuan Liao
Posts by Huasha Zhao
Posts by Hui Liu
Posts by Hyukjoon Lee
Posts by Hyunji Kim
23 October 2025 - STX-B0T: Real-time AI Robot Assistant Powered by RyzenAI and ROCm
Posts by Ish Kool
03 October 2025 - Elevating 3D Scene Rendering with GSplat
Posts by Jaakko Vainio
08 December 2025 - Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs
Posts by Jagadish Krishnamoorthy
Posts by James E. T. Smith
05 December 2025 - DGL in Depth: SE(3)-Transformer on ROCm 7
20 August 2025 - DGL in the Real World: Running GNNs on Real Use Cases
Posts by Janet Tseng
20 October 2025 - ROCm 7.9 Technology Preview: ROCm Core SDK and TheRock Build System
Posts by Jared Bowden
Posts by Jarkko Lehtiranta
18 March 2026 - Multi-Node Distributed Inference for Diffusion Models with xDiT
Posts by Jassani Adeem
28 June 2024 - Mamba on AMD GPUs with ROCm
Posts by Jayacharan Kolla
11 April 2025 - ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software
Posts by Jeff Daily
24 February 2026 - PyTorch Offline Tuning with TunableOp
Posts by Jehandad Khan
24 February 2026 - JAX-AITER: Bringing AMD’s Optimized AI Kernels to JAX on ROCm™
06 January 2026 - ROCm MaxText Testing — Decoupled (Offline) and Cloud-Integrated Modes
06 January 2026 - ROCm Fork of MaxText: Structure and Strategy
Posts by Jeremy Arnold
Posts by Jesus Carabano Bravo
01 April 2026 - Reproducing the AMD MLPerf Inference v6.0 Submission Result
01 April 2026 - AMD Instinct™ GPUs MLPerf Inference v6.0 Submission
Posts by Ji Liu
03 December 2025 - Týr-the-Pruner: Search-based Global Structural Pruning for LLMs
Posts by Jiahao Zhou
12 November 2025 - Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X
Posts by Jiahui Cao
10 March 2026 - FP8 GEMM Optimization on AMD CDNA™4 Architecture
Posts by Jialian Wu
21 November 2025 - LuminaSFT: Generating Synthetic Fine-Tuning Data for Small Language Models
25 September 2025 - Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs
15 July 2025 - Instella-T2I: Open-Source Text-to-Image with 1D Tokenizer and 32× Token Reduction on AMD GPUs
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Jiang Liu
21 November 2025 - LuminaSFT: Generating Synthetic Fine-Tuning Data for Small Language Models
25 September 2025 - Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs
15 July 2025 - Instella-T2I: Open-Source Text-to-Image with 1D Tokenizer and 32× Token Reduction on AMD GPUs
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Jianghui Wang
23 December 2025 - GEAK HIP: Expanding GEAK for HIP Code Optimization
01 August 2025 - GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks
Posts by Jiangyong Ren
17 February 2026 - Advanced MXFP4 Quantization: Combining Fine-Tuned Rotations with SmoothQuant for Near-Lossless Compression
26 August 2025 - QuickReduce: Up to 3x Faster All-reduce for vLLM and SGLang
Posts by Jin Pan
25 September 2025 - Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs
Posts by Jin Tao
08 December 2025 - Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs
Posts by Jin Zhou
24 February 2026 - PyTorch Offline Tuning with TunableOp
Posts by Jingai Yu
22 January 2026 - Nitro-AR: A Compact AR Transformer for High-Quality Image Generation
Posts by Jingxian Wang
Posts by Jinze Li
Posts by Jithun Nair
Posts by Joaquin Rives Gambin
10 December 2025 - Medical Imaging on MI300X: SwinUNETR Inference Optimization
07 October 2025 - Medical Imaging on MI300X: Optimized SwinUNETR for Tumor Detection
Posts by Joe Shajrawi
Posts by Johanna Malinen
18 March 2026 - Multi-Node Distributed Inference for Diffusion Models with xDiT
Posts by Johanna Potyka
13 November 2024 - Introducing AMD’s Next-Gen Fortran Compiler
Posts by Johanna Yang
03 December 2025 - HPC Coding Agent - Part 1: Combining GLM-powered Cline and RAG Using MCP
27 November 2025 - Exploring Gameplay Video Generation with Hunyuan-GameCraft
21 November 2025 - Inference with HunyuanWorld-Voyager on AMD Instinct GPUs
24 September 2025 - Accelerating Audio-Driven Video Generation: WAN2.2-S2V on AMD ROCm
19 August 2025 - All-in-One Video Editing with VACE on AMD Instinct GPUs
Posts by Jonathan Burdge
Posts by Jorge Parada
30 May 2025 - Scale LLM Inference with Multi-Node Infrastructure
Posts by Joseph Schoonover
14 April 2025 - Installing ROCm from source with Spack
Posts by Joshua Lu
09 February 2026 - Digital Twins on AMD: Building Robotic Simulations Using Edge AI PCs
Posts by Jouni Hartikainen
08 December 2025 - Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs
Posts by Jouni Luoma
Posts by Joyce Zhang
Posts by Juho Kerttula
27 November 2025 - Fine-Tune LLMs for Proteins with AMD Enterprise AI Suite
Posts by Juho Vainio
19 December 2025 - Getting Started with AMD AI Workbench: Deploying and Managing AI Workloads
Posts by Julia Jiang
28 May 2025 - HIP 7.0 Is Coming: What You Need to Know to Stay Ahead
Posts by Jun Chen
12 November 2025 - Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X
Posts by Jun Kang Chow
Posts by Jun Zhao
Posts by Junyan Yang
05 November 2025 - Day 0 Developer Guide: hipBLASLt Offline GEMM Tuning Script
Posts by Justin Chang
09 February 2025 - MI300A - Exploring the APU advantage
13 November 2024 - Introducing AMD’s Next-Gen Fortran Compiler
29 August 2024 - Seismic stencil codes - part 3
29 August 2024 - Seismic stencil codes - part 2
29 August 2024 - Seismic stencil codes - part 1
15 September 2023 - Jacobi Solver with HIP and OpenMP offloading
18 July 2023 - Finite difference method - Laplacian part 4
11 May 2023 - Finite difference method - Laplacian part 3
04 January 2023 - Finite difference method - Laplacian part 2
14 November 2022 - Finite difference method - Laplacian part 1
Posts by Justin Chu
09 February 2026 - Building Robotics Applications with Ryzen AI and ROS 2
23 October 2025 - STX-B0T: Real-time AI Robot Assistant Powered by RyzenAI and ROCm
Posts by Kai Hakala
Posts by Kailash Gogineni
Posts by Kajsa Arnold
Posts by Kang Liu
Posts by Karan Verma
01 April 2026 - Reproducing the AMD MLPerf Inference v6.0 Submission Result
01 April 2026 - AMD Instinct™ GPUs MLPerf Inference v6.0 Submission
12 November 2025 - Technical Dive into AMD MLPerf Training v5.1 Submission
12 November 2025 - Reproducing AMD MLPerf Training v5.1 Submission Result
09 September 2025 - Slim Down Your Llama: Pruning & Fine-Tuning for Maximum Performance
09 September 2025 - Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission
Posts by Karthik Kashyap Thatipamula
03 October 2025 - Elevating 3D Scene Rendering with GSplat
Posts by Karthik Sangaiah
09 February 2025 - Deep dive into the MI300 compute and memory partition modes
Posts by Ke Wang
29 October 2025 - High-Accuracy MXFP4, MXFP6, and Mixed-Precision Models on AMD GPUs
26 August 2025 - QuickReduce: Up to 3x Faster All-reduce for vLLM and SGLang
Posts by Keith Anderson
Posts by Kelvin Lui
19 February 2026 - Introducing hipThreads: A C++-Style Concurrency Library for AMD GPUs
Posts by Ken O’Brien
Posts by Kenny Roche
21 January 2026 - ROCm Becomes a First-Class Platform in the vLLM Ecosystem
Posts by Kerwin Tsai
09 February 2026 - Digital Twins on AMD: Building Robotic Simulations Using Edge AI PCs
Posts by Kevin Chang
Posts by Kevin Joseph
13 November 2025 - Accelerating Vector Search: hipVS and hipRAFT on AMD
Posts by Kristoffer Peyron
27 November 2025 - Exploring Gameplay Video Generation with Hunyuan-GameCraft
21 November 2025 - Inference with HunyuanWorld-Voyager on AMD Instinct GPUs
24 September 2025 - Accelerating Audio-Driven Video Generation: WAN2.2-S2V on AMD ROCm
19 August 2025 - All-in-One Video Editing with VACE on AMD Instinct GPUs
Posts by KuanTing Lin
Posts by Kumar Deepak
09 September 2025 - Technical Dive into AMD’s MLPerf Inference v5.1 Submission
09 September 2025 - Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission
Posts by Kyle Wang
Posts by Kyle Zhao
16 December 2025 - MoE Training Best Practices on AMD GPUs
Posts by Lalith Narasimhan
13 November 2025 - Accelerating Vector Search: hipVS and hipRAFT on AMD
Posts by Lei Shao
09 August 2024 - Inferencing with Grok-1 on AMD GPUs
Posts by Lei Wei
04 November 2025 - Stability at Scale: AMD’s Full‑Stack Platform for Large‑Model Training
Posts by Lei Zhang
Posts by Levent Guner
28 November 2025 - VLM Fine-Tuning for Robotics on AMD Enterprise AI Suite
27 November 2025 - Fine-Tune LLMs for Proteins with AMD Enterprise AI Suite
Posts by Liam Berry
05 November 2025 - Continuing the Momentum: Refining ROCm For The Next Wave Of AI and HPC
16 September 2025 - ROCm 7.0: An AI-Ready Powerhouse for Performance, Efficiency, and Productivity
06 June 2025 - The ROCm Revisited Series
06 June 2025 - ROCm Revisited: Getting Started with HIP
22 May 2025 - ROCm Runfile Installer Is Here!
Posts by Lihuan Zhang
23 February 2026 - Primus-Pipeline: A More Flexible and Scalable Pipeline Parallelism Implementation
16 December 2025 - MoE Training Best Practices on AMD GPUs
Posts by Lin Sun
02 October 2025 - From Ingestion to Inference: RAG Pipelines on AMD GPUs
30 September 2025 - Coding Agents on AMD GPUs: Fast LLM Pipelines for Developers
Posts by Lin Zhao
17 February 2026 - Advanced MXFP4 Quantization: Combining Fine-Tuned Rotations with SmoothQuant for Near-Lossless Compression
29 October 2025 - High-Accuracy MXFP4, MXFP6, and Mixed-Precision Models on AMD GPUs
Posts by Lingpeng Jin
12 November 2025 - Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X
21 March 2025 - AITER: AI Tensor Engine For ROCm
Posts by Liying Li
16 December 2025 - MoE Training Best Practices on AMD GPUs
Posts by Liz Li
23 February 2026 - Primus-Pipeline: A More Flexible and Scalable Pipeline Parallelism Implementation
16 December 2025 - MoE Training Best Practices on AMD GPUs
04 November 2025 - Stability at Scale: AMD’s Full‑Stack Platform for Large‑Model Training
19 September 2025 - An Introduction to Primus-Turbo: A Library for Accelerating Transformer Models on AMD GPUs
21 March 2025 - Supercharge DeepSeek-R1 Inference on AMD Instinct MI300X
21 March 2025 - AITER: AI Tensor Engine For ROCm
Posts by Logan Grado
03 July 2024 - Accelerating models on ROCm using PyTorch TunableOp
09 April 2024 - ResNet for image classification using AMD GPUs
01 April 2024 - Scale AI applications with Ray
29 March 2024 - Automatic mixed precision in PyTorch using AMD GPUs
Posts by Lorri Rao
Posts by Lovisa Borthas
11 February 2026 - Solution Blueprints: Accelerating AI Deployment with AMD Enterprise AI
Posts by Luise Chen
09 August 2024 - Inferencing with Grok-1 on AMD GPUs
Posts by Luka Stanisic
23 October 2025 - Performance Profiling on AMD GPUs - Part 3: Advanced Usage
13 August 2025 - Performance Profiling on AMD GPUs – Part 2: Basic Usage
26 June 2025 - Performance Profiling on AMD GPUs – Part 1: Foundations
Posts by Luka Tsabadze
06 March 2026 - Fine-Tuning AI Surrogate Models for Physics Simulations with Walrus on AMD Instinct GPU Accelerators
10 November 2025 - Training AI Weather Forecasting Models on AMD Instinct
18 September 2025 - Running SOTA AI-based Weather Forecasting models on AMD Instinct
Posts by Mahdi Ghodsi
Posts by Mahdieh Ghazimirsaeed
08 June 2023 - GPU-aware MPI with ROCm
Posts by Manoj Rao
Posts by Marc Dillon
Posts by Marco Grond
03 October 2025 - Elevating 3D Scene Rendering with GSplat
18 July 2025 - Announcing hipCIM: A Cutting-Edge Solution for Accelerated Multidimensional Image Processing
11 April 2025 - ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software
Posts by Maria Ruiz Varela
26 April 2024 - Application portability with HIP
09 March 2023 - AMD Instinct™ MI200 GPU memory space overview
Posts by Marilyn Basanta
16 September 2025 - ROCm 7.0: An AI-Ready Powerhouse for Performance, Efficiency, and Productivity
Posts by Mario Reiser
Posts by Mark Granroth Wilding
03 October 2025 - Elevating 3D Scene Rendering with GSplat
Posts by Mark van Heeswijk
Posts by Marko Savic
19 February 2026 - Introducing hipThreads: A C++-Style Concurrency Library for AMD GPUs
Posts by Markus Hartikainen
08 December 2025 - Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs
Posts by Martin Huarte
Posts by Mathias Lehtinen
17 November 2025 - AMD Enterprise AI Suite: Open Infrastructure for Production AI
Posts by Matt Elliott
21 February 2025 - How to Build a vLLM Container for Inference and Benchmarking
29 January 2025 - Announcing the AMD GPU Operator and Metrics Exporter
17 September 2024 - Getting to Know Your GPU: A Deep Dive into AMD SMI
Posts by Matthias Reso
Posts by Matti Varjokallio
08 December 2025 - Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs
Posts by Meena Arunachalam
01 April 2026 - Reproducing the AMD MLPerf Inference v6.0 Submission Result
01 April 2026 - AMD Instinct™ GPUs MLPerf Inference v6.0 Submission
12 November 2025 - Technical Dive into AMD MLPerf Training v5.1 Submission
12 November 2025 - Reproducing AMD MLPerf Training v5.1 Submission Result
09 September 2025 - Technical Dive into AMD’s MLPerf Inference v5.1 Submission
09 September 2025 - Slim Down Your Llama: Pruning & Fine-Tuning for Maximum Performance
09 September 2025 - Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission
Posts by Mehdi Rezagholizadeh
17 September 2025 - AMD-HybridLM: Towards Extremely Efficient Hybrid Language Models
Posts by Mehdi Saeedi
Posts by Menghsuan Yang
17 February 2026 - Adaptive Top-K Selection: Eliminating Performance Cliffs Across All K Values on AMD GPUs
30 January 2026 - Debugging NaN Results in CK Tile GEMM: A rocgdb Detective Story
Posts by Mengmeng Ge
Posts by Michael Klemm
13 November 2024 - Introducing AMD’s Next-Gen Fortran Compiler
Posts by Michael Zhang
13 November 2024 - SGLang: Fast Serving Framework for Large Language and Vision-Language Models on AMD Instinct GPUs
24 October 2024 - CTranslate2: Efficient Inference with Transformer Models on AMD GPUs
Posts by Miikael Leskinen
Posts by Mika Koistinen
Posts by Mika Ranta
Posts by Mikko Lauri
01 April 2026 - Reproducing the AMD MLPerf Inference v6.0 Submission Result
01 April 2026 - AMD Instinct™ GPUs MLPerf Inference v6.0 Submission
Posts by Mikko Tukiainen
08 December 2025 - Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs
Posts by Mikko Vilenius
Posts by Mingjie Lu
08 December 2025 - Accelerating Autonomous Driving Model Training on AMD ROCm™ Software
Posts by Mingyu Yang
17 September 2025 - AMD-HybridLM: Towards Extremely Efficient Hybrid Language Models
Posts by Mingzhi Liu
12 November 2025 - Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X
Posts by Miro Hodak
01 April 2026 - Reproducing the AMD MLPerf Inference v6.0 Submission Result
01 April 2026 - AMD Instinct™ GPUs MLPerf Inference v6.0 Submission
12 November 2025 - Technical Dive into AMD MLPerf Training v5.1 Submission
12 November 2025 - Reproducing AMD MLPerf Training v5.1 Submission Result
09 September 2025 - Technical Dive into AMD’s MLPerf Inference v5.1 Submission
09 September 2025 - Slim Down Your Llama: Pruning & Fine-Tuning for Maximum Performance
09 September 2025 - Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission
Posts by Mittul Singh
Posts by Mohammad Abdul Basit
18 December 2025 - A Step-by-Step Walkthrough of Decentralized LLM Training on AMD GPUs
Posts by Mohammad Mahdi Kamani
Posts by Mohammed Faraaz Mustafa
16 September 2025 - ROCm 7.0: An AI-Ready Powerhouse for Performance, Efficiency, and Productivity
10 June 2025 - AMD ROCm: Powering the World’s Fastest Supercomputers
06 June 2025 - The ROCm Revisited Series
06 June 2025 - ROCm Revisited: Getting Started with HIP
Posts by Mohit Deopujari
13 February 2026 - Elevate Your LLM Inference: Autoscaling with Ray, ROCm 7.0.0, and SkyPilot
22 January 2026 - LLM Inference Optimization Using AMD GPU Partitioning
Posts by Moskvichev Arseny
28 June 2024 - Mamba on AMD GPUs with ROCm
Posts by Mou Li
16 December 2025 - MoE Training Best Practices on AMD GPUs
Posts by Muhammad Osama
09 February 2025 - Deep dive into the MI300 compute and memory partition modes
29 July 2024 - Graph analytics on AMD GPUs using Gunrock
Posts by Mukhil Azhagan Mallaiyan Sathiaseelan
05 December 2025 - DGL in Depth: SE(3)-Transformer on ROCm 7
01 October 2025 - Enabling FlashInfer on ROCm for Accelerated LLM Serving
20 August 2025 - DGL in the Real World: Running GNNs on Real Use Cases
Posts by Mustafa Khalid Masood
08 December 2025 - Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs
Posts by Neha Mathews
01 April 2026 - AMD Instinct™ GPUs MLPerf Inference v6.0 Submission
09 September 2025 - Technical Dive into AMD’s MLPerf Inference v5.1 Submission
09 September 2025 - Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission
Posts by Nhat Vo
Posts by Nicholas Curtis
17 May 2023 - Register pressure in AMD CDNA™2 GPUs
Posts by Nicholas Malaya
14 November 2022 - AMD matrix cores
Posts by Nick Romero
24 February 2026 - PyTorch Offline Tuning with TunableOp
Posts by Nico Holmberg
01 April 2026 - Reproducing the AMD MLPerf Inference v6.0 Submission Result
01 April 2026 - AMD Instinct™ GPUs MLPerf Inference v6.0 Submission
08 December 2025 - Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs
Posts by Nicola Tan
17 November 2025 - AMD Enterprise AI Suite: Open Infrastructure for Production AI
Posts by Niels Zhang
09 September 2025 - Technical Dive into AMD’s MLPerf Inference v5.1 Submission
Posts by Niko Ma
12 November 2025 - Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X
Posts by Niko Vuokko
28 November 2025 - VLM Fine-Tuning for Robotics on AMD Enterprise AI Suite
Posts by Niles Burbank
Posts by Ning Zhang
04 September 2025 - Step-3 Deployment Simplified: A Day 0 Developer’s Guide on AMD Instinct™ GPUs
06 February 2025 - GEMM Kernel Optimization For AMD GPUs
Posts by Nitish Bhat
Posts by No author
Posts by Noah Monti
Posts by Noah Wolfe
12 April 2023 - Introduction to profiling tools for AMD hardware
Posts by Nowy Condro
Posts by Olga Miroshnichenko
08 December 2025 - Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs
Posts by Olha Shkaravska
08 December 2025 - Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs
Posts by Ossian O’Reilly
29 August 2024 - Seismic stencil codes - part 3
29 August 2024 - Seismic stencil codes - part 2
29 August 2024 - Seismic stencil codes - part 1
11 May 2023 - Finite difference method - Laplacian part 3
04 January 2023 - Finite difference method - Laplacian part 2
14 November 2022 - Finite difference method - Laplacian part 1
14 November 2022 - AMD matrix cores
Posts by Parsa Fashi
Posts by Paul Bauer
13 March 2026 - GROMACS Performance on AMD Instinct MI355X
Posts by Paul Hartke
13 November 2025 - Democratizing AI Compute with AMD Using SkyPilot
Posts by Paul Mullowney
03 November 2023 - Sparse matrix vector multiplication - part 1
Posts by Pauli Pihajoki
07 January 2026 - High-Resolution Weather Forecasting with StormCast on AMD Instinct GPU Accelerators
10 November 2025 - Training AI Weather Forecasting Models on AMD Instinct
18 September 2025 - Running SOTA AI-based Weather Forecasting models on AMD Instinct
Posts by Pedram Alizadeh
Posts by Pei Zhang
11 December 2025 - Accelerating llama.cpp on AMD Instinct MI300X
09 September 2025 - Llama.cpp Meets Instinct: A New Era of Open-Source AI Acceleration
Posts by Peng Sun
20 February 2026 - FlyDSL: Expert GPU Kernel Development with the Ease of MLIR Python Native DSL on AMD GPUs
21 January 2026 - ROCm Becomes a First-Class Platform in the vLLM Ecosystem
02 January 2026 - Accelerating Multimodal Inference in vLLM: The One-Line Optimization for Large Multimodal Models
12 November 2025 - Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X
21 October 2025 - Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring
30 September 2025 - Matrix Core Programming on AMD CDNA™3 and CDNA™4 architecture
06 May 2025 - Unleash Full GPU Potential: Overlap Communication and Computation with Triton-Distributed
21 March 2025 - Supercharge DeepSeek-R1 Inference on AMD Instinct MI300X
Posts by Phani Vaddadi
27 February 2026 - Exploring Use Cases for Scalable AI: Implementing Ray with ROCm 7 Support for Efficient ML Workflows
13 February 2026 - Elevate Your LLM Inference: Autoscaling with Ray, ROCm 7.0.0, and SkyPilot
11 December 2025 - Accelerating llama.cpp on AMD Instinct MI300X
05 December 2025 - DGL in Depth: SE(3)-Transformer on ROCm 7
04 December 2025 - Modernizing Taichi Lang to LLVM 20 for MI355X GPU Acceleration
13 November 2025 - Accelerating Vector Search: hipVS and hipRAFT on AMD
07 October 2025 - Announcing MONAI 1.0.0 for AMD ROCm: Breakthrough AI Acceleration for Medical Imaging Models on AMD Instinct™ GPUs
03 October 2025 - Elevating 3D Scene Rendering with GSplat
02 October 2025 - From Ingestion to Inference: RAG Pipelines on AMD GPUs
01 October 2025 - Enabling FlashInfer on ROCm for Accelerated LLM Serving
30 September 2025 - Coding Agents on AMD GPUs: Fast LLM Pipelines for Developers
10 September 2025 - Exploring Use Cases for Scalable AI: Implementing Ray with ROCm Support for Efficient ML Workflows
09 September 2025 - Llama.cpp Meets Instinct: A New Era of Open-Source AI Acceleration
20 August 2025 - DGL in the Real World: Running GNNs on Real Use Cases
Posts by Phillip Dang
11 July 2024 - DBRX Instruct on AMD GPUs
28 June 2024 - Deep Learning Recommendation Models on AMD GPUs
26 April 2024 - Table Question-Answering with TaPas
26 April 2024 - Multimodal (Visual and Language) understanding with LLaVA-NeXT
16 April 2024 - Text Summarization with FLAN-T5
16 April 2024 - Program Synthesis with CodeGen
08 April 2024 - Small language models with Phi-2
04 April 2024 - Using the ChatGLM-6B bilingual language model with AMD GPUs
12 March 2024 - Building a decoder transformer model on AMD GPU(s)
11 March 2024 - Question-answering Chatbot with LangChain on an AMD GPU
08 March 2024 - Music Generation With MusicGen on an AMD GPU
08 February 2024 - Simplifying deep learning: A guide to PyTorch Lightning
Posts by Pier Luigi Dovesi
03 October 2025 - Elevating 3D Scene Rendering with GSplat
Posts by Pin Siang Tan
21 January 2026 - ROCm Becomes a First-Class Platform in the vLLM Ecosystem
Posts by Poovaiah Palangappa
01 April 2026 - Reproducing the AMD MLPerf Inference v6.0 Submission Result
01 April 2026 - AMD Instinct™ GPUs MLPerf Inference v6.0 Submission
09 September 2025 - Technical Dive into AMD’s MLPerf Inference v5.1 Submission
09 September 2025 - Slim Down Your Llama: Pruning & Fine-Tuning for Maximum Performance
09 September 2025 - Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission
Posts by Prakamya Mishra
06 December 2025 - Building a State-of-the-Art 32 Billion Reasoning Model with Only Synthetic Data on AMD GPUs
21 November 2025 - LuminaSFT: Generating Synthetic Fine-Tuning Data for Small Language Models
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Pratik Mishra
13 February 2026 - Elevate Your LLM Inference: Autoscaling with Ray, ROCm 7.0.0, and SkyPilot
13 November 2025 - Democratizing AI Compute with AMD Using SkyPilot
Posts by Pratik Prabhanjan Brahma
23 December 2025 - GEAK HIP: Expanding GEAK for HIP Code Optimization
06 December 2025 - Building a State-of-the-Art 32 Billion Reasoning Model with Only Synthetic Data on AMD GPUs
01 August 2025 - GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Pruthvi Madugundu
Posts by Qiang Han
Posts by Quentin Anthony
10 December 2024 - Training Transformers and Hybrid models on AMD Instinct MI300X Accelerators
Posts by Rahul Biswas
10 November 2025 - Training AI Weather Forecasting Models on AMD Instinct
18 September 2025 - Running SOTA AI-based Weather Forecasting models on AMD Instinct
Posts by Rajat Arora
15 September 2023 - Jacobi Solver with HIP and OpenMP offloading
11 May 2023 - Finite difference method - Laplacian part 3
09 March 2023 - AMD Instinct™ MI200 GPU memory space overview
04 January 2023 - Finite difference method - Laplacian part 2
14 November 2022 - Finite difference method - Laplacian part 1
Posts by Rajesh Poornachandran
01 April 2026 - Reproducing the AMD MLPerf Inference v6.0 Submission Result
01 April 2026 - AMD Instinct™ GPUs MLPerf Inference v6.0 Submission
09 September 2025 - Technical Dive into AMD’s MLPerf Inference v5.1 Submission
Posts by Rajneesh Bhardwaj
09 February 2025 - Deep dive into the MI300 compute and memory partition modes
Posts by Rasmus Larsson
02 April 2026 - Deploy and Customize AMD Solution Blueprints
24 February 2026 - Getting Started with AMD Resource Manager: Efficient Sharing of AMD Instinct™ GPUs for R&D Teams and AI Practitioners
19 December 2025 - Getting Started with AMD AI Workbench: Deploying and Managing AI Workloads
04 November 2025 - Retrieval Augmented Generation (RAG) with vLLM, LangChain and Chroma
Posts by Rathnakara Malatesha
25 February 2025 - Deploying Serverless AI Inference on AMD GPU Clusters
Posts by Ravi Dwivedula
12 November 2025 - Technical Dive into AMD MLPerf Training v5.1 Submission
12 November 2025 - Reproducing AMD MLPerf Training v5.1 Submission Result
Posts by Rebecca Lee
01 April 2026 - AMD Instinct™ GPUs MLPerf Inference v6.0 Submission
Posts by Rebecca Li
01 April 2026 - Reproducing the AMD MLPerf Inference v6.0 Submission Result
Posts by Reima Karhila
08 December 2025 - Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs
Posts by Rene Van Oostrum
14 November 2022 - AMD matrix cores
Posts by Rishi Madduri
01 October 2025 - Enabling FlashInfer on ROCm for Accelerated LLM Serving
Posts by Robert Talling
Posts by Romil Bhardwaj
13 November 2025 - Democratizing AI Compute with AMD Using SkyPilot
Posts by Ronnie Chatterjee
11 April 2025 - ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software
Posts by Rony Leppanen
18 March 2026 - Multi-Node Distributed Inference for Diffusion Models with xDiT
Posts by Rui Sampaio
17 November 2025 - AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs
07 October 2025 - Medical Imaging on MI300X: Optimized SwinUNETR for Tumor Detection
Posts by Ruibin Zhang
16 December 2025 - MoE Training Best Practices on AMD GPUs
Posts by Ruturaj Kiran Vaidya
24 February 2026 - JAX-AITER: Bringing AMD’s Optimized AI Kernels to JAX on ROCm™
Posts by Ryan Swann
09 February 2025 - Deep dive into the MI300 compute and memory partition modes
Posts by Saad Rahim
05 November 2025 - Continuing the Momentum: Refining ROCm For The Next Wave Of AI and HPC
20 October 2025 - ROCm 7.9 Technology Preview: ROCm Core SDK and TheRock Build System
16 September 2025 - ROCm 7.0: An AI-Ready Powerhouse for Performance, Efficiency, and Productivity
10 June 2025 - AMD ROCm: Powering the World’s Fastest Supercomputers
06 June 2025 - The ROCm Revisited Series
06 June 2025 - ROCm Revisited: Getting Started with HIP
28 May 2025 - HIP 7.0 Is Coming: What You Need to Know to Stay Ahead
22 May 2025 - ROCm Runfile Installer Is Here!
11 April 2025 - ROCm Gets Modular: Meet the Instinct Datacenter GPU Driver
11 April 2025 - ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software
Posts by Sander Bijl de Vroe
Posts by Saptarshi Majumder
23 December 2025 - GEAK HIP: Expanding GEAK for HIP Code Optimization
01 August 2025 - GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks
Posts by Saroosh Shabbir
11 February 2026 - Solution Blueprints: Accelerating AI Deployment with AMD Enterprise AI
04 November 2025 - Retrieval Augmented Generation (RAG) with vLLM, LangChain and Chroma
Posts by Sarthak Arora
12 November 2025 - Technical Dive into AMD MLPerf Training v5.1 Submission
12 November 2025 - Reproducing AMD MLPerf Training v5.1 Submission Result
Posts by Sarthak Tandon
24 February 2026 - PyTorch Offline Tuning with TunableOp
Posts by Sarunas Kalade
09 February 2026 - Building Robotics Applications with Ryzen AI and ROS 2
Posts by Sathish Sanjeevi
12 November 2025 - Technical Dive into AMD MLPerf Training v5.1 Submission
12 November 2025 - Reproducing AMD MLPerf Training v5.1 Submission Result
Posts by Satya Jandhyala
27 February 2026 - Exploring Use Cases for Scalable AI: Implementing Ray with ROCm 7 Support for Efficient ML Workflows
13 February 2026 - Elevate Your LLM Inference: Autoscaling with Ray, ROCm 7.0.0, and SkyPilot
Posts by Scott Todd
20 October 2025 - ROCm 7.9 Technology Preview: ROCm Core SDK and TheRock Build System
Posts by Sean Miller
18 July 2023 - Finite difference method - Laplacian part 4
11 May 2023 - Finite difference method - Laplacian part 3
09 March 2023 - AMD Instinct™ MI200 GPU memory space overview
04 January 2023 - Finite difference method - Laplacian part 2
14 November 2022 - Finite difference method - Laplacian part 1
Posts by Sean Song
22 January 2026 - LLM Inference Optimization Using AMD GPU Partitioning
09 February 2025 - PyTorch Fully Sharded Data Parallel (FSDP) on AMD GPUs with ROCm
24 January 2025 - Vision Mamba on AMD GPU with ROCm
01 November 2024 - Distributed Data Parallel Training on AMD GPU with ROCm
23 October 2024 - Inference with Llama 3.2 Vision LLMs on AMD GPUs Using ROCm
28 June 2024 - Mamba on AMD GPUs with ROCm
04 June 2024 - Segment Anything with AMD GPUs
15 April 2024 - Enhancing LLM Accessibility: A Deep Dive into QLoRA Through Fine-tuning Llama Model on a single AMD GPU
15 April 2024 - Enhancing LLM Accessibility: A Deep Dive into QLoRA Through Fine-tuning Llama 2 on a single AMD GPU
05 February 2024 - Using LoRA for efficient fine-tuning: Fundamental principles
Posts by Sebastian Andersson
17 November 2025 - AMD Enterprise AI Suite: Open Infrastructure for Production AI
Posts by Sebastian Remander
24 March 2026 - GROMACS on AMD Instinct GPUs: A Complete Build Guide
13 March 2026 - GROMACS Performance on AMD Instinct MI355X
Posts by Seungrok Jung
21 March 2025 - Supercharge DeepSeek-R1 Inference on AMD Instinct MI300X
15 March 2024 - Large language model inference optimizations on AMD GPUs
Posts by Shaghayegh Roohi
28 November 2025 - VLM Fine-Tuning for Robotics on AMD Enterprise AI Suite
03 October 2025 - Elevating 3D Scene Rendering with GSplat
Posts by Shashank Kashyap
Posts by Shekhar Pandey
05 August 2025 - Day 0 Developer Guide: Running the Latest Open Models from OpenAI on AMD AI Hardware
21 March 2025 - AITER: AI Tensor Engine For ROCm
Posts by Shenrun Zhang
Posts by Shijie Feng
Posts by Shizhe Ding
22 January 2026 - Nitro-AR: A Compact AR Transformer for High-Quality Image Generation
Posts by Shizhu He
Posts by Shrey Ajmera
Posts by Shubin Zhao
Posts by Simon Mo
21 January 2026 - ROCm Becomes a First-Class Platform in the vLLM Ecosystem
Posts by Sonali Singh
09 February 2025 - Deep dive into the MI300 compute and memory partition modes
Posts by Sonya Yang
09 February 2026 - Digital Twins on AMD: Building Robotic Simulations Using Edge AI PCs
Posts by Sopiko Kurdadze
03 February 2026 - Foundations of Molecular Generation with GP-MoLFormer on AMD Instinct MI300X Accelerators
21 November 2025 - Accelerating AI-Driven Crystalline Materials Design with MatterGen on AMD Instinct MI300X
10 November 2025 - Training AI Weather Forecasting Models on AMD Instinct
24 September 2025 - A Simple Design for Serving Video Generation Models with Distributed Inference
19 August 2025 - Accelerating FastVideo on AMD GPUs with TeaCache
Posts by Soumitra Chatterjee
Posts by Spandan Tiwari
25 March 2026 - Programming Tensor Descriptors in Composable Kernel (CK)
24 March 2026 - Engineering Qwen-VL for Production: Vision Module Architecture and Optimization Practices
19 March 2026 - hipBLASLt Online GEMM Tuning
17 February 2026 - Advanced MXFP4 Quantization: Combining Fine-Tuned Rotations with SmoothQuant for Near-Lossless Compression
05 November 2025 - Day 0 Developer Guide: hipBLASLt Offline GEMM Tuning Script
29 October 2025 - High-Accuracy MXFP4, MXFP6, and Mixed-Precision Models on AMD GPUs
09 September 2025 - Technical Dive into AMD’s MLPerf Inference v5.1 Submission
26 August 2025 - QuickReduce: Up to 3x Faster All-reduce for vLLM and SGLang
Posts by Srinivasan Subramanian
Posts by Sriranjani Ramasubramanian
Posts by Stanislau Fink
Posts by Steve Reinhardt
02 March 2026 - Streamlining Recommendation Model Training on AMD Instinct™ GPUs
Posts by Stig-Arne Gronroos
08 December 2025 - Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs
Posts by Su Ann Chong
12 November 2025 - Technical Dive into AMD MLPerf Training v5.1 Submission
12 November 2025 - Reproducing AMD MLPerf Training v5.1 Submission Result
Posts by Subhajit Dutta Chowdhury
Posts by Sudhanshu Ranjan
21 November 2025 - LuminaSFT: Generating Synthetic Fine-Tuning Data for Small Language Models
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Sujin Philip
13 November 2025 - Accelerating Vector Search: hipVS and hipRAFT on AMD
Posts by Sukriti Choudhary
13 November 2025 - Accelerating Vector Search: hipVS and hipRAFT on AMD
Posts by Sundara Murthy Gurunathan
Posts by Suyash Tandon
09 February 2025 - MI300A - Exploring the APU advantage
26 April 2024 - Application portability with HIP
12 April 2023 - Introduction to profiling tools for AMD hardware
Posts by Takashi Isobe
Posts by Ted Themistokleous
08 January 2025 - Triton Inference Server with vLLM on AMD GPUs
Posts by Teemu Karkkainen
28 November 2025 - VLM Fine-Tuning for Robotics on AMD Enterprise AI Suite
Posts by Teemu Virolainen
08 December 2025 - Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs
Posts by Tero Kemppi
18 March 2026 - Multi-Node Distributed Inference for Diffusion Models with xDiT
Posts by Tharun Adithya Srikrishnan
02 March 2026 - Streamlining Recommendation Model Training on AMD Instinct™ GPUs
Posts by Theresa Shan
Posts by Thomas Bergstrom
Posts by Thomas Gibson
23 October 2025 - Performance Profiling on AMD GPUs - Part 3: Advanced Usage
13 August 2025 - Performance Profiling on AMD GPUs – Part 2: Basic Usage
26 June 2025 - Performance Profiling on AMD GPUs – Part 1: Foundations
29 July 2024 - Graph analytics on AMD GPUs using Gunrock
18 July 2023 - Finite difference method - Laplacian part 4
11 May 2023 - Finite difference method - Laplacian part 3
12 April 2023 - Introduction to profiling tools for AMD hardware
04 January 2023 - Finite difference method - Laplacian part 2
14 November 2022 - Finite difference method - Laplacian part 1
Posts by Tiffany Mintz
04 December 2025 - Modernizing Taichi Lang to LLVM 20 for MI355X GPU Acceleration
08 January 2025 - Triton Inference Server with vLLM on AMD GPUs
Posts by Tong Shen
22 January 2026 - Nitro-AR: A Compact AR Transformer for High-Quality Image Generation
Posts by Treemann Zheng
08 December 2025 - Accelerating Autonomous Driving Model Training on AMD ROCm™ Software
Posts by Tres Popp
08 December 2025 - Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs
05 December 2025 - DGL in Depth: SE(3)-Transformer on ROCm 7
Posts by Tun Jian Tan
21 January 2026 - ROCm Becomes a First-Class Platform in the vLLM Ecosystem
Posts by Tuukka Sarvi
08 December 2025 - Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs
Posts by Uma Kannikanti
01 April 2026 - Reproducing the AMD MLPerf Inference v6.0 Submission Result
01 April 2026 - AMD Instinct™ GPUs MLPerf Inference v6.0 Submission
09 September 2025 - Technical Dive into AMD’s MLPerf Inference v5.1 Submission
09 September 2025 - Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission
Posts by Umang Pandey
Posts by Vara Lakshmi Bayanagari
28 January 2025 - Distributed fine-tuning of MPT-30B using Composer on AMD GPUs
03 December 2024 - Transformer based Encoder-Decoder models for image-captioning on AMD GPUs
13 November 2024 - Quantized 8-bit LLM training and inference using bitsandbytes on AMD GPUs
15 October 2024 - Speed Up Text Generation with Speculative Sampling on AMD GPUs
03 September 2024 - Image Classification with BEiT, MobileNet, and EfficientNet using ROCm on AMD GPUs
16 April 2024 - PyTorch C++ Extension on AMD GPU
04 April 2024 - Total body segmentation using MONAI Deploy on an AMD GPU
07 February 2024 - Two-dimensional images to three-dimensional scene mapping using NeRF on an AMD GPU
29 January 2024 - Pre-training BERT using Hugging Face & TensorFlow on an AMD GPU
26 January 2024 - Pre-training BERT using Hugging Face & PyTorch on an AMD GPU
Posts by Vasumathi Neralla
07 October 2025 - Medical Imaging on MI300X: Optimized SwinUNETR for Tumor Detection
Posts by Vicky Tsang
27 February 2026 - Exploring Use Cases for Scalable AI: Implementing Ray with ROCm 7 Support for Efficient ML Workflows
13 February 2026 - Elevate Your LLM Inference: Autoscaling with Ray, ROCm 7.0.0, and SkyPilot
10 September 2025 - Exploring Use Cases for Scalable AI: Implementing Ray with ROCm Support for Efficient ML Workflows
24 April 2025 - Reinforcement Learning from Human Feedback on AMD GPUs with verl and ROCm Integration
01 April 2024 - Scale AI applications with Ray
Posts by Victor Robles
14 February 2025 - AI Inference Orchestration with Kubernetes on Instinct MI300X, Part 2
07 February 2025 - AI Inference Orchestration with Kubernetes on Instinct MI300X, Part 1
Posts by Vidushi Goyal
Posts by Vikas C Sajjan
07 October 2025 - Announcing MONAI 1.0.0 for AMD ROCm: Breakthrough AI Acceleration for Medical Imaging Models on AMD Instinct™ GPUs
03 October 2025 - Elevating 3D Scene Rendering with GSplat
Posts by Vikram Appia
17 September 2025 - AMD-HybridLM: Towards Extremely Efficient Hybrid Language Models
Posts by Vin Huang
17 February 2026 - Unlocking Sparse Acceleration on AMD GPUs with hipSPARSELt
Posts by Vinay Joshi
01 August 2025 - GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks
Posts by Vinayak Gokhale
09 September 2025 - Technical Dive into AMD’s MLPerf Inference v5.1 Submission
Posts by Vish Vadlamani
27 February 2026 - Exploring Use Cases for Scalable AI: Implementing Ray with ROCm 7 Support for Efficient ML Workflows
13 February 2026 - Elevate Your LLM Inference: Autoscaling with Ray, ROCm 7.0.0, and SkyPilot
11 December 2025 - Accelerating llama.cpp on AMD Instinct MI300X
05 December 2025 - DGL in Depth: SE(3)-Transformer on ROCm 7
04 December 2025 - Modernizing Taichi Lang to LLVM 20 for MI355X GPU Acceleration
13 November 2025 - Accelerating Vector Search: hipVS and hipRAFT on AMD
07 October 2025 - Announcing MONAI 1.0.0 for AMD ROCm: Breakthrough AI Acceleration for Medical Imaging Models on AMD Instinct™ GPUs
03 October 2025 - Elevating 3D Scene Rendering with GSplat
02 October 2025 - From Ingestion to Inference: RAG Pipelines on AMD GPUs
01 October 2025 - Enabling FlashInfer on ROCm for Accelerated LLM Serving
30 September 2025 - Coding Agents on AMD GPUs: Fast LLM Pipelines for Developers
10 September 2025 - Exploring Use Cases for Scalable AI: Implementing Ray with ROCm Support for Efficient ML Workflows
09 September 2025 - Llama.cpp Meets Instinct: A New Era of Open-Source AI Acceleration
20 August 2025 - DGL in the Real World: Running GNNs on Real Use Cases
24 April 2025 - Reinforcement Learning from Human Feedback on AMD GPUs with verl and ROCm Integration
08 January 2025 - Triton Inference Server with vLLM on AMD GPUs
Posts by Vivian Cheng
09 February 2026 - Building Robotics Applications with Ryzen AI and ROS 2
23 October 2025 - STX-B0T: Real-time AI Robot Assistant Powered by RyzenAI and ROCm
Posts by Warren Eng
07 August 2025 - Running ComfyUI in Windows with ROCm on WSL
Posts by Wei Cai
Posts by Wei Luo
25 March 2026 - Programming Tensor Descriptors in Composable Kernel (CK)
24 March 2026 - Engineering Qwen-VL for Production: Vision Module Architecture and Optimization Practices
19 March 2026 - hipBLASLt Online GEMM Tuning
17 February 2026 - Advanced MXFP4 Quantization: Combining Fine-Tuned Rotations with SmoothQuant for Near-Lossless Compression
05 November 2025 - Day 0 Developer Guide: hipBLASLt Offline GEMM Tuning Script
29 October 2025 - High-Accuracy MXFP4, MXFP6, and Mixed-Precision Models on AMD GPUs
09 September 2025 - Technical Dive into AMD’s MLPerf Inference v5.1 Submission
26 August 2025 - QuickReduce: Up to 3x Faster All-reduce for vLLM and SGLang
Posts by Wei-Ting Liao
01 April 2026 - AMD Instinct™ GPUs MLPerf Inference v6.0 Submission
09 September 2025 - Technical Dive into AMD’s MLPerf Inference v5.1 Submission
09 September 2025 - Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission
Posts by Wen Xie
23 February 2026 - Primus-Pipeline: A More Flexible and Scalable Pipeline Parallelism Implementation
16 December 2025 - MoE Training Best Practices on AMD GPUs
Posts by Wensong Chan
Posts by Wickey Wang
Posts by William Anzen
Posts by Xavier Aguilar Fruto
08 December 2025 - Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs
Posts by Xi Zhao
Posts by Xiaobo Chen
16 December 2025 - MoE Training Best Practices on AMD GPUs
Posts by Xiaodong Yu
21 November 2025 - LuminaSFT: Generating Synthetic Fine-Tuning Data for Small Language Models
25 September 2025 - Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs
15 July 2025 - Instella-T2I: Open-Source Text-to-Image with 1D Tokenizer and 32× Token Reduction on AMD GPUs
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Xiaofeng Zheng
Posts by Xiaoming Peng
23 February 2026 - Primus-Pipeline: A More Flexible and Scalable Pipeline Parallelism Implementation
16 December 2025 - MoE Training Best Practices on AMD GPUs
Posts by Ximeng Sun
21 November 2025 - LuminaSFT: Generating Synthetic Fine-Tuning Data for Small Language Models
25 September 2025 - Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs
15 July 2025 - Instella-T2I: Open-Source Text-to-Image with 1D Tokenizer and 32× Token Reduction on AMD GPUs
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Xinjun Niu
25 March 2026 - Programming Tensor Descriptors in Composable Kernel (CK)
24 March 2026 - Engineering Qwen-VL for Production: Vision Module Architecture and Optimization Practices
19 March 2026 - hipBLASLt Online GEMM Tuning
17 February 2026 - Advanced MXFP4 Quantization: Combining Fine-Tuned Rotations with SmoothQuant for Near-Lossless Compression
05 November 2025 - Day 0 Developer Guide: hipBLASLt Offline GEMM Tuning Script
29 October 2025 - High-Accuracy MXFP4, MXFP6, and Mixed-Precision Models on AMD GPUs
26 August 2025 - QuickReduce: Up to 3x Faster All-reduce for vLLM and SGLang
Posts by Xuanwu Yin
12 January 2026 - Athena-PRM: Enhancing Multimodal Reasoning with Data-Efficient Process Reward Models
07 January 2026 - Breaking the Accuracy-Speed Barrier: How MXFP4/6 Quantization Revolutionizes Image and Video Generation
02 January 2026 - SparK: Query-Aware Unstructured Sparsity with Recoverable KV Cache Channel Pruning
03 December 2025 - Týr-the-Pruner: Search-based Global Structural Pruning for LLMs
14 October 2025 - Gumiho: A New Paradigm for Speculative Decoding — Earlier Tokens in a Draft Sequence Matter More
09 September 2025 - Technical Dive into AMD’s MLPerf Inference v5.1 Submission
09 September 2025 - Slim Down Your Llama: Pruning & Fine-Tuning for Maximum Performance
09 September 2025 - Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission
Posts by Xun Wang
Posts by Yamini Kamisetty
09 September 2025 - Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission
Posts by Yamini Preethi Kamisetty
01 April 2026 - Reproducing the AMD MLPerf Inference v6.0 Submission Result
01 April 2026 - AMD Instinct™ GPUs MLPerf Inference v6.0 Submission
Posts by Yan Sun
Posts by YangWen Huang
09 October 2025 - GEMM Tuning within hipBLASLt – Part 2
05 September 2025 - GEMM Tuning within hipBLASLt - Part 1
Posts by Yanyuan Qin
16 December 2025 - MoE Training Best Practices on AMD GPUs
Posts by Yao Fehlis
11 September 2023 - Creating a PyTorch/TensorFlow code environment on AMD GPUs
Posts by Yao Fu
02 March 2026 - Streamlining Recommendation Model Training on AMD Instinct™ GPUs
23 February 2026 - Primus-Pipeline: A More Flexible and Scalable Pipeline Parallelism Implementation
16 December 2025 - MoE Training Best Practices on AMD GPUs
04 November 2025 - Stability at Scale: AMD’s Full‑Stack Platform for Large‑Model Training
19 September 2025 - An Introduction to Primus-Turbo: A Library for Accelerating Transformer Models on AMD GPUs
05 August 2025 - Day 0 Developer Guide: Running the Latest Open Models from OpenAI on AMD AI Hardware
13 March 2025 - Optimized ROCm Docker for Distributed AI Training
Posts by Yao Liu
27 February 2026 - Exploring Use Cases for Scalable AI: Implementing Ray with ROCm 7 Support for Efficient ML Workflows
13 February 2026 - Elevate Your LLM Inference: Autoscaling with Ray, ROCm 7.0.0, and SkyPilot
11 December 2025 - Accelerating llama.cpp on AMD Instinct MI300X
05 December 2025 - DGL in Depth: SE(3)-Transformer on ROCm 7
04 December 2025 - Modernizing Taichi Lang to LLVM 20 for MI355X GPU Acceleration
02 October 2025 - From Ingestion to Inference: RAG Pipelines on AMD GPUs
01 October 2025 - Enabling FlashInfer on ROCm for Accelerated LLM Serving
30 September 2025 - Coding Agents on AMD GPUs: Fast LLM Pipelines for Developers
10 September 2025 - Exploring Use Cases for Scalable AI: Implementing Ray with ROCm Support for Efficient ML Workflows
09 September 2025 - Llama.cpp Meets Instinct: A New Era of Open-Source AI Acceleration
20 August 2025 - DGL in the Real World: Running GNNs on Real Use Cases
24 April 2025 - Reinforcement Learning from Human Feedback on AMD GPUs with verl and ROCm Integration
08 January 2025 - Triton Inference Server with vLLM on AMD GPUs
Posts by Yaoming Mu
Posts by Yayuan Wang
09 March 2026 - Getting Started with ComfyUI on AMD Radeon™ RX 9000 Series GPUs
Posts by Yazhini Rajesh
Posts by Ye Hur Cheong
Posts by Yi Huang
09 March 2026 - Agentic Diagnosis for LLM Training at Scale
Posts by Yineng Zhang
Posts by Yixing Xu
02 January 2026 - SparK: Query-Aware Unstructured Sparsity with Recoverable KV Cache Channel Pruning
03 December 2025 - Týr-the-Pruner: Search-based Global Structural Pruning for LLMs
14 October 2025 - Gumiho: A New Paradigm for Speculative Decoding — Earlier Tokens in a Draft Sequence Matter More
09 September 2025 - Technical Dive into AMD’s MLPerf Inference v5.1 Submission
09 September 2025 - Slim Down Your Llama: Pruning & Fine-Tuning for Maximum Performance
09 September 2025 - Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission
Posts by Yosi Hatekar
23 October 2025 - STX-B0T: Real-time AI Robot Assistant Powered by RyzenAI and ROCm
Posts by Yu Geng
Posts by Yu Wang
17 November 2025 - AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs
17 November 2025 - AMD Enterprise AI Suite: Open Infrastructure for Production AI
12 March 2025 - AMD Advances Enterprise AI Through OPEA Integration
Posts by Yu Zhou
19 March 2026 - hipBLASLt Online GEMM Tuning
Posts by Yuankai Chen
23 February 2026 - Primus-Pipeline: A More Flexible and Scalable Pipeline Parallelism Implementation
16 December 2025 - MoE Training Best Practices on AMD GPUs
Posts by Yuchen Lin
30 January 2026 - Debugging NaN Results in CK Tile GEMM: A rocgdb Detective Story
Posts by Yue Liu
02 March 2026 - Streamlining Recommendation Model Training on AMD Instinct™ GPUs
Posts by Yuhang Song
09 February 2026 - Digital Twins on AMD: Building Robotic Simulations Using Edge AI PCs
Posts by Yusheng Su
21 November 2025 - LuminaSFT: Generating Synthetic Fine-Tuning Data for Small Language Models
25 September 2025 - Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs
15 July 2025 - Instella-T2I: Open-Source Text-to-Image with 1D Tokenizer and 32× Token Reduction on AMD GPUs
24 April 2025 - Reinforcement Learning from Human Feedback on AMD GPUs with verl and ROCm Integration
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Yutong Wu
12 November 2025 - Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X
Posts by Yuvarani Shankar
Posts by Yuzhen Zhou
25 September 2025 - Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs
Posts by Yuzhou Lu
09 February 2026 - Digital Twins on AMD: Building Robotic Simulations Using Edge AI PCs
Posts by Ze Wang
21 November 2025 - LuminaSFT: Generating Synthetic Fine-Tuning Data for Small Language Models
25 September 2025 - Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs
15 July 2025 - Instella-T2I: Open-Source Text-to-Image with 1D Tokenizer and 32× Token Reduction on AMD GPUs
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Zeping Li
03 December 2025 - Týr-the-Pruner: Search-based Global Structural Pruning for LLMs
Posts by Zhanghao Wu
13 November 2025 - Democratizing AI Compute with AMD Using SkyPilot
Posts by Zhao An
Posts by Zhao Lin
09 September 2025 - Technical Dive into AMD’s MLPerf Inference v5.1 Submission
Posts by Zhaodong Bing
08 December 2025 - Accelerating Autonomous Driving Model Training on AMD ROCm™ Software
Posts by Zhaofeng Zhang
17 February 2026 - Advanced MXFP4 Quantization: Combining Fine-Tuned Rotations with SmoothQuant for Near-Lossless Compression
29 October 2025 - High-Accuracy MXFP4, MXFP6, and Mixed-Precision Models on AMD GPUs
Posts by Zhe Li
07 January 2026 - Breaking the Accuracy-Speed Barrier: How MXFP4/6 Quantization Revolutionizes Image and Video Generation
09 September 2025 - Technical Dive into AMD’s MLPerf Inference v5.1 Submission
09 September 2025 - Slim Down Your Llama: Pruning & Fine-Tuning for Maximum Performance
09 September 2025 - Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission
Posts by Zhen Huang
16 December 2025 - MoE Training Best Practices on AMD GPUs
Posts by Zhenhua Liu
Posts by Zhenyu Gu
09 March 2026 - Agentic Diagnosis for LLM Training at Scale
23 February 2026 - Primus-Pipeline: A More Flexible and Scalable Pipeline Parallelism Implementation
16 December 2025 - MoE Training Best Practices on AMD GPUs
04 November 2025 - Stability at Scale: AMD’s Full‑Stack Platform for Large‑Model Training
Posts by Zhiquan Chen
Posts by Zhou Yu
19 March 2026 - hipBLASLt Online GEMM Tuning
Posts by Zhu Shan
Posts by Zicheng Liu
23 December 2025 - GEAK HIP: Expanding GEAK for HIP Code Optimization
06 December 2025 - Building a State-of-the-Art 32 Billion Reasoning Model with Only Synthetic Data on AMD GPUs
21 November 2025 - LuminaSFT: Generating Synthetic Fine-Tuning Data for Small Language Models
25 September 2025 - Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs
01 August 2025 - GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks
15 July 2025 - Instella-T2I: Open-Source Text-to-Image with 1D Tokenizer and 32× Token Reduction on AMD GPUs
24 April 2025 - Reinforcement Learning from Human Feedback on AMD GPUs with verl and ROCm Integration
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Ziqiong Liu
02 March 2026 - Streamlining Recommendation Model Training on AMD Instinct™ GPUs
23 December 2025 - GEAK HIP: Expanding GEAK for HIP Code Optimization
01 August 2025 - GEAK: Introducing Triton Kernel AI Agent & Evaluation Benchmarks
Posts by Zongheng Yang
13 November 2025 - Democratizing AI Compute with AMD Using SkyPilot