Posts by AMD Brevitas Team
Posts by AMD Quark Team
Posts by Aditya Bhattacharji
11 April 2025 - ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software
Posts by Aditya Kumar Singh
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Alessandro Fanfarillo
18 April 2024 - C++17 parallel algorithms and HIPSTDPAR
17 May 2023 - Register pressure in AMD CDNA™2 GPUs
Posts by Alex He
12 March 2025 - AMD Advances Enterprise AI Through OPEA Integration
13 February 2025 - Navigating vLLM Inference with ROCm and Kubernetes
Posts by Alex Voicu
18 April 2024 - C++17 parallel algorithms and HIPSTDPAR
Posts by Ammar Elwazir
Posts by Andy Luo
21 March 2025 - Supercharge DeepSeek-R1 Inference on AMD Instinct MI300X
21 February 2025 - Unlock DeepSeek-R1 Inference Performance on AMD Instinct™ MI300X GPU
Posts by Anshul Gupta
06 May 2025 - Unleash Full GPU Potential: Overlap Communication and Computation with Triton-Distributed
21 March 2025 - AITER: AI Tensor Engine For ROCm
14 March 2025 - Deploying Google’s Gemma 3 Model with vLLM on AMD Instinct™ MI300X GPUs: A Step-by-Step Guide
13 March 2025 - Optimized ROCm Docker for Distributed AI Training
06 February 2025 - GEMM Kernel Optimization For AMD GPUs
Posts by Anton Smirnov
16 April 2024 - Programming AMD GPUs with Julia
Posts by Asitav Mishra
13 May 2024 - Reading AMD GPU ISA
15 September 2023 - Jacobi Solver with HIP and OpenMP offloading
Posts by Babak Poursartip
28 February 2025 - Measuring Max-Achievable FLOPs – Part 2
Posts by Ben Sander
28 February 2025 - Measuring Max-Achievable FLOPs – Part 2
14 February 2025 - Understanding Peak, Max-Achievable & Delivered FLOPs, Part 1
Posts by Bill He
Posts by Bob Robey
26 April 2024 - Application portability with HIP
16 April 2024 - Affinity part 2 - System topology and controlling affinity
16 April 2024 - Affinity part 1 - Affinity, placement, and order
Posts by Brian Cornille
13 November 2024 - Introducing AMD’s Next-Gen Fortran Compiler
Posts by Brian Pickrell
08 January 2025 - Triton Inference Server with vLLM on AMD GPUs
Posts by Bruce Xue
Posts by Carlus Huang
21 March 2025 - AITER: AI Tensor Engine For ROCm
Posts by Chaitanya Manem
Posts by Chang Liu
24 March 2025 - Speculative Decoding - Deep Dive
Posts by Cheng Ling
Posts by Christophe Paquot
28 May 2025 - HIP 7.0 Is Coming: What You Need to Know to Stay Ahead
Posts by Clint Greene
11 October 2024 - Enhancing vLLM Inference on AMD GPUs
09 October 2024 - Supercharging JAX with Triton Kernels on AMD GPUs
23 September 2024 - Fine-tuning Llama 3 with Axolotl using ROCm on AMD GPUs
19 September 2024 - Inferencing and serving with vLLM on AMD GPUs
01 May 2024 - Inferencing with Mixtral 8x22B on AMD GPUs
16 April 2024 - Speech-to-Text on an AMD GPU with Whisper
15 April 2024 - Developing Triton Kernels on AMD GPUs
04 April 2024 - Retrieval Augmented Generation (RAG) using LlamaIndex
26 January 2024 - Accelerating XGBoost with Dask using multiple AMD GPUs
Posts by Corbin Robeck
13 May 2024 - Reading AMD GPU ISA
Posts by Danny Guan
11 April 2025 - ROCm Gets Modular: Meet the Instinct Datacenter GPU Driver
Posts by David Doscher
26 January 2023 - AMD ROCm™ installation
Posts by David Li
Posts by Denny Iriawan
28 May 2025 - HIP 7.0 Is Coming: What You Need to Know to Stay Ahead
Posts by Douglas Hamilton
22 May 2025 - ROCm Runfile Installer Is Here!
Posts by Douglas Jia
15 October 2024 - Multinode Fine-Tuning of Stable Diffusion XL on AMD GPUs with Hugging Face Accelerate and OCI’s Kubernetes Engine (OKE)
03 October 2024 - Leaner LLM Inference with INT8 Quantization on AMD GPUs using PyTorch
06 September 2024 - Optimize GPT Training: Enabling Mixed Precision Training in JAX using ROCm on AMD GPUs
22 July 2024 - Using statistical methods to reliably compare algorithm performance in large generative AI models with JAX Profiler on AMD GPUs
02 July 2024 - A Guide to Implementing and Training Generative Pre-trained Transformers (GPT) in JAX on AMD GPUs
17 April 2024 - Inferencing with AI2’s OLMo model on AMD GPU
11 April 2024 - GPU Unleashed: Training Reinforcement Learning Agents with Stable Baselines3 on an AMD GPU in Gymnasium Environment
23 February 2024 - Efficient image generation with Stable Diffusion models and ONNX Runtime using AMD GPUs
25 January 2024 - LLM distributed supervised fine-tuning with JAX
Posts by Ean Garvey
Posts by Eduardo Alvarez
Posts by Eliot Li
30 May 2025 - Scale LLM Inference with Multi-Node Infrastructure
08 January 2025 - Triton Inference Server with vLLM on AMD GPUs
28 August 2024 - Benchmarking Machine Learning using ROCm and AMD GPUs: Reproducing Our MLPerf Inference Submission
09 August 2024 - Inferencing with Grok-1 on AMD GPUs
04 April 2024 - Image classification using Vision Transformer with AMD GPUs
01 April 2024 - Scale AI applications with Ray
Posts by Emad Barsoum
28 April 2025 - Beyond Text: Accelerating Multimodal AI Inference with Speculative Decoding on AMD Instinct™ MI300X GPUs
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
31 January 2025 - Enhancing AI Training with AMD ROCm Software
Posts by Evan Masters
28 February 2025 - Measuring Max-Achievable FLOPs – Part 2
Posts by Fabricio Flores
06 May 2025 - CuPy and hipDF on AMD: The Basics and Beyond
19 February 2025 - Fine-tuning Phi-3.5-mini LLM at scale: Harnessing Accelerate and Slurm for multinode training
08 January 2025 - Triton Inference Server with vLLM on AMD GPUs
24 October 2024 - Torchtune on AMD GPUs How-To Guide: Fine-tuning and Scaling LLMs with Multi-GPU Power
29 July 2024 - Optimizing RoBERTa: Fine-Tuning with Mixed Precision on AMD
01 May 2024 - Step-by-Step Guide to Use OpenLLM on AMD GPUs
04 April 2024 - Building semantic search with SentenceTransformers on AMD
Posts by Fan Wu
Posts by Farshad Ghodsian
11 April 2025 - ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software
28 March 2025 - What’s New in the AMD GPU Operator v1.2.0 Release
29 January 2025 - Announcing the AMD GPU Operator and Metrics Exporter
Posts by Ganesh Dasika
09 February 2025 - Deep dive into the MI300 compute and memory partition modes
Posts by Garrett Byrd
14 April 2025 - Installing ROCm from source with Spack
Posts by George Markomanolis
16 April 2024 - Affinity part 2 - System topology and controlling affinity
16 April 2024 - Affinity part 1 - Affinity, placement, and order
Posts by George Wang
06 May 2025 - Unleash Full GPU Potential: Overlap Communication and Computation with Triton-Distributed
06 February 2025 - GEMM Kernel Optimization For AMD GPUs
Posts by Gilbert Lee
Posts by Gina Sitaraman
26 April 2024 - Application portability with HIP
16 April 2024 - Affinity part 2 - System topology and controlling affinity
16 April 2024 - Affinity part 1 - Affinity, placement, and order
12 April 2023 - Introduction to profiling tools for AMD hardware
09 March 2023 - AMD Instinct™ MI200 GPU memory space overview
14 November 2022 - AMD matrix cores
Posts by Giuseppe Franco
Posts by Gowtham Ramesh
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Hai Xiao
21 March 2025 - Supercharge DeepSeek-R1 Inference on AMD Instinct MI300X
Posts by Haocong Wang
Posts by Henry Ho
28 February 2025 - Measuring Max-Achievable FLOPs – Part 2
Posts by Hui Liu
Posts by Jassani Adeem
28 June 2024 - Mamba on AMD GPUs with ROCm
Posts by Jayacharan Kolla
11 April 2025 - ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software
Posts by Jeremy Arnold
Posts by Jialian Wu
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Jiang Liu
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Joe Shajrawi
Posts by Johanna Potyka
13 November 2024 - Introducing AMD’s Next-Gen Fortran Compiler
Posts by Jorge Parada
30 May 2025 - Scale LLM Inference with Multi-Node Infrastructure
Posts by Joseph Schoonover
14 April 2025 - Installing ROCm from source with Spack
Posts by Julia Jiang
28 May 2025 - HIP 7.0 Is Coming: What You Need to Know to Stay Ahead
Posts by Justin Chang
09 February 2025 - MI300A - Exploring the APU advantage
13 November 2024 - Introducing AMD’s Next-Gen Fortran Compiler
29 August 2024 - Seismic stencil codes - part 3
29 August 2024 - Seismic stencil codes - part 2
29 August 2024 - Seismic stencil codes - part 1
15 September 2023 - Jacobi Solver with HIP and OpenMP offloading
18 July 2023 - Finite difference method - Laplacian part 4
11 May 2023 - Finite difference method - Laplacian part 3
04 January 2023 - Finite difference method - Laplacian part 2
14 November 2022 - Finite difference method - Laplacian part 1
Posts by Karan Verma
Posts by Karthik Sangaiah
09 February 2025 - Deep dive into the MI300 compute and memory partition modes
Posts by Kenny Roche
Posts by Kevin Chang
Posts by Kumar Deepak
Posts by Kyle Wang
Posts by Lei Shao
09 August 2024 - Inferencing with Grok-1 on AMD GPUs
Posts by Lei Zhang
Posts by Liam Berry
22 May 2025 - ROCm Runfile Installer Is Here!
Posts by Lingpeng Jin
21 March 2025 - AITER: AI Tensor Engine For ROCm
Posts by Liz Li
21 March 2025 - Supercharge DeepSeek-R1 Inference on AMD Instinct MI300X
21 March 2025 - AITER: AI Tensor Engine For ROCm
Posts by Logan Grado
03 July 2024 - Accelerating models on ROCm using PyTorch TunableOp
09 April 2024 - ResNet for image classification using AMD GPUs
01 April 2024 - Scale AI applications with Ray
29 March 2024 - Automatic mixed precision in PyTorch using AMD GPUs
Posts by Luise Chen
09 August 2024 - Inferencing with Grok-1 on AMD GPUs
Posts by Mahdi Ghodsi
Posts by Mahdieh Ghazimirsaeed
08 June 2023 - GPU-aware MPI with ROCm
Posts by Marco Grond
11 April 2025 - ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software
Posts by Maria Ruiz Varela
26 April 2024 - Application portability with HIP
09 March 2023 - AMD Instinct™ MI200 GPU memory space overview
Posts by Martin Huarte
Posts by Matt Elliott
21 February 2025 - How to Build a vLLM Container for Inference and Benchmarking
29 January 2025 - Announcing the AMD GPU Operator and Metrics Exporter
17 September 2024 - Getting to Know Your GPU: A Deep Dive into AMD SMI
Posts by Meena Arunachalam
Posts by Michael Klemm
13 November 2024 - Introducing AMD’s Next-Gen Fortran Compiler
Posts by Michael Zhang
13 November 2024 - SGLang: Fast Serving Framework for Large Language and Vision-Language Models on AMD Instinct GPUs
24 October 2024 - CTranslate2: Efficient Inference with Transformer Models on AMD GPUs
Posts by Miro Hodak
Posts by Mohammad Mahdi Kamani
Posts by Moskvichev Arseny
28 June 2024 - Mamba on AMD GPUs with ROCm
Posts by Muhammad Osama
09 February 2025 - Deep dive into the MI300 compute and memory partition modes
29 July 2024 - Graph analytics on AMD GPUs using Gunrock
Posts by Nicholas Curtis
17 May 2023 - Register pressure in AMD CDNA™2 GPUs
Posts by Nicholas Malaya
14 November 2022 - AMD matrix cores
Posts by Ning Zhang
06 February 2025 - GEMM Kernel Optimization For AMD GPUs
Posts by No author
Posts by Noah Wolfe
12 April 2023 - Introduction to profiling tools for AMD hardware
Posts by Ossian O’Reilly
29 August 2024 - Seismic stencil codes - part 3
29 August 2024 - Seismic stencil codes - part 2
29 August 2024 - Seismic stencil codes - part 1
11 May 2023 - Finite difference method - Laplacian part 3
04 January 2023 - Finite difference method - Laplacian part 2
14 November 2022 - Finite difference method - Laplacian part 1
14 November 2022 - AMD matrix cores
Posts by Parsa Fashi
Posts by Paul Mullowney
03 November 2023 - Sparse matrix vector multiplication - part 1
Posts by Pedram Alizadeh
Posts by Peng Sun
06 May 2025 - Unleash Full GPU Potential: Overlap Communication and Computation with Triton-Distributed
21 March 2025 - Supercharge DeepSeek-R1 Inference on AMD Instinct MI300X
Posts by Phillip Dang
11 July 2024 - DBRX Instruct on AMD GPUs
28 June 2024 - Deep Learning Recommendation Models on AMD GPUs
26 April 2024 - Table Question-Answering with TaPas
26 April 2024 - Multimodal (Visual and Language) understanding with LLaVA-NeXT
16 April 2024 - Text Summarization with FLAN-T5
16 April 2024 - Program Synthesis with CodeGen
08 April 2024 - Small language models with Phi-2
04 April 2024 - Using the ChatGLM-6B bilingual language model with AMD GPUs
12 March 2024 - Building a decoder transformer model on AMD GPU(s)
11 March 2024 - Question-answering Chatbot with LangChain on an AMD GPU
08 March 2024 - Music Generation With MusicGen on an AMD GPU
08 February 2024 - Simplifying deep learning: A guide to PyTorch Lightning
Posts by Poovaiah Palangappa
Posts by Prakamya Mishra
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Pratik Prabhanjan Brahma
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Quentin Anthony
10 December 2024 - Training Transformers and Hybrid models on AMD Instinct MI300X Accelerators
Posts by Rajat Arora
15 September 2023 - Jacobi Solver with HIP and OpenMP offloading
11 May 2023 - Finite difference method - Laplacian part 3
09 March 2023 - AMD Instinct™ MI200 GPU memory space overview
04 January 2023 - Finite difference method - Laplacian part 2
14 November 2022 - Finite difference method - Laplacian part 1
Posts by Rajneesh Bhardwaj
09 February 2025 - Deep dive into the MI300 compute and memory partition modes
Posts by Rathnakara Malatesha
25 February 2025 - Deploying Serverless AI Inference on AMD GPU Clusters
Posts by Rene Van Oostrum
14 November 2022 - AMD matrix cores
Posts by Rishi Madduri
Posts by Ronnie Chatterjee
11 April 2025 - ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software
Posts by Ryan Swann
09 February 2025 - Deep dive into the MI300 compute and memory partition modes
Posts by Saad Rahim
28 May 2025 - HIP 7.0 Is Coming: What You Need to Know to Stay Ahead
22 May 2025 - ROCm Runfile Installer Is Here!
11 April 2025 - ROCm Gets Modular: Meet the Instinct Datacenter GPU Driver
11 April 2025 - ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software
Posts by Sean Miller
18 July 2023 - Finite difference method - Laplacian part 4
11 May 2023 - Finite difference method - Laplacian part 3
09 March 2023 - AMD Instinct™ MI200 GPU memory space overview
04 January 2023 - Finite difference method - Laplacian part 2
14 November 2022 - Finite difference method - Laplacian part 1
Posts by Sean Song
09 February 2025 - PyTorch Fully Sharded Data Parallel (FSDP) on AMD GPUs with ROCm
24 January 2025 - Vision Mamba on AMD GPU with ROCm
01 November 2024 - Distributed Data Parallel Training on AMD GPU with ROCm
23 October 2024 - Inference with Llama 3.2 Vision LLMs on AMD GPUs Using ROCm
28 June 2024 - Mamba on AMD GPUs with ROCm
04 June 2024 - Segment Anything with AMD GPUs
15 April 2024 - Enhancing LLM Accessibility: A Deep Dive into QLoRA Through Fine-tuning Llama Model on a single AMD GPU
15 April 2024 - Enhancing LLM Accessibility: A Deep Dive into QLoRA Through Fine-tuning Llama 2 on a single AMD GPU
05 February 2024 - Using LoRA for efficient fine-tuning: Fundamental principles
Posts by Seungrok Jung
21 March 2025 - Supercharge DeepSeek-R1 Inference on AMD Instinct MI300X
15 March 2024 - Large language model inference optimizations on AMD GPUs
Posts by Shekhar Pandey
21 March 2025 - AITER: AI Tensor Engine For ROCm
Posts by Shenrun Zhang
Posts by Sonali Singh
09 February 2025 - Deep dive into the MI300 compute and memory partition modes
Posts by Sudhanshu Ranjan
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Suyash Tandon
09 February 2025 - MI300A - Exploring the APU advantage
26 April 2024 - Application portability with HIP
12 April 2023 - Introduction to profiling tools for AMD hardware
Posts by Ted Themistokleous
08 January 2025 - Triton Inference Server with vLLM on AMD GPUs
Posts by Thomas Gibson
29 July 2024 - Graph analytics on AMD GPUs using Gunrock
18 July 2023 - Finite difference method - Laplacian part 4
11 May 2023 - Finite difference method - Laplacian part 3
12 April 2023 - Introduction to profiling tools for AMD hardware
04 January 2023 - Finite difference method - Laplacian part 2
14 November 2022 - Finite difference method - Laplacian part 1
Posts by Tiffany Mintz
08 January 2025 - Triton Inference Server with vLLM on AMD GPUs
Posts by Vara Lakshmi Bayanagari
28 January 2025 - Distributed fine-tuning of MPT-30B using Composer on AMD GPUs
03 December 2024 - Transformer based Encoder-Decoder models for image-captioning on AMD GPUs
13 November 2024 - Quantized 8-bit LLM training and inference using bitsandbytes on AMD GPUs
15 October 2024 - Speed Up Text Generation with Speculative Sampling on AMD GPUs
03 September 2024 - Image Classification with BEiT, MobileNet, and EfficientNet using ROCm on AMD GPUs
16 April 2024 - PyTorch C++ Extension on AMD GPU
04 April 2024 - Total body segmentation using MONAI Deploy on an AMD GPU
07 February 2024 - Two-dimensional images to three-dimensional scene mapping using NeRF on an AMD GPU
29 January 2024 - Pre-training BERT using Hugging Face & TensorFlow on an AMD GPU
26 January 2024 - Pre-training BERT using Hugging Face & PyTorch on an AMD GPU
Posts by Vicky Tsang
24 April 2025 - Reinforcement Learning from Human Feedback on AMD GPUs with verl and ROCm Integration
01 April 2024 - Scale AI applications with Ray
Posts by Victor Robles
14 February 2025 - AI Inference Orchestration with Kubernetes on Instinct MI300X, Part 2
07 February 2025 - AI Inference Orchestration with Kubernetes on Instinct MI300X, Part 1
Posts by Vikram Appia
Posts by Vish Vadlamani
08 January 2025 - Triton Inference Server with vLLM on AMD GPUs
Posts by Wei Cai
Posts by Wei-Ting Liao
Posts by Xiaodong Yu
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Ximeng Sun
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Yao Fehlis
11 September 2023 - Creating a PyTorch/TensorFlow code environment on AMD GPUs
Posts by Yao Fu
13 March 2025 - Optimized ROCm Docker for Distributed AI Training
Posts by Yao Liu
24 April 2025 - Reinforcement Learning from Human Feedback on AMD GPUs with verl and ROCm Integration
08 January 2025 - Triton Inference Server with vLLM on AMD GPUs
Posts by Yineng Zhang
Posts by Yu Wang
12 March 2025 - AMD Advances Enterprise AI Through OPEA Integration
Posts by Yusheng Su
24 April 2025 - Reinforcement Learning from Human Feedback on AMD GPUs with verl and ROCm Integration
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Ze Wang
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Zicheng Liu
24 April 2025 - Reinforcement Learning from Human Feedback on AMD GPUs with verl and ROCm Integration
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model