Posts by AMD Brevitas Team
Posts by AMD Quark Team
Posts by Aditya Bhattacharji
11 April 2025 - ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software
Posts by Aditya Kumar Singh
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Alessandro Fanfarillo
18 April 2024 - C++17 parallel algorithms and HIPSTDPAR
17 May 2023 - Register pressure in AMD CDNA™2 GPUs
Posts by Alex He
12 March 2025 - AMD Advances Enterprise AI Through OPEA Integration
13 February 2025 - Navigating vLLM Inference with ROCm and Kubernetes
Posts by Alex Voicu
18 April 2024 - C++17 parallel algorithms and HIPSTDPAR
Posts by Ammar Elwazir
Posts by Andy Luo
21 March 2025 - Supercharge DeepSeek-R1 Inference on AMD Instinct MI300X
21 February 2025 - Unlock DeepSeek-R1 Inference Performance on AMD Instinct™ MI300X GPU
Posts by Anshul Gupta
21 March 2025 - AITER: AI Tensor Engine For ROCm
14 March 2025 - Deploying Google’s Gemma 3 Model with vLLM on AMD Instinct™ MI300X GPUs: A Step-by-Step Guide
13 March 2025 - Optimized ROCm Docker for Distributed AI Training
06 February 2025 - GEMM Kernel Optimization For AMD GPUs
Posts by Anton Smirnov
16 April 2024 - Programming AMD GPUs with Julia
Posts by Asitav Mishra
13 May 2024 - Reading AMD GPU ISA
15 September 2023 - Jacobi Solver with HIP and OpenMP offloading
Posts by Babak Poursartip
28 February 2025 - Measuring Max-Achievable FLOPs – Part 2
Posts by Ben Sander
28 February 2025 - Measuring Max-Achievable FLOPs – Part 2
14 February 2025 - Understanding Peak, Max-Achievable & Delivered FLOPs, Part 1
Posts by Bill He
Posts by Bob Robey
26 April 2024 - Application portability with HIP
16 April 2024 - Affinity part 2 - System topology and controlling affinity
16 April 2024 - Affinity part 1 - Affinity, placement, and order
Posts by Brian Cornille
13 November 2024 - Introducing AMD’s Next-Gen Fortran Compiler
Posts by Brian Pickrell
08 January 2025 - Triton Inference Server with vLLM on AMD GPUs
Posts by Carlus Huang
21 March 2025 - AITER: AI Tensor Engine For ROCm
Posts by Chaitanya Manem
Posts by Chang Liu
24 March 2025 - Speculative Decoding - Deep Dive
Posts by Cheng Ling
Posts by Clint Greene
11 October 2024 - Enhancing vLLM Inference on AMD GPUs
09 October 2024 - Supercharging JAX with Triton Kernels on AMD GPUs
23 September 2024 - Fine-tuning Llama 3 with Axolotl using ROCm on AMD GPUs
19 September 2024 - Inferencing and serving with vLLM on AMD GPUs
01 May 2024 - Inferencing with Mixtral 8x22B on AMD GPUs
16 April 2024 - Speech-to-Text on an AMD GPU with Whisper
15 April 2024 - Developing Triton Kernels on AMD GPUs
04 April 2024 - Retrieval Augmented Generation (RAG) using LlamaIndex
26 January 2024 - Accelerating XGBoost with Dask using multiple AMD GPUs
Posts by Corbin Robeck
13 May 2024 - Reading AMD GPU ISA
Posts by Danny Guan
11 April 2025 - ROCm Gets Modular: Meet the Instinct Datacenter GPU Driver
Posts by David Doscher
26 January 2023 - AMD ROCm™ installation
Posts by David Li
Posts by Douglas Jia
15 October 2024 - Multinode Fine-Tuning of Stable Diffusion XL on AMD GPUs with Hugging Face Accelerate and OCI’s Kubernetes Engine (OKE)
03 October 2024 - Leaner LLM Inference with INT8 Quantization on AMD GPUs using PyTorch
06 September 2024 - Optimize GPT Training: Enabling Mixed Precision Training in JAX using ROCm on AMD GPUs
22 July 2024 - Using statistical methods to reliably compare algorithm performance in large generative AI models with JAX Profiler on AMD GPUs
02 July 2024 - A Guide to Implementing and Training Generative Pre-trained Transformers (GPT) in JAX on AMD GPUs
17 April 2024 - Inferencing with AI2’s OLMo model on AMD GPU
11 April 2024 - GPU Unleashed: Training Reinforcement Learning Agents with Stable Baselines3 on an AMD GPU in Gymnasium Environment
23 February 2024 - Efficient image generation with Stable Diffusion models and ONNX Runtime using AMD GPUs
25 January 2024 - LLM distributed supervised fine-tuning with JAX
Posts by Ean Garvey
Posts by Eduardo Alvarez
Posts by Eliot Li
08 January 2025 - Triton Inference Server with vLLM on AMD GPUs
28 August 2024 - Benchmarking Machine Learning using ROCm and AMD GPUs: Reproducing Our MLPerf Inference Submission
09 August 2024 - Inferencing with Grok-1 on AMD GPUs
04 April 2024 - Image classification using Vision Transformer with AMD GPUs
01 April 2024 - Scale AI applications with Ray
Posts by Emad Barsoum
28 April 2025 - Beyond Text: Accelerating Multimodal AI Inference with Speculative Decoding on AMD Instinct™ MI300X GPUs
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
31 January 2025 - Enhancing AI Training with AMD ROCm Software
Posts by Evan Masters
28 February 2025 - Measuring Max-Achievable FLOPs – Part 2
Posts by Fabricio Flores
19 February 2025 - Fine-tuning Phi-3.5-mini LLM at scale: Harnessing Accelerate and Slurm for multinode training
08 January 2025 - Triton Inference Server with vLLM on AMD GPUs
24 October 2024 - Torchtune on AMD GPUs How-To Guide: Fine-tuning and Scaling LLMs with Multi-GPU Power
29 July 2024 - Optimizing RoBERTa: Fine-Tuning with Mixed Precision on AMD
01 May 2024 - Step-by-Step Guide to Use OpenLLM on AMD GPUs
04 April 2024 - Building semantic search with SentenceTransformers on AMD
Posts by Farshad Ghodsian
11 April 2025 - ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software
28 March 2025 - What’s New in the AMD GPU Operator v1.2.0 Release
29 January 2025 - Announcing the AMD GPU Operator and Metrics Exporter
Posts by Ganesh Dasika
09 February 2025 - Deep dive into the MI300 compute and memory partition modes
Posts by Garrett Byrd
14 April 2025 - Installing ROCm from source with Spack
Posts by George Markomanolis
16 April 2024 - Affinity part 2 - System topology and controlling affinity
16 April 2024 - Affinity part 1 - Affinity, placement, and order
Posts by George Wang
06 February 2025 - GEMM Kernel Optimization For AMD GPUs
Posts by Gilbert Lee
Posts by Gina Sitaraman
26 April 2024 - Application portability with HIP
16 April 2024 - Affinity part 2 - System topology and controlling affinity
16 April 2024 - Affinity part 1 - Affinity, placement, and order
12 April 2023 - Introduction to profiling tools for AMD hardware
09 March 2023 - AMD Instinct™ MI200 GPU memory space overview
14 November 2022 - AMD matrix cores
Posts by Gowtham Ramesh
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Hai Xiao
21 March 2025 - Supercharge DeepSeek-R1 Inference on AMD Instinct MI300X
Posts by Henry Ho
28 February 2025 - Measuring Max-Achievable FLOPs – Part 2
Posts by Hui Liu
Posts by Jassani Adeem
28 June 2024 - Mamba on AMD GPUs with ROCm
Posts by Jayacharan Kolla
11 April 2025 - ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software
Posts by Jeremy Arnold
Posts by Jialian Wu
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Jiang Liu
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Johanna Potyka
13 November 2024 - Introducing AMD’s Next-Gen Fortran Compiler
Posts by Joseph Schoonover
14 April 2025 - Installing ROCm from source with Spack
Posts by Justin Chang
09 February 2025 - MI300A - Exploring the APU advantage
13 November 2024 - Introducing AMD’s Next-Gen Fortran Compiler
29 August 2024 - Seismic stencil codes - part 3
29 August 2024 - Seismic stencil codes - part 2
29 August 2024 - Seismic stencil codes - part 1
15 September 2023 - Jacobi Solver with HIP and OpenMP offloading
18 July 2023 - Finite difference method - Laplacian part 4
11 May 2023 - Finite difference method - Laplacian part 3
04 January 2023 - Finite difference method - Laplacian part 2
14 November 2022 - Finite difference method - Laplacian part 1
Posts by Karan Verma
Posts by Karthik Sangaiah
09 February 2025 - Deep dive into the MI300 compute and memory partition modes
Posts by Kumar Deepak
Posts by Lei Shao
09 August 2024 - Inferencing with Grok-1 on AMD GPUs
Posts by Lingpeng Jin
21 March 2025 - AITER: AI Tensor Engine For ROCm
Posts by Liz Li
21 March 2025 - Supercharge DeepSeek-R1 Inference on AMD Instinct MI300X
21 March 2025 - AITER: AI Tensor Engine For ROCm
Posts by Logan Grado
03 July 2024 - Accelerating models on ROCm using PyTorch TunableOp
09 April 2024 - ResNet for image classification using AMD GPUs
01 April 2024 - Scale AI applications with Ray
29 March 2024 - Automatic mixed precision in PyTorch using AMD GPUs
Posts by Luise Chen
09 August 2024 - Inferencing with Grok-1 on AMD GPUs
Posts by Mahdi Ghodsi
Posts by Mahdieh Ghazimirsaeed
08 June 2023 - GPU-aware MPI with ROCm
Posts by Marco Grond
11 April 2025 - ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software
Posts by Maria Ruiz Varela
26 April 2024 - Application portability with HIP
09 March 2023 - AMD Instinct™ MI200 GPU memory space overview
Posts by Martin Huarte
Posts by Matt Elliott
21 February 2025 - How to Build a vLLM Container for Inference and Benchmarking
29 January 2025 - Announcing the AMD GPU Operator and Metrics Exporter
17 September 2024 - Getting to Know Your GPU: A Deep Dive into AMD SMI
Posts by Meena Arunachalam
Posts by Michael Klemm
13 November 2024 - Introducing AMD’s Next-Gen Fortran Compiler
Posts by Michael Zhang
13 November 2024 - SGLang: Fast Serving Framework for Large Language and Vision-Language Models on AMD Instinct GPUs
24 October 2024 - CTranslate2: Efficient Inference with Transformer Models on AMD GPUs
Posts by Miro Hodak
Posts by Mohammad Mahdi Kamani
Posts by Moskvichev Arseny
28 June 2024 - Mamba on AMD GPUs with ROCm
Posts by Muhammad Osama
09 February 2025 - Deep dive into the MI300 compute and memory partition modes
29 July 2024 - Graph analytics on AMD GPUs using Gunrock
Posts by Nicholas Curtis
17 May 2023 - Register pressure in AMD CDNA™2 GPUs
Posts by Nicholas Malaya
14 November 2022 - AMD matrix cores
Posts by Ning Zhang
06 February 2025 - GEMM Kernel Optimization For AMD GPUs
Posts by No author
Posts by Noah Wolfe
12 April 2023 - Introduction to profiling tools for AMD hardware
Posts by Ossian O’Reilly
29 August 2024 - Seismic stencil codes - part 3
29 August 2024 - Seismic stencil codes - part 2
29 August 2024 - Seismic stencil codes - part 1
11 May 2023 - Finite difference method - Laplacian part 3
04 January 2023 - Finite difference method - Laplacian part 2
14 November 2022 - Finite difference method - Laplacian part 1
14 November 2022 - AMD matrix cores
Posts by Parsa Fashi
Posts by Paul Mullowney
03 November 2023 - Sparse matrix vector multiplication - part 1
Posts by Pedram Alizadeh
Posts by Peng Sun
21 March 2025 - Supercharge DeepSeek-R1 Inference on AMD Instinct MI300X
Posts by Phillip Dang
11 July 2024 - DBRX Instruct on AMD GPUs
28 June 2024 - Deep Learning Recommendation Models on AMD GPUs
26 April 2024 - Table Question-Answering with TaPas
26 April 2024 - Multimodal (Visual and Language) understanding with LLaVA-NeXT
16 April 2024 - Text Summarization with FLAN-T5
16 April 2024 - Program Synthesis with CodeGen
08 April 2024 - Small language models with Phi-2
04 April 2024 - Using the ChatGLM-6B bilingual language model with AMD GPUs
12 March 2024 - Building a decoder transformer model on AMD GPU(s)
11 March 2024 - Question-answering Chatbot with LangChain on an AMD GPU
08 March 2024 - Music Generation With MusicGen on an AMD GPU
08 February 2024 - Simplifying deep learning: A guide to PyTorch Lightning
Posts by Poovaiah Palangappa
Posts by Prakamya Mishra
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Pratik Prabhanjan Brahma
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Quentin Anthony
10 December 2024 - Training Transformers and Hybrid models on AMD Instinct MI300X Accelerators
Posts by Rajat Arora
15 September 2023 - Jacobi Solver with HIP and OpenMP offloading
11 May 2023 - Finite difference method - Laplacian part 3
09 March 2023 - AMD Instinct™ MI200 GPU memory space overview
04 January 2023 - Finite difference method - Laplacian part 2
14 November 2022 - Finite difference method - Laplacian part 1
Posts by Rajneesh Bhardwaj
09 February 2025 - Deep dive into the MI300 compute and memory partition modes
Posts by Rathnakara Malatesha
25 February 2025 - Deploying Serverless AI Inference on AMD GPU Clusters
Posts by Rene Van Oostrum
14 November 2022 - AMD matrix cores
Posts by Rishi Madduri
Posts by Ronnie Chatterjee
11 April 2025 - ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software
Posts by Ryan Swann
09 February 2025 - Deep dive into the MI300 compute and memory partition modes
Posts by Saad Rahim
11 April 2025 - ROCm Gets Modular: Meet the Instinct Datacenter GPU Driver
11 April 2025 - ROCm 6.4: Breaking Barriers in AI, HPC, and Modular GPU Software
Posts by Sean Miller
18 July 2023 - Finite difference method - Laplacian part 4
11 May 2023 - Finite difference method - Laplacian part 3
09 March 2023 - AMD Instinct™ MI200 GPU memory space overview
04 January 2023 - Finite difference method - Laplacian part 2
14 November 2022 - Finite difference method - Laplacian part 1
Posts by Sean Song
09 February 2025 - PyTorch Fully Sharded Data Parallel (FSDP) on AMD GPUs with ROCm
24 January 2025 - Vision Mamba on AMD GPU with ROCm
01 November 2024 - Distributed Data Parallel Training on AMD GPU with ROCm
23 October 2024 - Inference with Llama 3.2 Vision LLMs on AMD GPUs Using ROCm
28 June 2024 - Mamba on AMD GPUs with ROCm
04 June 2024 - Segment Anything with AMD GPUs
15 April 2024 - Enhancing LLM Accessibility: A Deep Dive into QLoRA Through Fine-tuning Llama Model on a single AMD GPU
15 April 2024 - Enhancing LLM Accessibility: A Deep Dive into QLoRA Through Fine-tuning Llama 2 on a single AMD GPU
05 February 2024 - Using LoRA for efficient fine-tuning: Fundamental principles
Posts by Seungrok Jung
21 March 2025 - Supercharge DeepSeek-R1 Inference on AMD Instinct MI300X
15 March 2024 - Large language model inference optimizations on AMD GPUs
Posts by Shekhar Pandey
21 March 2025 - AITER: AI Tensor Engine For ROCm
Posts by Shenrun Zhang
Posts by Sonali Singh
09 February 2025 - Deep dive into the MI300 compute and memory partition modes
Posts by Sudhanshu Ranjan
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Suyash Tandon
09 February 2025 - MI300A - Exploring the APU advantage
26 April 2024 - Application portability with HIP
12 April 2023 - Introduction to profiling tools for AMD hardware
Posts by Ted Themistokleous
08 January 2025 - Triton Inference Server with vLLM on AMD GPUs
Posts by Thomas Gibson
29 July 2024 - Graph analytics on AMD GPUs using Gunrock
18 July 2023 - Finite difference method - Laplacian part 4
11 May 2023 - Finite difference method - Laplacian part 3
12 April 2023 - Introduction to profiling tools for AMD hardware
04 January 2023 - Finite difference method - Laplacian part 2
14 November 2022 - Finite difference method - Laplacian part 1
Posts by Tiffany Mintz
08 January 2025 - Triton Inference Server with vLLM on AMD GPUs
Posts by Vara Lakshmi Bayanagari
28 January 2025 - Distributed fine-tuning of MPT-30B using Composer on AMD GPUs
03 December 2024 - Transformer based Encoder-Decoder models for image-captioning on AMD GPUs
13 November 2024 - Quantized 8-bit LLM training and inference using bitsandbytes on AMD GPUs
15 October 2024 - Speed Up Text Generation with Speculative Sampling on AMD GPUs
03 September 2024 - Image Classification with BEiT, MobileNet, and EfficientNet using ROCm on AMD GPUs
16 April 2024 - PyTorch C++ Extension on AMD GPU
04 April 2024 - Total body segmentation using MONAI Deploy on an AMD GPU
07 February 2024 - Two-dimensional images to three-dimensional scene mapping using NeRF on an AMD GPU
29 January 2024 - Pre-training BERT using Hugging Face & TensorFlow on an AMD GPU
26 January 2024 - Pre-training BERT using Hugging Face & PyTorch on an AMD GPU
Posts by Vicky Tsang
24 April 2025 - Reinforcement Learning from Human Feedback on AMD GPUs with verl and ROCm Integration
01 April 2024 - Scale AI applications with Ray
Posts by Victor Robles
14 February 2025 - AI Inference Orchestration with Kubernetes on Instinct MI300X, Part 2
07 February 2025 - AI Inference Orchestration with Kubernetes on Instinct MI300X, Part 1
Posts by Vikram Appia
Posts by Vish Vadlamani
08 January 2025 - Triton Inference Server with vLLM on AMD GPUs
Posts by Wei-Ting Liao
Posts by Xiaodong Yu
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Ximeng Sun
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Yao Fehlis
11 September 2023 - Creating a PyTorch/TensorFlow code environment on AMD GPUs
Posts by Yao Fu
13 March 2025 - Optimized ROCm Docker for Distributed AI Training
Posts by Yao Liu
24 April 2025 - Reinforcement Learning from Human Feedback on AMD GPUs with verl and ROCm Integration
08 January 2025 - Triton Inference Server with vLLM on AMD GPUs
Posts by Yineng Zhang
Posts by Yu Wang
12 March 2025 - AMD Advances Enterprise AI Through OPEA Integration
Posts by Yusheng Su
24 April 2025 - Reinforcement Learning from Human Feedback on AMD GPUs with verl and ROCm Integration
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Ze Wang
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model
Posts by Zicheng Liu
24 April 2025 - Reinforcement Learning from Human Feedback on AMD GPUs with verl and ROCm Integration
07 March 2025 - Instella-VL-1B: First AMD Vision Language Model