Posts tagged PyTorch

DBRX Instruct on AMD GPUs

In this blog, we showcase DBRX Instruct, a mixture-of-experts large language model developed by Databricks, on a ROCm-capable system with AMD GPUs.

Read more ...


Accelerate PyTorch Models using torch.compile on AMD GPUs with ROCm

PyTorch 2.0 introduces torch.compile(), a tool that accelerates PyTorch code and models by converting them into optimized kernels, with minimal changes to the existing codebase. It can be applied to individual functions, entire modules, and complex training loops.
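As a rough illustration of the usage described above, the sketch below wraps a plain function with torch.compile and compares it against eager execution. The function name `gelu_tanh`, the tensor size, and the tolerance are assumptions made for this example and are not taken from the linked post. ROCm builds of PyTorch expose AMD GPUs through the "cuda" device name, and the default compile backend needs a working compiler toolchain.

```python
import math

import torch


def gelu_tanh(x: torch.Tensor) -> torch.Tensor:
    # Tanh approximation of GELU written as plain element-wise ops
    # (hypothetical example function, chosen because it fuses well).
    c = math.sqrt(2.0 / math.pi)
    return 0.5 * x * (1.0 + torch.tanh(c * (x + 0.044715 * x**3)))


# Wrapping is cheap; the actual compilation happens lazily on the first call.
compiled_gelu = torch.compile(gelu_tanh)


def main() -> None:
    # ROCm builds of PyTorch report AMD GPUs under the "cuda" device name.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    x = torch.randn(1024, 1024, device=device)

    eager_out = gelu_tanh(x)
    compiled_out = compiled_gelu(x)  # first call triggers compilation

    # The two paths should agree up to floating-point tolerance.
    print(torch.allclose(eager_out, compiled_out, atol=1e-5))


if __name__ == "__main__":
    main()
```

The same wrapper can be applied to an `nn.Module` instance instead of a function; whether it pays off depends on the model and hardware, so timing both paths is the safest check.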

Read more ...


Accelerating models on ROCm using PyTorch TunableOp

In this blog, we show how to use PyTorch TunableOp to accelerate models using ROCm on AMD GPUs. We discuss the basics of General Matrix Multiplications (GEMMs), show an example of tuning a single GEMM, and finally demonstrate real-world performance gains on an LLM (Gemma) using TunableOp.
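For readers unfamiliar with the term, a GEMM in the standard BLAS sense computes C = alpha * A * B + beta * C. The pure-Python sketch below only illustrates that definition; the function name `gemm` and the list-of-lists layout are assumptions for this example, and it does not represent how TunableOp or any GPU library implements the operation.

```python
def gemm(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for matrices stored as lists of rows.

    a is m x k, b is k x n, c is m x n. This is a reference sketch of the
    GEMM definition, not an optimized kernel.
    """
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("c must have shape m x n")

    return [
        [
            # Dot product of row i of a with column j of b, then scale and add.
            alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
            for j in range(n)
        ]
        for i in range(m)
    ]


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    c = [[1, 1], [1, 1]]
    print(gemm(2, a, b, 1, c))
```

Tuning a GEMM means choosing among several kernels that all compute this same result, so a simple reference like this is mainly useful for checking correctness.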

Read more ...


A Guide to Implementing and Training Generative Pre-trained Transformers (GPT) in JAX on AMD GPUs

2 July 2024 by Douglas Jia.

Read more ...


Mamba on AMD GPUs with ROCm

28 Jun 2024 by Sean Song, Jassani Adeem, Moskvichev Arseny.

Read more ...


Deep Learning Recommendation Models on AMD GPUs

28 June 2024 by Phillip Dang.

Read more ...


Unveiling performance insights with PyTorch Profiler on an AMD GPU

29 May 2024 by Phillip Dang.

Read more ...


Panoptic segmentation and instance segmentation with Detectron2 on AMD GPUs

23 May 2024 by Vara Lakshmi Bayanagari.

Read more ...


Accelerating Large Language Models with Flash Attention on AMD GPUs

15 May 2024 by Clint Greene.

Read more ...


Table Question-Answering with TaPas

26 Apr 2024 by Phillip Dang.

Read more ...


Multimodal (Visual and Language) understanding with LLaVA-NeXT

26 Apr 2024 by Phillip Dang.

Read more ...


Unlocking Vision-Text Dual-Encoding: Multi-GPU Training of a CLIP-Like Model

24 Apr 2024 by Sean Song.

Read more ...


Transforming Words into Motion: A Guide to Video Generation with AMD GPU

24 Apr 2024 by Douglas Jia.

Read more ...


Inferencing with AI2’s OLMo model on AMD GPU

17 Apr 2024 by Douglas Jia.

Read more ...


Text Summarization with FLAN-T5

16 Apr 2024 by Phillip Dang.

Read more ...


PyTorch C++ Extension on AMD GPU

16 Apr 2024 by Vara Lakshmi Bayanagari.

Read more ...


Program Synthesis with CodeGen

16 Apr 2024 by Phillip Dang.

Read more ...


Instruction fine-tuning of StarCoder with PEFT on multiple AMD GPUs

16 Apr 2024 by Douglas Jia.

Read more ...


GPU Unleashed: Training Reinforcement Learning Agents with Stable Baselines3 on an AMD GPU in Gymnasium Environment

11 Apr 2024 by Douglas Jia.

Read more ...


ResNet for image classification using AMD GPUs

9 Apr 2024 by Logan Grado.

Read more ...


Small language models with Phi-2

8 Apr 2024 by Phillip Dang.

Read more ...


Using the ChatGLM-6B bilingual language model with AMD GPUs

4 Apr 2024 by Phillip Dang.

Read more ...


Total body segmentation using MONAI Deploy on an AMD GPU

4 Apr 2024 by Vara Lakshmi Bayanagari.

Read more ...


Automatic mixed precision in PyTorch using AMD GPUs

29 March 2024 by Logan Grado.

Read more ...


Building a decoder transformer model on AMD GPU(s)

12 Mar 2024 by Phillip Dang.

Read more ...


Question-answering Chatbot with LangChain on an AMD GPU

11 Mar 2024 by Phillip Dang.

Read more ...


Music Generation With MusicGen on an AMD GPU

8 Mar 2024 by Phillip Dang.

Read more ...


Efficient image generation with Stable Diffusion models and ONNX Runtime using AMD GPUs

23 Feb 2024 by Douglas Jia.

Read more ...


Simplifying deep learning: A guide to PyTorch Lightning

8 Feb 2024 by Phillip Dang.

Read more ...


Two-dimensional images to three-dimensional scene mapping using NeRF on an AMD GPU

7 Feb 2024 by Vara Lakshmi Bayanagari.

Read more ...


Using LoRA for efficient fine-tuning: Fundamental principles

5 Feb 2024 by Sean Song.

Read more ...


Pre-training BERT using Hugging Face & PyTorch on an AMD GPU

26 Jan 2024 by Vara Lakshmi Bayanagari.

Read more ...


Pre-training a large language model with Megatron-DeepSpeed on multiple AMD GPUs

24 Jan 2024 by Douglas Jia.

Read more ...


Creating a PyTorch/TensorFlow code environment on AMD GPUs

Note: This blog was previously part of the AMD lab notes blog series.

Read more ...