Posts in Applications & models
DBRX Instruct on AMD GPUs
- 11 July 2024
In this blog, we showcase DBRX Instruct, a mixture-of-experts large language model developed by Databricks, on a ROCm-capable system with AMD GPUs.
Accelerate PyTorch Models using torch.compile on AMD GPUs with ROCm
- 11 July 2024
PyTorch 2.0 introduces torch.compile(), a tool to vastly accelerate PyTorch code and models. By converting PyTorch code into highly optimized kernels, torch.compile delivers substantial performance improvements with minimal changes to the existing codebase. This feature allows for precise optimization of individual functions, entire modules, and complex training loops, providing a versatile and powerful tool for enhancing computational efficiency.
Accelerating models on ROCm using PyTorch TunableOp
- 03 July 2024
In this blog, we will show how to leverage PyTorch TunableOp to accelerate models using ROCm on AMD GPUs. We will discuss the basics of General Matrix Multiplications (GEMMs), show an example of tuning a single GEMM, and finally demonstrate real-world performance gains on an LLM (Gemma) using TunableOp.
A Guide to Implementing and Training Generative Pre-trained Transformers (GPT) in JAX on AMD GPUs
- 02 July 2024
2 July, 2024 by Douglas Jia.
Mamba on AMD GPUs with ROCm
- 28 June 2024
28 Jun, 2024 by Sean Song, Jassani Adeem, Moskvichev Arseny.
Unveiling performance insights with PyTorch Profiler on an AMD GPU
- 29 May 2024
29 May, 2024 by Phillip Dang.
Panoptic segmentation and instance segmentation with Detectron2 on AMD GPUs
- 23 May 2024
23 May, 2024 by Vara Lakshmi Bayanagari.
Accelerating Large Language Models with Flash Attention on AMD GPUs
- 15 May 2024
15 May, 2024 by Clint Greene.
Training a Neural Collaborative Filtering (NCF) Recommender on an AMD GPU
- 30 April 2024
30 Apr, 2024 by Vara Lakshmi Bayanagari.
Multimodal (Visual and Language) understanding with LLaVA-NeXT
- 26 April 2024
26 Apr, 2024 by Phillip Dang.
Unlocking Vision-Text Dual-Encoding: Multi-GPU Training of a CLIP-Like Model
- 24 April 2024
24 Apr, 2024 by Sean Song.
Transforming Words into Motion: A Guide to Video Generation with AMD GPU
- 24 April 2024
24 Apr, 2024 by Douglas Jia.
Interacting with Contrastive Language-Image Pre-Training (CLIP) model on AMD GPU
- 16 April 2024
16 Apr, 2024 by Sean Song.
Instruction fine-tuning of StarCoder with PEFT on multiple AMD GPUs
- 16 April 2024
16 Apr, 2024 by Douglas Jia.
Enhancing LLM Accessibility: A Deep Dive into QLoRA Through Fine-tuning Llama 2 on a single AMD GPU
- 15 April 2024
15 Apr, 2024 by Sean Song.
GPU Unleashed: Training Reinforcement Learning Agents with Stable Baselines3 on an AMD GPU in Gymnasium Environment
- 11 April 2024
11 Apr, 2024 by Douglas Jia.
Using the ChatGLM-6B bilingual language model with AMD GPUs
- 04 April 2024
4 Apr, 2024 by Phillip Dang.
Total body segmentation using MONAI Deploy on an AMD GPU
- 04 April 2024
4 Apr, 2024 by Vara Lakshmi Bayanagari.
Building semantic search with SentenceTransformers on AMD
- 04 April 2024
4 Apr, 2024 by Fabricio Flores.
Scale AI applications with Ray
- 01 April 2024
1 Apr, 2024 by Vicky Tsang, Logan Grado, Eliot Li.
Large language model inference optimizations on AMD GPUs
- 15 March 2024
15 Mar, 2024 by Seungrok Jung.
Efficient image generation with Stable Diffusion models and ONNX Runtime using AMD GPUs
- 23 February 2024
23 Feb, 2024 by Douglas Jia.
Simplifying deep learning: A guide to PyTorch Lightning
- 08 February 2024
8 Feb, 2024 by Phillip Dang.
Two-dimensional images to three-dimensional scene mapping using NeRF on an AMD GPU
- 07 February 2024
7 Feb, 2024 by Vara Lakshmi Bayanagari.
Using LoRA for efficient fine-tuning: Fundamental principles
- 05 February 2024
5 Feb, 2024 by Sean Song.
Fine-tune Llama 2 with LoRA: Customizing a large language model for question-answering
- 01 February 2024
1 Feb, 2024 by Sean Song.
Pre-training BERT using Hugging Face & TensorFlow on an AMD GPU
- 29 January 2024
29 Jan, 2024 by Vara Lakshmi Bayanagari.
Pre-training BERT using Hugging Face & PyTorch on an AMD GPU
- 26 January 2024
26 Jan, 2024 by Vara Lakshmi Bayanagari.
Accelerating XGBoost with Dask using multiple AMD GPUs
- 26 January 2024
26 Jan, 2024 by Clint Greene.
Pre-training a large language model with Megatron-DeepSpeed on multiple AMD GPUs
- 24 January 2024
24 Jan, 2024 by Douglas Jia.
Efficient image generation with Stable Diffusion models and AITemplate using AMD GPUs
- 24 January 2024
24 Jan, 2024 by Douglas Jia.
Efficient deployment of large language models with Text Generation Inference on AMD GPUs
- 24 January 2024
24 Jan, 2024 by Douglas Jia.
Jacobi Solver with HIP and OpenMP offloading
- 15 September 2023
15 Sept, 2023 by Asitav Mishra, Rajat Arora, Justin Chang.
Finite difference method - Laplacian part 4
- 18 July 2023
18 Jul, 2023 by Justin Chang, Thomas Gibson, Sean Miller.
Finite difference method - Laplacian part 3
- 11 May 2023
11 May, 2023 by Justin Chang, Rajat Arora, Thomas Gibson, Sean Miller, Ossian O’Reilly.
Finite difference method - Laplacian part 2
- 04 January 2023
4 Jan, 2023 by Justin Chang, Rajat Arora, Thomas Gibson, Sean Miller, Ossian O’Reilly.
Finite difference method - Laplacian part 1
- 14 November 2022
14 Nov, 2022 by Justin Chang, Rajat Arora, Thomas Gibson, Sean Miller, Ossian O’Reilly.