Posts by Vara Lakshmi Bayanagari

Quantized 8-bit LLM training and inference using bitsandbytes on AMD GPUs

This blog post covers the bitsandbytes 8-bit representations, which significantly reduce the memory needed for fine-tuning and inference with LLMs. Many quantization techniques reduce model size, but bitsandbytes also quantizes optimizer states. The post explains the basic principles underlying the bitsandbytes 8-bit representations, describes the bitsandbytes 8-bit optimizer and LLM.int8 techniques, and shows how to implement them on AMD GPUs using ROCm.


Speed Up Text Generation with Speculative Sampling on AMD GPUs

As the size of transformer models grows, so does the cost of inference, which affects latency and throughput. Compression methods such as quantization and distillation, as well as hardware-aware optimizations such as Flash Attention and Triton, have been proposed to cut the computation cost at different levels. However, these methods either compromise on accuracy or require major changes to the model implementation.


Image Classification with BEiT, MobileNet, and EfficientNet using ROCm on AMD GPUs

Image classification is a key task in computer vision that aims at “understanding” an entire image. The output of an image classifier is a label or category for the image as a whole, unlike object recognition, where the task is to detect and classify multiple objects within an image.


Panoptic segmentation and instance segmentation with Detectron2 on AMD GPUs

23, May 2024 by Vara Lakshmi Bayanagari.


Training a Neural Collaborative Filtering (NCF) Recommender on an AMD GPU

30, Apr 2024 by Vara Lakshmi Bayanagari.


PyTorch C++ Extension on AMD GPU

16, Apr 2024 by Vara Lakshmi Bayanagari.


Total body segmentation using MONAI Deploy on an AMD GPU

4, Apr 2024 by Vara Lakshmi Bayanagari.


Two-dimensional images to three-dimensional scene mapping using NeRF on an AMD GPU

7, Feb 2024 by Vara Lakshmi Bayanagari.


Pre-training BERT using Hugging Face & TensorFlow on an AMD GPU

29, Jan 2024 by Vara Lakshmi Bayanagari.


Pre-training BERT using Hugging Face & PyTorch on an AMD GPU

26, Jan 2024 by Vara Lakshmi Bayanagari.
