Posts by Vara Lakshmi Bayanagari
Transformer based Encoder-Decoder models for image-captioning on AMD GPUs
- 03 December 2024
Image captioning, or the GenAI-based automatic generation of concise textual descriptions of images, has immensely important real-world applications. For example, image captioning can provide visually impaired users with textual descriptions of images for improved accessibility, image captioning can add textual descriptions to products in e-commerce applications and help children map images to their textual descriptions in early childhood educational apps. Image captioning can automatically describe objects and events in security camera footage in surveillance applications and can enable robots to auto-generate textual captions for objects and events they encountered in human-robot interaction (HRI) applications, and many more applications. Image captioning is a sequence-to-sequence (seq2seq) machine learning task: a model converting a sequence from one domain (in this case, the image), to another (its textual description). In image captioning the image is partitioned into a sequence of patches. This sequence of image patches is then converted by the model to a corresponding sequence of text tokens.
Quantized 8-bit LLM training and inference using bitsandbytes on AMD GPUs
- 13 November 2024
In this blog post we will cover the bitsandbytes 8-bit representations. As you will see, the bitsandbytes 8-bit representations significantly help reduce the memory needed for fine-tuning and inferencing LLMs. There are many quantization techniques used in the field to decrease a model size, but bitsandbytes offers quantization to decrease the size of optimizer states as well. This post will help you understand the basic principles underlying the bitsandbytes 8-bit representations, explain the bitsandbytes 8-bit optimizer and LLM.int8 techniques, and show you how to implement these on AMD GPUs using ROCm.
Speed Up Text Generation with Speculative Sampling on AMD GPUs
- 15 October 2024
As the size of transformer models grow, so does the cost of conducting inference, impacting latency and throughput. Compression methods such as quantization and distillation, as well as hardware-aware optimizations such as Flash Attention and Triton, have been proposed to cut down the computation cost at different levels. However, these models either compromise on accuracy or require major changes to the model implementation.
Image Classification with BEiT, MobileNet, and EfficientNet using ROCm on AMD GPUs
- 03 September 2024
Image classification is a key task in computer vision aiming at “understanding” an entire image. The outcome of an image classifier is a label or a category for the image as a whole, unlike object recognition where the task is to detect and classify multiple objects within an image.
Panoptic segmentation and instance segmentation with Detectron2 on AMD GPUs
- 23 May 2024
23, May 2024 by Vara Lakshmi Bayanagari.
Training a Neural Collaborative Filtering (NCF) Recommender on an AMD GPU
- 30 April 2024
30, Apr 2024 by Vara Lakshmi Bayanagari.
Total body segmentation using MONAI Deploy on an AMD GPU
- 04 April 2024
4, Apr 2024 by Vara Lakshmi Bayanagari.
Two-dimensional images to three-dimensional scene mapping using NeRF on an AMD GPU
- 07 February 2024
7, Feb 2024 by Vara Lakshmi Bayanagari.
Pre-training BERT using Hugging Face & TensorFlow on an AMD GPU
- 29 January 2024
29, Jan 2024 by Vara Lakshmi Bayanagari.
Pre-training BERT using Hugging Face & PyTorch on an AMD GPU
- 26 January 2024
26, Jan 2024 by Vara Lakshmi Bayanagari.