Posts by Eliot Li

Image classification using Vision Transformer with AMD GPUs

The Vision Transformer (ViT) model was first proposed in "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale". ViT is an attractive alternative to conventional convolutional neural network (CNN) models because it scales and adapts well across computer vision tasks. However, ViT can be more expensive than a CNN for large input images, because the cost of self-attention grows quadratically with the number of image patches.
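To make the quadratic cost concrete, here is a minimal sketch that counts patch tokens and pairwise attention scores for a square image. It assumes the 16x16 patch size from the paper's title and ignores the extra class token and the number of heads and layers; `attention_cost` is a hypothetical helper written for this illustration, not part of any library.

```python
def attention_cost(image_size, patch_size=16):
    """Return (patch tokens, pairwise attention scores) for a square image.

    Assumption: non-overlapping square patches, no class token, one head.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    # Each side is cut into image_size / patch_size patches.
    tokens = (image_size // patch_size) ** 2
    # Self-attention scores every token against every other token.
    return tokens, tokens * tokens


if __name__ == "__main__":
    for side in (224, 448, 896):
        tokens, scores = attention_cost(side)
        print(f"{side}x{side} image: {tokens} tokens, {scores} attention scores")
```

Under these assumptions, doubling the image side from 224 to 448 pixels quadruples the token count (196 to 784) and multiplies the number of attention scores by 16.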

Read more ...

Scale AI applications with Ray

Most machine-learning (ML) workloads today require multiple GPUs or nodes to achieve the performance or scale that applications demand. However, scaling a workload beyond a single node or a single GPU is difficult and requires some expertise in distributed processing.

Read more ...