AI Blogs - Page 10
Navigating vLLM Inference with ROCm and Kubernetes
A quick introduction to Kubernetes (K8s) and a step-by-step guide to deploying vLLM on K8s with ROCm.
PyTorch Fully Sharded Data Parallel (FSDP) on AMD GPUs with ROCm
This blog guides you through the process of using PyTorch FSDP to fine-tune LLMs efficiently on AMD GPUs.
AI Inference Orchestration with Kubernetes on Instinct MI300X, Part 1
This blog is part 1 of a series that provides a comprehensive, step-by-step guide to deploying and scaling AI inference workloads with Kubernetes and the AMD GPU Operator on the AMD Instinct platform.
GEMM Kernel Optimization For AMD GPUs
A guide to tuning GEMMs for optimal performance of AI models on AMD GPUs.
Enhancing AI Training with AMD ROCm Software
AMD's GPU training optimizations deliver peak performance for advanced AI models through the ROCm software stack.
Best practices for competitive inference optimization on AMD Instinct™ MI300X GPUs
Learn how to optimize large language model inference using vLLM on AMD's MI300X GPUs for enhanced performance and efficiency.
Distributed fine-tuning of MPT-30B using Composer on AMD GPUs
This blog uses Composer, a distributed framework, on AMD GPUs to fine-tune MPT-30B in both single-node and multi-node settings.
Vision Mamba on AMD GPU with ROCm
This blog explores Vision Mamba (Vim), an innovative and efficient backbone for vision tasks, and evaluates its performance on AMD GPUs with ROCm.
Getting started with AMD ROCm containers: from base images to custom solutions
This post, the second in a series, provides a walkthrough for building a vLLM container that can be used for both inference and benchmarking.
Triton Inference Server with vLLM on AMD GPUs
This blog provides a how-to guide on setting up a Triton Inference Server with a vLLM backend powered by AMD GPUs, showcasing robust performance with several LLMs.
Training Transformers and Hybrid models on AMD Instinct MI300X Accelerators
This blog presents Zyphra's new training kernels for transformers and hybrid models on AMD Instinct MI300X accelerators, which surpass the H100's performance.
Transformer based Encoder-Decoder models for image-captioning on AMD GPUs
This blog introduces image captioning and provides hands-on tutorials on three Transformer-based encoder-decoder image captioning models (ViT-GPT2, BLIP, and Alpha-CLIP) deployed on AMD GPUs using ROCm.