Posts by Matt Elliott
How to Build a vLLM Container for Inference and Benchmarking
- 21 February 2025
Welcome back! If you’ve been following along with this series, you’ve already learned about the basics of ROCm containers. Today, we’ll build on that foundation by creating a container for large language model inference with vLLM.
Announcing the AMD GPU Operator and Metrics Exporter
- 29 January 2025
As AI workloads continue to grow in complexity and scale, we’ve consistently heard one thing from our customers: “Managing GPU infrastructure shouldn’t be the hard part.” For many, this is where Kubernetes comes into play. Kubernetes lets customers deploy and manage their AI workloads at scale by providing a robust platform for automating deployment, scaling, and operation of application containers across clusters of hosts. It ensures that your applications run consistently and reliably, regardless of the underlying infrastructure. A pod is the smallest and simplest Kubernetes object. It represents a single instance of a running process in your cluster and can contain one or more containers. Pods host your application workloads and are managed by Kubernetes to ensure they run as expected. Enabling pods to use the GPUs in your cluster, however, is not trivial.
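To make the pod concept concrete, here is a minimal sketch of a pod spec that requests GPU access. It assumes the AMD device plugin for Kubernetes is installed and advertises GPUs under the `amd.com/gpu` resource name; the pod and container names are placeholders for illustration.

```yaml
# Hypothetical pod spec: assumes the AMD GPU device plugin is running
# on the cluster and exposes GPUs as the amd.com/gpu extended resource.
apiVersion: v1
kind: Pod
metadata:
  name: rocm-test          # illustrative name
spec:
  containers:
    - name: rocm-base
      image: rocm/dev-ubuntu-22.04
      resources:
        limits:
          amd.com/gpu: 1   # schedule this container onto a node with one free AMD GPU
```

Without a device plugin advertising that resource, the scheduler has no way to know which nodes have GPUs or how many are free, which is exactly the gap the GPU Operator is designed to close.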
Getting started with AMD ROCm containers: from base images to custom solutions
- 16 January 2025
Having worked in technology for over two decades, I’ve witnessed firsthand how containerization has transformed the way we develop and deploy applications. Containers package applications with their dependencies into standardized units, making software portable and consistent across different environments. When we combine this containerization power with AMD Instinct™ Accelerators, we get a powerful solution for quickly deploying AI and machine learning workloads. In this blog, the first in a series exploring ROCm containerization, I want to share my insights about AMD ROCm™ containers and show you how to build and customize your own GPU-accelerated workloads. You’ll learn how to select appropriate base images, modify containers for your specific needs, and implement best practices for GPU-enabled containerization - all with hands-on examples you can try yourself.
Getting to Know Your GPU: A Deep Dive into AMD SMI
- 17 September 2024
For system administrators and power users working with AMD hardware, performance optimization and efficient monitoring of resources are paramount. The AMD System Management Interface command-line tool, amd-smi, addresses these needs.
Introducing the AMD ROCm™ Offline Installer Creator: Simplifying Deployment for AI and HPC
- 10 September 2024