Posts by Matt Elliott
How to Build a vLLM Container for Inference and Benchmarking
- 21 February 2025
Welcome back! If you’ve been following along with this series, you’ve already learned about the basics of ROCm containers. Today, we’ll build on that foundation by creating a container for large language model inference with vLLM.
Announcing the AMD GPU Operator and Metrics Exporter
- 29 January 2025
As AI workloads continue to grow in complexity and scale, we’ve consistently heard one thing from our customers: “Managing GPU infrastructure shouldn’t be the hard part.” For many, this is where Kubernetes comes into play. Kubernetes lets customers deploy and manage their AI workloads at scale by providing a robust platform for automating deployment, scaling, and operation of application containers across clusters of hosts. It ensures that your applications run consistently and reliably, regardless of the underlying infrastructure. A pod is the smallest and simplest Kubernetes object. It represents a single instance of a running process in your cluster and can contain one or more containers. Pods host your application workloads and are managed by Kubernetes to ensure they run as expected. Enabling pods to use the GPUs in your cluster, however, is not trivial.
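To make the pod concept concrete, here is a minimal sketch of a pod spec that requests GPU access. It assumes the AMD device plugin for Kubernetes is installed and advertises GPUs under the `amd.com/gpu` resource name; the pod and container names are placeholders for illustration.

```yaml
# Hypothetical pod spec: assumes the AMD GPU device plugin is running
# on the cluster and exposes GPUs as the amd.com/gpu extended resource.
apiVersion: v1
kind: Pod
metadata:
  name: rocm-test          # illustrative name
spec:
  containers:
    - name: rocm-base
      image: rocm/dev-ubuntu-22.04
      resources:
        limits:
          amd.com/gpu: 1   # schedule this container onto a node with one free AMD GPU
```

Without a device plugin advertising that resource, the scheduler has no way to know which nodes have GPUs or how many are free, which is exactly the gap the GPU Operator is designed to close.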
Getting started with AMD ROCm containers: from base images to custom solutions
- 16 January 2025
Having worked in technology for over two decades, I’ve witnessed firsthand how containerization has transformed the way we develop and deploy applications. Containers package applications with their dependencies into standardized units, making software portable and consistent across different environments. When we combine this containerization power with AMD Instinct™ Accelerators, we get a powerful solution for quickly deploying AI and machine learning workloads. In this blog, the first in a series exploring ROCm containerization, I want to share my insights about AMD ROCm™ containers and show you how to build and customize your own GPU-accelerated workloads. You’ll learn how to select appropriate base images, modify containers for your specific needs, and implement best practices for GPU-enabled containerization - all with hands-on examples you can try yourself.
Getting to Know Your GPU: A Deep Dive into AMD SMI
- 17 September 2024
For system administrators and power users working with AMD hardware, performance optimization and efficient monitoring of resources are paramount. The AMD System Management Interface command-line tool, amd-smi, addresses these needs.
Introducing the AMD ROCm™ Offline Installer Creator: Simplifying Deployment for AI and HPC
- 10 September 2024