Posts by Victor Robles
AI Inference Orchestration with Kubernetes on Instinct MI300X, Part 3
- 13 March 2025
Welcome back to the final part of our series! In Part 1, we set up a Kubernetes cluster and installed the AMD GPU Operator to seamlessly integrate AMD hardware with Kubernetes. In Part 2, we deployed vLLM on AMD Instinct MI300X GPUs, exposed it using MetalLB, and scaled it efficiently.
AI Inference Orchestration with Kubernetes on Instinct MI300X, Part 2
- 14 February 2025
Welcome to Part 2 of our series on utilizing Kubernetes with the AMD Instinct platform! If you’re just joining us, we recommend checking out Part 1, where we covered setting up your Kubernetes cluster and enabling AMD GPU support.
AI Inference Orchestration with Kubernetes on Instinct MI300X, Part 1
- 07 February 2025
As organizations scale their AI inference workloads, they face the challenge of efficiently deploying and managing large language models across GPU infrastructure. This three-part blog series provides a production-ready foundation for orchestrating AI inference workloads on the AMD Instinct platform with Kubernetes.