Posts by Yu Wang
AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs
- 17 November 2025
As generative AI models continue to expand in scale, context length, and operational complexity, enterprises face an increasingly difficult challenge: how to deploy and operate inference reliably, efficiently, and at production scale. Running LLMs or multimodal models on real workloads requires more than high-performance GPUs. It requires reproducible deployments, predictable performance, seamless orchestration, and an operational framework that teams can trust.
AMD Enterprise AI Suite: Open Infrastructure for Production AI
- 17 November 2025
In this blog, you’ll learn how to operationalize enterprise AI on AMD Instinct™ GPUs using an open, Kubernetes-native software stack. AMD Enterprise AI Suite provides a unified platform that integrates GPU infrastructure, workload orchestration, model inference, and lifecycle governance without dependence on proprietary systems. We begin by outlining the end-to-end architecture and then walk through how each component fits into production workflows: AMD Inference Microservice (AIM) for optimized and scalable model serving, AMD Solution Blueprints for assembling these capabilities into validated, end-to-end AI workflows, AMD Resource Manager for infrastructure administration and multi-team governance, and AMD AI Workbench for reproducible development and fine-tuning environments. Together, these components show how to build, scale, and manage AI workloads across the enterprise using an open, modular, production-ready stack.
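For a concrete flavor of what serving looks like from the application side, here is a minimal Python sketch that queries a model behind a Kubernetes Service using an OpenAI-compatible chat completions API, the interface vLLM-based servers typically expose. The service URL, namespace, and model name are placeholders for illustration, not actual AIM defaults; consult the AIM documentation for the endpoints your deployment exposes.

```python
# Minimal sketch: calling a model served inside the cluster through an
# OpenAI-compatible /v1/chat/completions route (typical for vLLM-based
# servers). The Service URL and model name below are hypothetical.
import requests

AIM_ENDPOINT = "http://aim-service.ai-inference.svc.cluster.local:8000"  # placeholder

resp = requests.post(
    f"{AIM_ENDPOINT}/v1/chat/completions",
    json={
        "model": "meta-llama/Llama-3.1-8B-Instruct",  # whichever model the service hosts
        "messages": [{"role": "user", "content": "Summarize this quarter's GPU usage."}],
        "max_tokens": 128,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

Because the interface is OpenAI-compatible, existing client code and SDKs can be pointed at the in-cluster endpoint without application changes, which is what makes this pattern attractive for multi-team platforms.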
AMD Advances Enterprise AI Through OPEA Integration
- 12 March 2025
AMD is excited to support the Open Platform for Enterprise AI (OPEA) to simplify and accelerate enterprise AI adoption. With the OPEA GenAI framework enabled on the AMD ROCm™ software stack, businesses and developers can now create scalable, efficient GenAI applications on AMD data center GPUs. Enterprises today face significant challenges when deploying AI at scale, including the complexity of integrating GenAI models, managing GPU resources, ensuring security, and maintaining workflow flexibility. AMD and OPEA aim to address these challenges and streamline AI adoption. This blog explores the significance of this collaboration and AMD’s contributions to the OPEA project, and demonstrates how to deploy a code translation OPEA GenAI use case on the AMD Instinct™ MI300X GPU.
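As a preview of that use case, the sketch below sends a translation request to a deployed CodeTrans megaservice from Python. The port, route, and payload fields follow the conventions of the upstream OPEA GenAIExamples project and should be treated as assumptions here; verify them against the documentation for the version you deploy.

```python
# Minimal sketch: submitting a code translation request to an OPEA
# CodeTrans megaservice running on an MI300X host. The port (7777),
# route (/v1/codetrans), and field names are assumed from the upstream
# OPEA examples; check your deployment's docs before relying on them.
import requests

resp = requests.post(
    "http://localhost:7777/v1/codetrans",  # host running the OPEA stack
    json={
        "language_from": "Golang",
        "language_to": "Python",
        "source_code": 'package main\nimport "fmt"\nfunc main() { fmt.Println("hi") }',
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.text)  # translated code returned by the service
```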