Posts by Yu Wang
AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs
- 17 November 2025
As generative AI models continue to expand in scale, context length, and operational complexity, enterprises face an increasingly difficult challenge: how to deploy and operate inference reliably, efficiently, and at production scale. Running LLMs or multimodal models on real workloads requires more than high-performance GPUs. It requires reproducible deployments, predictable performance, seamless orchestration, and an operational framework that teams can trust.
AMD Enterprise AI Suite: Open Infrastructure for Production AI
- 17 November 2025
In this blog, you’ll learn how to operationalize enterprise AI on AMD Instinct™ GPUs using an open, Kubernetes-native software stack. AMD Enterprise AI Suite provides a unified platform that integrates GPU infrastructure, workload orchestration, model inference, and lifecycle governance without dependence on proprietary systems. We begin by outlining the end-to-end architecture and then walk through how each component fits into production workflows: AMD Inference Microservice (AIM) for optimized and scalable model serving, AMD Solution Blueprints for assembling these capabilities into validated, end-to-end AI workflows, AMD Resource Manager for infrastructure administration and multi-team governance, and AMD AI Workbench for reproducible development and fine-tuning environments. Together, these components show how to build, scale, and manage AI workloads across the enterprise using an open, modular, production-ready stack.
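For a concrete flavor of what serving looks like from the application side, here is a minimal Python sketch that queries a model behind a Kubernetes Service using an OpenAI-compatible chat completions API, the interface vLLM-based servers typically expose. The service URL, namespace, and model name are placeholders for illustration, not actual AIM defaults; consult the AIM documentation for the endpoints your deployment exposes.

```python
# Minimal sketch: calling a model served inside the cluster through an
# OpenAI-compatible /v1/chat/completions route (typical for vLLM-based
# servers). The Service URL and model name below are hypothetical.
import requests

AIM_ENDPOINT = "http://aim-service.ai-inference.svc.cluster.local:8000"  # placeholder

resp = requests.post(
    f"{AIM_ENDPOINT}/v1/chat/completions",
    json={
        "model": "meta-llama/Llama-3.1-8B-Instruct",  # whichever model the service hosts
        "messages": [{"role": "user", "content": "Summarize this quarter's GPU usage."}],
        "max_tokens": 128,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

Because the interface is OpenAI-compatible, existing client code and SDKs can be pointed at the in-cluster endpoint without application changes, which is what makes this pattern attractive for multi-team platforms.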
AMD Advances Enterprise AI Through OPEA Integration
- 12 March 2025
AMD is excited to support the Open Platform for Enterprise AI (OPEA) to simplify and accelerate enterprise AI adoption. With the OPEA GenAI framework enabled on the AMD ROCm™ software stack, businesses and developers can now create scalable, efficient GenAI applications on AMD data center GPUs. Enterprises today face significant challenges when deploying AI at scale, including the complexity of integrating GenAI models, managing GPU resources, ensuring security, and maintaining workflow flexibility. AMD and OPEA aim to address these challenges and streamline AI adoption. This blog explores the significance of this collaboration and AMD’s contributions to the OPEA project, and demonstrates how to deploy a code translation OPEA GenAI use case on the AMD Instinct™ MI300X GPU.
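As a preview of that use case, the sketch below sends a translation request to a deployed CodeTrans megaservice from Python. The port, route, and payload fields follow the conventions of the upstream OPEA GenAIExamples project and should be treated as assumptions here; verify them against the documentation for the version you deploy.

```python
# Minimal sketch: submitting a code translation request to an OPEA
# CodeTrans megaservice running on an MI300X host. The port (7777),
# route (/v1/codetrans), and field names are assumed from the upstream
# OPEA examples; check your deployment's docs before relying on them.
import requests

resp = requests.post(
    "http://localhost:7777/v1/codetrans",  # host running the OPEA stack
    json={
        "language_from": "Golang",
        "language_to": "Python",
        "source_code": 'package main\nimport "fmt"\nfunc main() { fmt.Println("hi") }',
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.text)  # translated code returned by the service
```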