Posts by Alexander Aurell

Solution Blueprints: Accelerating AI Deployment with AMD Enterprise AI

AMD Enterprise AI Suite standardizes the inference layer with AMD Inference Microservices (AIMs), a set of containers for optimized model serving on AMD Instinct™ GPUs with validated profiles and OpenAI-compatible APIs. However, production-grade agentic and generative AI applications need more than inference endpoints. You need document loaders, embedding pipelines, vector databases, RAG logic, agent orchestration, and user interfaces, all wired together with proper Kubernetes resource definitions, GPU allocation, service discovery, and configuration management. This blog post walks through the technical implementation of Solution Blueprints: how they’re structured, how they use Helm application charts for code reuse, and the patterns they demonstrate for multi-container orchestration. While the Enterprise AI Suite Overview covers the platform and the AIMs blog covers inference, this post focuses on application architecture and deployment patterns.
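Because AIMs expose an OpenAI-compatible API, application components can target an AIM endpoint with any OpenAI-style client simply by pointing at the service's base URL. A minimal sketch of building such a request, where the service URL and model name are illustrative assumptions, not values from the blueprints:

```python
import json

# Hypothetical in-cluster AIM service URL and model name -- substitute
# the values from your own deployment.
AIM_BASE_URL = "http://aim-llm.ai-inference.svc.cluster.local:8000/v1"
MODEL = "example-org/example-chat-model"


def build_chat_request(prompt: str, max_tokens: int = 128) -> dict:
    """Build an OpenAI-style /chat/completions request body."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


# The body below would be POSTed to f"{AIM_BASE_URL}/chat/completions".
body = build_chat_request("Summarize RAG in one sentence.")
print(json.dumps(body, indent=2))
```

In practice the same request shape works with the official OpenAI Python SDK by setting its `base_url` to the AIM service, which is what makes existing RAG and agent frameworks reusable on top of AIM endpoints.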

Read more ...


AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs

As generative AI models continue to expand in scale, context length, and operational complexity, enterprises face a growing challenge: how to deploy and operate inference reliably, efficiently, and at production scale. Running LLMs or multimodal models on real workloads requires more than high-performance GPUs. It requires reproducible deployments, predictable performance, seamless orchestration, and an operational framework that teams can trust.

Read more ...