Posts by Saroosh Shabbir

Solution Blueprints: Accelerating AI Deployment with AMD Enterprise AI

AMD Enterprise AI Suite standardizes the inference layer with AMD Inference Microservices (AIMs), a set of containers for optimized model serving on AMD Instinct™ GPUs with validated profiles and OpenAI-compatible APIs. However, production-grade agentic and generative AI applications need more than inference endpoints. They also need document loaders, embedding pipelines, vector databases, RAG logic, agent orchestration, and user interfaces, and these components must be wired together with proper Kubernetes resource definitions, GPU allocation, service discovery, and configuration management. This blog walks through the technical implementation of Solution Blueprints: how they’re structured, how they use Helm application charts for code reuse, and the patterns they demonstrate for multi-container orchestration. While the Enterprise AI Suite Overview covers the platform and the AIMs blog covers inference, this post focuses on application architecture and deployment patterns.

Read more ...


Retrieval Augmented Generation (RAG) with vLLM, LangChain and Chroma

In this blog from the AMD Silo AI Programs, we build a simple Retrieval‑Augmented Generation (RAG) pipeline. While pretrained models are powerful, they lack access to proprietary or enterprise-specific knowledge. RAG closes that gap by retrieving relevant enterprise knowledge and injecting it into the prompt so the model can produce context‑aware answers. For enterprises, RAG systems offer an efficient way to query their knowledge bases and deliver relevant information to their users.
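The retrieve-then-inject idea can be illustrated with a toy sketch in plain Python. This is not the pipeline from the post: a bag-of-words cosine similarity stands in for the embedding model and vector store, and the names `retrieve`, `build_prompt`, and `DOCS` are hypothetical, chosen only for illustration. In a real setup the resulting prompt would be sent to the model server.

```python
import math
import re
from collections import Counter

# Hypothetical in-memory "knowledge base" standing in for a vector database.
DOCS = [
    "Expense reports must be submitted within 30 days of travel.",
    "The cafeteria is open from 8 am to 3 pm on weekdays.",
    "VPN access requires a hardware security key.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Inject the retrieved context into the prompt ahead of the question."""
    context = "\n".join(retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    print(build_prompt("When is the cafeteria open?", DOCS))
```

Swapping the word-count similarity for learned embeddings changes the quality of retrieval, not the shape of the flow: the model still only sees whatever context the retriever places in the prompt.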

Read more ...