Posts by Pratik Mishra
Elevate Your LLM Inference: Autoscaling with Ray, ROCm 7.0.0, and SkyPilot
- 13 February 2026
This blog explores autoscaling of large language model (LLM) inference workloads in Ray Serve with a vLLM backend on AMD Instinct™ GPUs. You will also learn how to scale beyond a single cluster using SkyPilot, which enables multicloud scaling for Ray Serve. Combined with the AMD ROCm™ software platform, this creates a unified, cloud-agnostic stack that scales distributed LLM inference from single-GPU to multi-cluster deployments.
Democratizing AI Compute with AMD Using SkyPilot
- 13 November 2025
Democratizing AI compute means making advanced infrastructure and tools accessible to everyone, so that startups, researchers, and developers can train, deploy, and scale AI models without being constrained by proprietary systems or vendor lock-in. The AMD open AI ecosystem, built on AMD ROCm™ Software, pre-built optimized Docker images, and AMD Developer Cloud, provides the foundation for this vision.