Posts by Mukhil Azhagan Mallaiyan Sathiaseelan
Enabling FlashInfer on ROCm for Accelerated LLM Serving
- 01 October 2025
FlashInfer is a framework designed to accelerate inference for large language models (LLMs). Given the explosive growth and adoption of models like DeepSeek R1, Llama 3, and Qwen 3, efficient inference is critical to meeting the demands of real-world deployment. However, GPU memory bottlenecks, throughput limitations, and latency remain significant hurdles to deploying these models at scale.
DGL in the Real World: Running GNNs on Real Use Cases
- 20 August 2025
In our previous blog post, we introduced the Deep Graph Library (DGL) and highlighted how its support on the AMD ROCm platform unlocks scalable, performant graph neural networks (GNNs) on AMD GPUs. That post focused on the why: the growing relevance of graph workloads and what it means to bring that capability to AMD's accelerated computing ecosystem.
Graph Neural Networks at Scale: DGL with ROCm on AMD Hardware
- 31 July 2025
This blog introduces the Deep Graph Library (DGL) and explores its significance on AMD hardware for enabling scalable, performant graph neural networks.