Posts by Mukhil Azhagan Mallaiyan Sathiaseelan
DGL in Depth: SE(3)-Transformer on ROCm 7
- 05 December 2025
In this post, we demonstrate how to run the SE(3)-Transformer efficiently with the Deep Graph Library (DGL) on AMD ROCm, enabling high-performance 3D graph learning for complex geometric models. This builds on our previous blog, which highlighted DGL's versatility across diverse graph neural network (GNN) workloads and validated its functionality, compatibility, and usability.
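As a taste of what the post covers, here is a minimal, hedged sketch (not the post's actual SE(3)-Transformer code) of the usual starting point for geometric GNNs in DGL: turning a set of 3D coordinates into a k-nearest-neighbor graph and moving it to an AMD GPU. The coordinates and the choice of k=4 are illustrative assumptions; ROCm builds of PyTorch expose the GPU through the familiar "cuda" device interface.

```python
import dgl
import torch

# Random 3D coordinates standing in for atoms or point-cloud nodes (illustrative only).
coords = torch.randn(32, 3)

# Build a k-nearest-neighbor graph from the coordinates; k=4 is an arbitrary choice here.
g = dgl.knn_graph(coords, 4)
g.ndata["pos"] = coords  # geometric models such as the SE(3)-Transformer consume node positions

# On ROCm builds of PyTorch, AMD GPUs are addressed via the standard "cuda" device string.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
g = g.to(device)

print(g)  # Graph(num_nodes=32, num_edges=128, ...)
```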
Enabling FlashInfer on ROCm for Accelerated LLM Serving
- 01 October 2025
FlashInfer is an innovative framework designed to accelerate inference for large language models (LLMs). Given the explosive growth and adoption of models like DeepSeek R1, Llama 3, and Qwen 3, efficient inference is critical to meeting the demands of real-world deployment. However, GPU memory bottlenecks, limited throughput, and latency remain significant hurdles to deploying these models at scale.
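For context, a minimal sketch of the kind of kernel FlashInfer provides is shown below, assuming the flashinfer Python package and its single_decode_with_kv_cache entry point; the head counts, head dimension, and KV length are arbitrary example values, and the tensor layout follows the library's [sequence, heads, dim] convention.

```python
import torch
import flashinfer

# Arbitrary example sizes; real models use their own head counts and KV-cache lengths.
num_qo_heads, num_kv_heads, head_dim, kv_len = 32, 8, 128, 4096

# Decode attention for a single request: one query token attends over the cached KV.
# q is [num_qo_heads, head_dim]; k and v are [kv_len, num_kv_heads, head_dim].
q = torch.randn(num_qo_heads, head_dim, dtype=torch.float16, device="cuda")
k = torch.randn(kv_len, num_kv_heads, head_dim, dtype=torch.float16, device="cuda")
v = torch.randn(kv_len, num_kv_heads, head_dim, dtype=torch.float16, device="cuda")

# Fused decode kernel; grouped-query attention is implied by num_qo_heads > num_kv_heads.
out = flashinfer.single_decode_with_kv_cache(q, k, v)
print(out.shape)  # torch.Size([32, 128])
```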
DGL in the Real World: Running GNNs on Real Use Cases
- 20 August 2025
In our previous blog post, we introduced the Deep Graph Library (DGL) and highlighted how its support on the AMD ROCm platform unlocks scalable, performant graph neural networks (GNNs) on AMD GPUs. That post focused on the why — the growing relevance of graph workloads and what it means to bring that capability to AMD’s accelerated computing ecosystem.
Graph Neural Networks at Scale: DGL with ROCm on AMD Hardware
- 31 July 2025
This blog introduces the Deep Graph Library (DGL) and explores how its support on AMD hardware enables scalable, performant graph neural networks (GNNs).
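To ground the idea, here is a short, hedged sketch of the basic GNN workload the post discusses: a toy graph passed through a single dgl.nn GraphConv layer on an AMD GPU via PyTorch's ROCm backend. The graph, feature sizes, and layer widths are hypothetical choices for illustration.

```python
import dgl
import torch
from dgl.nn import GraphConv

# A small toy graph with edges 0->1, 1->2, 2->0, plus self-loops so each node keeps its own features.
g = dgl.graph((torch.tensor([0, 1, 2]), torch.tensor([1, 2, 0])))
g = dgl.add_self_loop(g)
feat = torch.randn(g.num_nodes(), 16)  # 16-dimensional node features (arbitrary size)

# PyTorch's ROCm backend exposes AMD GPUs through the usual "cuda" device interface.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
g, feat = g.to(device), feat.to(device)

# One graph-convolution layer aggregating neighbor features into 8-dimensional outputs.
conv = GraphConv(16, 8).to(device)
out = conv(g, feat)
print(out.shape)  # torch.Size([3, 8])
```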