Posts by Mukhil Azhagan Mallaiyan Sathiaseelan

Enabling FlashInfer on ROCm for Accelerated LLM Serving

FlashInfer is a framework designed to accelerate inference for large language models (LLMs). With the rapid growth and adoption of models like DeepSeek R1, Llama 3, and Qwen 3, efficient inference is critical to meeting the demands of real-world deployment. However, GPU memory bottlenecks, throughput limitations, and latency remain significant hurdles to serving these models at scale.

Read more ...


DGL in the Real World: Running GNNs on Real Use Cases

In our previous blog post, we introduced the Deep Graph Library (DGL) and highlighted how its support on the AMD ROCm platform unlocks scalable, performant graph neural networks (GNNs) on AMD GPUs. That post focused on the why — the growing relevance of graph workloads and what it means to bring that capability to AMD’s accelerated computing ecosystem.

Read more ...


Graph Neural Networks at Scale: DGL with ROCm on AMD Hardware

This blog introduces the Deep Graph Library (DGL) and explores how its support on AMD hardware enables scalable, performant graph neural networks.

Read more ...