Posts by Adeem Jassani

TraceLens: Democratizing AI Performance Analysis

27 April 2026

Profiling modern AI workloads produces huge traces that are hard to interpret. Framework profilers record thousands of operations, kernels, and communication events, and engineers often end up staring at tools like Perfetto UI and doing calculations by hand. TraceLens speeds this up: it consumes existing framework traces and turns them into structured summaries and comparisons, so you can move on to the actual diagnosis and optimization.

Read more ...

A Step-by-Step Walkthrough of Decentralized LLM Training on AMD GPUs

18 December 2025

LLMs have shown a strong ability to generalize to new tasks and are now at the core of many AI applications. Their performance has scaled with model size, which has led to training ever larger models on massive datasets. The resulting growth in compute requirements means these LLMs are trained on very large numbers of GPUs, which poses significant engineering and infrastructure challenges for standard backpropagation training. Such training runs on a tightly interconnected network of devices, which limits it to a single cluster or data center.

Read more ...