Debugging NaN Results in CK Tile GEMM: A rocgdb Detective Story
30 January 2026
In high-performance GPU kernels, subtle bugs can cause catastrophic failures such as NaN (Not-a-Number) outputs. This post walks through how we debugged a tricky NaN issue in AMD’s Composable Kernel (CK) Tile GEMM implementation using rocgdb. What began as unexplained NaN outputs turned out to be a single-character typo that corrupted the data distribution.