Posts by Steve Reinhardt
Streamlining Recommendation Model Training on AMD Instinct™ GPUs
- 02 March 2026
Recommendation model training and inference workloads represent a significant portion of computational requirements across industries including e-commerce, social media and content streaming platforms. Unlike LLMs, recommendation models result in to complex and often imbalanced communication across GPUs, along with a higher load on the CPU-GPU interconnect. The ROCm training docker [1] now includes essential libraries for recommendation model training. This blog demonstrates the functionality and ease of training recommendation models using ROCm, along with suggestions for improved configuration of these workloads. We also highlight the inherent benefits of the large HBM size on AMD Instinct™ GPUs for recommendation workloads.
Breaking the Accuracy-Speed Barrier: How MXFP4/6 Quantization Revolutionizes Image and Video Generation
- 07 January 2026
This blog introduces MXFP4 and MXFP6, the newly supported data types on AMD Instinct™ MI350 Series GPUs, and demonstrates their remarkable quality in image and video generation tasks. By reading this blog, you will discover how these low-bit formats can break the accuracy-speed tradeoff, boosting both efficiency and performance in generative AI workflows.