Posts by Lihuan Zhang
Primus-Pipeline: A More Flexible and Scalable Pipeline Parallelism Implementation
- 23 February 2026
MoE Training Best Practices on AMD GPUs
- 16 December 2025
This blog covers best practices for training Mixture-of-Experts (MoE) models on AMD Instinct™ MI300/MI355-series GPUs with the ROCm ecosystem. Whether you're new to MoE distributed architectures or optimizing trillion-parameter models, this guide will help you identify bottlenecks and maximize efficiency on AMD hardware.