Posts by Yonatan Dukler
Accelerating Mixture-of-Experts Execution with FarSkip-Collective Models
- 05 May 2026
Whether you are running training or inference, the largest Mixture-of-Experts (MoE) LLMs cannot fit on a single GPU; instead, the model must be spread across multiple GPUs, which rely on collective-communication operations to combine their work on a single model.
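To make the idea of a collective concrete, here is a minimal, self-contained sketch (hypothetical, not code from the post) of the kind of collective-communication call that multi-GPU execution depends on. It uses PyTorch's `torch.distributed` with the CPU `gloo` backend so it runs anywhere; a real MoE deployment would use `nccl` across GPUs, and typically all-to-all or all-reduce collectives inside each layer.

```python
# Sketch only: illustrates a collective-communication operation, not the
# post's FarSkip-Collective method. WORLD_SIZE and the port are arbitrary.
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp

WORLD_SIZE = 2  # number of worker processes standing in for GPUs

def worker(rank: int) -> None:
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = "29500"
    dist.init_process_group("gloo", rank=rank, world_size=WORLD_SIZE)

    # Each rank holds a partial result (e.g. the output of its local experts);
    # an all-reduce sums the contributions so every rank sees the full result.
    partial = torch.full((4,), float(rank + 1))
    dist.all_reduce(partial, op=dist.ReduceOp.SUM)
    print(f"rank {rank}: {partial.tolist()}")

    dist.destroy_process_group()

if __name__ == "__main__":
    mp.spawn(worker, nprocs=WORLD_SIZE)
```

Every rank blocks on the collective until all peers arrive, which is exactly why such operations sit on the critical path of multi-GPU MoE execution.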