Posts by Andy Luo

Unlock DeepSeek-R1 Inference Performance on AMD Instinct™ MI300X GPU

Read more ...


Best practices for competitive inference optimization on AMD Instinct™ MI300X GPUs

Optimizing LLM performance on GPUs is challenging because of diverse model requirements, memory constraints, and the need to balance latency against throughput. This document examines how hardware utilization, memory and communication bandwidth, and scaling contribute to inference performance, and details optimal configurations for AMD Instinct™ MI300X GPUs.

Read more ...