Posts by Uma Kannikanti

Reproducing the AMD MLPerf Inference v6.0 Submission Result

MLPerf Inference v6.0 marked AMD’s fourth round of MLPerf Inference submissions. This blog provides a step-by-step guide to reproducing AMD’s results on systems from different vendors.



AMD Instinct™ GPUs MLPerf Inference v6.0 Submission

The results for the MLPerf Inference v6.0 benchmark were released on April 1, 2026. In this round, AMD showcased the performance of the MI355X system, as well as the capability and versatility of the ROCm software stack.



Technical Dive into AMD’s MLPerf Inference v5.1 Submission

In the rapidly evolving landscape of artificial intelligence, the demand for reliable and efficient model inference has never been greater. With advancements in large language models (LLMs) and a growing reliance on real-time applications, benchmarks are critical for evaluating how well AI systems perform under varying conditions. Enter MLPerf Inference: Datacenter v5.1, a significant update to the well-respected benchmarking suite that assesses inference performance across a wide array of models and use cases, with a particular focus on data centers.



Reproducing the AMD Instinct™ GPUs MLPerf Inference v5.1 Submission

MLPerf Inference v5.1 marks AMD’s third round of submissions and the most ambitious yet. This round features submissions on AMD Instinct MI325X and MI355X systems, including multi-node inference and models in the MXFP4 data type. Building on its success in MLPerf Inference v5.0, AMD has submitted improved results for Llama 2 70B and SDXL on the MI325X platform in this round using new optimization techniques. For a deeper look at these optimizations, see our Technical Dive into AMD’s MLPerf Inference v5.1 Submission. To see how we optimized Llama 3.1 405B through pruning and fine-tuning, read Slim Down Your Llama: Pruning & Fine-Tuning for Maximum Performance. In addition, AMD has made submissions for the following workloads:
