ROCm™ Blogs
Posts by Jiahui Cao

FP8 GEMM Optimization on AMD CDNA™4 Architecture

  • 10 March 2026
  • Jiahui Cao, Amanzhol Salykov, Andy Luo
  • English
  • Software tools & optimizations
  • AI/ML, C++, Linear Algebra, Performance, HPC, Optimization, Hardware

This blog post is a follow-up to our previous post, Matrix Core Programming on AMD CDNA™3 and CDNA™4 Architecture, which introduced Matrix Cores and demonstrated how to use them in HIP kernels.

© 2025 Advanced Micro Devices, Inc