AI Blogs
Plug-and-Play CuPy on ROCm: Data Analytics Acceleration Made Simple
Learn how to enhance your analytics project with the latest AMD CuPy release.
Democratizing AI Compute with AMD Using SkyPilot
Learn how SkyPilot integrates with the AMD open AI stack to enable seamless multi-cloud deployment and simplify NVIDIA-to-AMD GPU migration.
Reproducing AMD MLPerf Training v5.1 Submission Result
Learn how to reproduce AMD's MLPerf Training v5.1 submission result.
Technical Dive into AMD MLPerf Training v5.1 Submission
Learn about the technical details of how AMD achieved the results in the MLPerf Training v5.1 submission.
Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X
Learn how a small-radius expert parallel design with prefill–decode disaggregation enables scalable, fault-isolated LLM inference on AMD Instinct™ MI300X clusters.
Training AI Weather Forecasting Models on AMD Instinct
Learn how deterministic and generative AI models for synoptic-scale weather forecasting are trained efficiently on AMD Instinct MI300X GPUs using the ROCm and GeoArches tools.
Day 0 Developer Guide: hipBLASLt Offline GEMM Tuning Script
Learn how to improve model performance with hipBLASLt offline tuning, using our easy-to-use Day 0 tool for optimizing GEMM efficiency.
Continuing the Momentum: Refining ROCm For The Next Wave Of AI and HPC
ROCm 7.1 builds on 7.0’s AI and HPC advances with faster performance, stronger reliability, and streamlined tools for developers and system builders.
Stability at Scale: AMD’s Full‑Stack Platform for Large‑Model Training
Primus streamlines LLM training on AMD GPUs with unified configs, multi-backend support, preflight validation, and structured logging.
High-Accuracy MXFP4, MXFP6, and Mixed-Precision Models on AMD GPUs
Learn to leverage AMD Quark for efficient MXFP4/MXFP6 quantization on AMD Instinct accelerators with high accuracy retention.
Nitro-E: A 304M Diffusion Transformer Model for High Quality Image Generation
Nitro-E is an extremely lightweight diffusion transformer model for high-quality image generation with only 304M parameters.
STX-B0T: Real-time AI Robot Assistant Powered by RyzenAI and ROCm
STX-B0T explores the potential of RyzenAI PCs to power robotics applications on NPU+GPU. This blog demonstrates how our hardware and software interoperate to unlock real-time perception.