Recent Posts - Page 4
Utilizing AMD Instinct GPU Accelerators for Weather and Precipitation Forecasting with NeuralGCM
A showcase of how to run NeuralGCM, a hybrid general circulation model (GCM), on AMD Instinct hardware, including an introduction, installation, inference, and plotting.
Multi-Node Distributed Inference for Diffusion Models with xDiT
Follow a tutorial on multi-node video generation with diffusion models, covering scaling considerations and a practical Docker-based example.
GROMACS Performance on AMD Instinct MI355X
Explore GROMACS molecular dynamics performance benchmarks on AMD Instinct MI355X GPUs with HIP acceleration.
FP8 GEMM Optimization on AMD CDNA™4 Architecture
Learn how to build high-performance FP8 GEMM kernels on AMD CDNA™4 GPUs using MFMA, LDS swizzling, and double-buffering.
Agentic Diagnosis for LLM Training at Scale
Explore how AI agents diagnose LLM training incidents — from RCCL hangs to throughput regressions — in one prompt with MaxText-Slurm.
Getting Started with ComfyUI on AMD Radeon™ RX 9000 Series GPUs
Learn how to set up and optimize ComfyUI on AMD Radeon RX 9000 GPUs with ROCm 7.1 — solve common issues and start generating.
HPC Coding Agent - Part 3: MCP Tool for Profiling
Build an AI agent specialized in optimizing HPC workloads by connecting a Cline agent to expert-level AMD profiling tools via a custom MCP server.
Fine-Tuning AI Surrogate Models for Physics Simulations with Walrus on AMD Instinct GPU Accelerators
A showcase of fine-tuning the foundational physics simulation model Walrus on a new physics dataset using AMD Instinct hardware.
Ensemble High-Resolution Weather Forecasting on AMD Instinct GPU Accelerators
A discussion on ensembling in weather forecasting, and a guide on how to run forecasting ensembles on AMD GPUs.
HPC Coding Agent - Part 2: An MCP Tool for Code Optimization with OpenEvolve
Learn how to use OpenEvolve as an MCP tool with an AI agent for agentic code optimization.
MaxText-Slurm: Production-Grade LLM Training with Built-In Observability
MaxText-Slurm: A unified launch system for production-grade LLM training with observability on AMD GPU clusters.
Streamlining Recommendation Model Training on AMD Instinct™ GPUs
Explore how the ROCm training Docker image can be used for recommendation model training on Instinct GPUs, along with a guide on configuring the workload.