AI Blogs - Page 3
Fine-Tuning AI Surrogate Models for Physics Simulations with Walrus on AMD Instinct GPU Accelerators
A showcase of fine-tuning the foundational physics simulation model Walrus on a new physics dataset using AMD Instinct hardware.
Ensemble High-Resolution Weather Forecasting on AMD Instinct GPU Accelerators
A discussion on ensembling in weather forecasting, and a guide on how to run forecasting ensembles on AMD GPUs.
HPC Coding Agent - Part 2: An MCP Tool for Code Optimization with OpenEvolve
Learn how to use OpenEvolve as an MCP tool with an AI agent for agentic code optimization.
MaxText-Slurm: Production-Grade LLM Training with Built-In Observability
A unified launch system for production-grade LLM training with built-in observability on AMD GPU clusters.
Streamlining Recommendation Model Training on AMD Instinct™ GPUs
Explore how the ROCm training Docker image can be used for recommendation model training on AMD Instinct GPUs, along with a guide on configuring the workload.
Exploring Use Cases for Scalable AI: Implementing Ray with ROCm 7 Support for Efficient ML Workflows
Ray with ROCm helps you scale AI applications for training and inference workloads on AMD GPUs.
Getting Started with AMD Resource Manager: Efficient Sharing of AMD Instinct™ GPUs for R&D Teams and AI Practitioners
Learn how to use the AMD Resource Manager with this step-by-step guide to setting up projects, sharing compute resources, and monitoring resource utilization.
JAX-AITER: Bringing AMD’s Optimized AI Kernels to JAX on ROCm™
Use JAX-AITER to run AMD’s AITER-optimized AI kernels from JAX on AMD ROCm, starting with faster multi-head attention and expanding to more ops.
LuminaSFT: Generating Synthetic Fine-Tuning Data for Small Language Models
Learn how task-specific synthetic data can improve small language model performance and explore results from the LuminaSFT study.
PyTorch Offline Tuning with TunableOp
Learn how to accelerate PyTorch workloads with TunableOp offline tuning: record operations, tune them separately, and deploy for faster inference.
Primus-Pipeline: A More Flexible and Scalable Pipeline Parallelism Implementation
Learn how to use our flexible and scalable pipeline parallelism framework with the Primus backend and AMD hardware.
FlyDSL: Expert GPU Kernel Development with the Ease of MLIR Python Native DSL on AMD GPUs
FlyDSL is a Python-first, MLIR-native DSL for expert GPU kernel development and tuning on AMD GPUs.