Developers Blogs - Page 2
Building Robotics Applications with Ryzen AI and ROS 2
This blog post walks through deploying a robotics application on an AI PC integrated with ROS, the Robot Operating System. We showcase the Ryzen AI CVML Library for perception tasks such as depth estimation and develop a custom ROS 2 node that integrates easily with the ROS ecosystem and standard components.
Digital Twins on AMD: Building Robotic Simulations Using Edge AI PCs
Explore how Ryzen AI MAX enables robotic simulation on a single AI PC and take your first step into digital twins.
Resilient Large-Scale Training: Integrating TorchFT with TorchTitan on AMD GPUs
Achieve resilient, checkpoint-less distributed training on AMD GPUs by integrating TorchFT with TorchTitan on Primus-SaFE.
Accelerating Graph Layout with AI and ROCm on AMD GPUs
A case study of using AI coding agents to optimize graph layout on GPUs.
Debugging NaN Results in CK Tile GEMM: A rocgdb Detective Story
Learn GPU kernel debugging with rocgdb through a real case: tracing NaN outputs to a one-character typo in CK Tile GEMM.
ROCm 7.2: Smarter, Faster, and More Scalable for Modern AI Workloads
We highlight the latest ROCm 7.2 enhancements for AMD Instinct GPUs, designed to boost AI and HPC performance.
ROCm Becomes a First-Class Platform in the vLLM Ecosystem
ROCm is now a first-class vLLM platform: official wheels + Docker, stronger CI, and faster LLM & multimodal inference on AMD Instinct GPUs.
Installing AMD HIP-Enabled GROMACS on HPC Systems: A LUMI Supercomputer Case Study
Bridging the Last Mile: Deploying Hummingbird-XT for Efficient Video Generation on AMD Consumer-Grade Platforms
Learn how to use the Hummingbird-XT and Hummingbird-XTX models to generate videos. Explore the video diffusion model acceleration solution, including the DiT distillation method and the light VAE model.
Accelerating Multimodal Inference in vLLM: The One-Line Optimization for Large Multimodal Models
Learn how to optimize multimodal model inference with batch-level data parallelism for vision encoders in vLLM, achieving up to 45% throughput gains on AMD MI300X.
GEAK HIP: Expanding GEAK for HIP Code Optimization
Explore the GEAK framework's AI-driven HIP code optimization for improved performance on AMD GPUs, including speedup examples and benefits for AI workloads.
A Step-by-Step Walkthrough of Decentralized LLM Training on AMD GPUs
Learn how to train LLMs across decentralized clusters on AMD Instinct MI300 GPUs with DiLoCo and Prime—scale beyond one datacenter.