Recent Posts - Page 19
Introducing Instella: New State-of-the-art Fully Open 3B Language Models
AMD is excited to announce Instella, a family of fully open state-of-the-art 3-billion-parameter language models (LMs). In this blog we explain how the Instella models were trained and how to access them.
Understanding RCCL Bandwidth and xGMI Performance on AMD Instinct™ MI300X
This blog explains the reasons behind RCCL bandwidth limitations and xGMI performance constraints, and provides actionable steps to maximize link efficiency on the AMD Instinct MI300X.
Measuring Max-Achievable FLOPs – Part 2
AMD measures Max-Achievable FLOPs through controlled benchmarking: real-world data patterns, thermally stable devices, and cold-cache testing, revealing how actual performance differs from theoretical peaks.
Deploying Serverless AI Inference on AMD GPU Clusters
This blog walks you through setting up serverless AI inference deployment in a Kubernetes cluster with AMD accelerators, providing a comprehensive guide for deploying and scaling AI inference workloads on serverless infrastructure.
How to Build a vLLM Container for Inference and Benchmarking
This post, the second in a series, provides a walkthrough for building a vLLM container that can be used for both inference and benchmarking.
Unlock DeepSeek-R1 Inference Performance on AMD Instinct™ MI300X GPU
This blog introduces the key performance optimizations made to enable DeepSeek-R1 inference on the AMD Instinct MI300X GPU.
Fine-tuning Phi-3.5-mini LLM at scale: Harnessing Accelerate and Slurm for multinode training
Fine-tuning the Phi-3.5-mini-instruct LLM with multinode distributed training using Hugging Face Accelerate, Slurm, and Docker for scalable, efficient training.
Understanding Peak, Max-Achievable & Delivered FLOPs, Part 1
Part 1 of a series explaining the differences between peak, max-achievable, and delivered FLOPs.
AI Inference Orchestration with Kubernetes on Instinct MI300X, Part 2
This blog is part 2 of a series providing a comprehensive, step-by-step guide for deploying and scaling AI inference workloads with Kubernetes and the AMD GPU Operator on the AMD Instinct platform.
Navigating vLLM Inference with ROCm and Kubernetes
A quick introduction to Kubernetes (K8s) and a step-by-step guide on using K8s to deploy vLLM with ROCm.
MI300A - Exploring the APU advantage
This blog post introduces the MI300A APU hardware, how it differs from discrete GPU systems, and how to leverage it in GPU programming.
Deep dive into the MI300 compute and memory partition modes
This blog explains how to use the MI300 compute and memory partitioning modes to optimize your performance-critical applications.