AI Blogs - Page 12

October 06, 2025

Optimizing FP4 Mixed-Precision Inference with Petit on AMD Instinct MI250 and MI300 GPUs: A Developer’s Perspective

Learn how FP4 mixed-precision on AMD GPUs boosts inference speed and integrates seamlessly with SGLang.

./artificial-intelligence/fp4-mixed-precision/README.html

October 02, 2025

From Ingestion to Inference: RAG Pipelines on AMD GPUs

Build a RAG-enhanced GenAI application that improves the quality of model responses by incorporating data missing from the model's training data.

./artificial-intelligence/rag-agent/README.html

October 01, 2025

GPU Partitioning Made Easy: Pack More AI Workloads Using AMD GPU Operator

What’s New in AMD GPU Operator: Learn About GPU Partitioning and New Kubernetes Features

./software-tools-optimization/gpu-operator-partitioning/README.html

October 01, 2025

Enabling FlashInfer on ROCm for Accelerated LLM Serving

FlashInfer is an open-source library for accelerating LLM serving that is now supported by ROCm.

./artificial-intelligence/flashinfer/README.html

September 30, 2025

Matrix Core Programming on AMD CDNA™3 and CDNA™4 architecture

This blog post explains how to use Matrix Cores on the CDNA3 and CDNA4 architectures, with a focus on low-precision data types such as FP16, FP8, and FP4.

./software-tools-optimization/matrix-cores-cdna/README.html

September 30, 2025

Coding Agents on AMD GPUs: Fast LLM Pipelines for Developers

Accelerate AI-assisted coding with agentic workflows on AMD GPUs. Deploy DeepSeek-V3.1 via SGLang, vLLM, or llama.cpp to power fast, scalable coding agents.

./artificial-intelligence/coding-agent/README.html

September 25, 2025

Day-0 Support for the SGLang-Native RL Framework - slime on AMD Instinct™ GPUs

Learn how to deploy slime on AMD GPUs for high-performance RL training with ROCm optimization.

./artificial-intelligence/slime/README.html

September 24, 2025

Accelerating Audio-Driven Video Generation: WAN2.2-S2V on AMD ROCm

This blog highlights AMD ROCm’s ability to power next-generation audio-to-video models with simple, reproducible workflows.

./artificial-intelligence/audio-driven-videogen/README.html

September 24, 2025

A Simple Design for Serving Video Generation Models with Distributed Inference

Minimalist FastAPI + Redis + Torchrun design for serving video generation models with distributed inference.

./artificial-intelligence/serving-videogen-v1/README.html

September 19, 2025

An Introduction to Primus-Turbo: A Library for Accelerating Transformer Models on AMD GPUs

Primus streamlines training on AMD ROCm, from fine-tuning to massive pretraining on MI300X GPUs: faster, safer, and easier to debug.

./software-tools-optimization/primus-large-models/README.html

September 19, 2025

Optimizing Drug Discovery Tools on AMD MI300X Part 1: Molecular Design with REINVENT

Learn how to set up, run, and optimize REINVENT4, a molecular design tool, on AMD MI300X GPUs for faster drug discovery workflows.

./artificial-intelligence/running-reinvent4-amd/README.html

September 18, 2025

Running SOTA AI-based Weather Forecasting models on AMD Instinct

We look at a few state-of-the-art AI weather forecasting models and demonstrate, step by step, how to run them on AMD Instinct MI300X.

./artificial-intelligence/ai-weather-forecasting/README.html
