Posts tagged AI/ML

13 November 2024 - SGLang: Fast Serving Framework for Large Language and Vision-Language Models on AMD GPUs

13 November 2024 - Quantized 8-bit LLM training and inference using bitsandbytes on AMD GPUs

01 November 2024 - Distributed Data Parallel Training on AMD GPU with ROCm

24 October 2024 - Torchtune on AMD GPUs How-To Guide: Fine-tuning and Scaling LLMs with Multi-GPU Power

24 October 2024 - CTranslate2: Efficient Inference with Transformer Models on AMD GPUs

23 October 2024 - Inference with Llama 3.2 Vision LLMs on AMD GPUs Using ROCm

15 October 2024 - Speed Up Text Generation with Speculative Sampling on AMD GPUs

15 October 2024 - Multinode Fine-Tuning of Stable Diffusion XL on AMD GPUs with Hugging Face Accelerate and OCI’s Kubernetes Engine (OKE)

11 October 2024 - Enhancing vLLM Inference on AMD GPUs

09 October 2024 - Supercharging JAX with Triton Kernels on AMD GPUs

03 October 2024 - Leaner LLM Inference with INT8 Quantization on AMD GPUs using PyTorch

23 September 2024 - Fine-tuning Llama 3 with Axolotl using ROCm on AMD GPUs

19 September 2024 - Inferencing and serving with vLLM on AMD GPUs

06 September 2024 - Optimize GPT Training: Enabling Mixed Precision Training in JAX using ROCm on AMD GPUs

03 September 2024 - Image Classification with BEiT, MobileNet, and EfficientNet using ROCm on AMD GPUs

28 August 2024 - Benchmarking Machine Learning using ROCm and AMD GPUs: Reproducing Our MLPerf Inference Submission

21 August 2024 - Performing natural language processing tasks with LLMs on ROCm running on AMD GPUs

19 August 2024 - Using AMD GPUs for Enhanced Time Series Forecasting with Transformers

09 August 2024 - Inferencing with Grok-1 on AMD GPUs

29 July 2024 - Optimizing RoBERTa: Fine-Tuning with Mixed Precision on AMD

22 July 2024 - Using statistical methods to reliably compare algorithm performance in large generative AI models with JAX Profiler on AMD GPUs

11 July 2024 - DBRX Instruct on AMD GPUs

11 July 2024 - Accelerate PyTorch Models using torch.compile on AMD GPUs with ROCm

02 July 2024 - A Guide to Implementing and Training Generative Pre-trained Transformers (GPT) in JAX on AMD GPUs

28 June 2024 - Mamba on AMD GPUs with ROCm

27 June 2024 - Fine-tuning and Testing Cutting-Edge Speech Models using ROCm on AMD GPUs

18 June 2024 - TensorFlow Profiler in practice: Optimizing TensorFlow models on AMD GPUs

04 June 2024 - Segment Anything with AMD GPUs

29 May 2024 - Unveiling performance insights with PyTorch Profiler on an AMD GPU

23 May 2024 - Panoptic segmentation and instance segmentation with Detectron2 on AMD GPUs

15 May 2024 - Accelerating Large Language Models with Flash Attention on AMD GPUs

01 May 2024 - Step-by-Step Guide to Use OpenLLM on AMD GPUs

01 May 2024 - Inferencing with Mixtral 8x22B on AMD GPUs

30 April 2024 - Training a Neural Collaborative Filtering (NCF) Recommender on an AMD GPU

26 April 2024 - Table Question-Answering with TaPas

26 April 2024 - Multimodal (Visual and Language) understanding with LLaVA-NeXT

24 April 2024 - Unlocking Vision-Text Dual-Encoding: Multi-GPU Training of a CLIP-Like Model

24 April 2024 - Transforming Words into Motion: A Guide to Video Generation with AMD GPU

17 April 2024 - Inferencing with AI2’s OLMo model on AMD GPU

16 April 2024 - Text Summarization with FLAN-T5

16 April 2024 - Speech-to-Text on an AMD GPU with Whisper

16 April 2024 - PyTorch C++ Extension on AMD GPU

16 April 2024 - Programming AMD GPUs with Julia

16 April 2024 - Program Synthesis with CodeGen

16 April 2024 - Interacting with Contrastive Language-Image Pre-Training (CLIP) model on AMD GPU

16 April 2024 - Instruction fine-tuning of StarCoder with PEFT on multiple AMD GPUs

15 April 2024 - Enhancing LLM Accessibility: A Deep Dive into QLoRA Through Fine-tuning Llama Model on a single AMD GPU

15 April 2024 - Enhancing LLM Accessibility: A Deep Dive into QLoRA Through Fine-tuning Llama 2 on a single AMD GPU

15 April 2024 - Developing Triton Kernels on AMD GPUs

11 April 2024 - GPU Unleashed: Training Reinforcement Learning Agents with Stable Baselines3 on an AMD GPU in Gymnasium Environment

09 April 2024 - ResNet for image classification using AMD GPUs

08 April 2024 - Small language models with Phi-2

04 April 2024 - Using the ChatGLM-6B bilingual language model with AMD GPUs

04 April 2024 - Total body segmentation using MONAI Deploy on an AMD GPU

04 April 2024 - Retrieval Augmented Generation (RAG) using LlamaIndex

04 April 2024 - Image classification using Vision Transformer with AMD GPUs

04 April 2024 - Building semantic search with SentenceTransformers on AMD

01 April 2024 - Scale AI applications with Ray

29 March 2024 - Automatic mixed precision in PyTorch using AMD GPUs

15 March 2024 - Large language model inference optimizations on AMD GPUs

12 March 2024 - Building a decoder transformer model on AMD GPU(s)

11 March 2024 - Question-answering Chatbot with LangChain on an AMD GPU

08 March 2024 - Music Generation With MusicGen on an AMD GPU

23 February 2024 - Efficient image generation with Stable Diffusion models and ONNX Runtime using AMD GPUs

08 February 2024 - Simplifying deep learning: A guide to PyTorch Lightning

07 February 2024 - Two-dimensional images to three-dimensional scene mapping using NeRF on an AMD GPU

05 February 2024 - Using LoRA for efficient fine-tuning: Fundamental principles

01 February 2024 - Fine-tune Llama model with LoRA: Customizing a large language model for question-answering

01 February 2024 - Fine-tune Llama 2 with LoRA: Customizing a large language model for question-answering

29 January 2024 - Pre-training BERT using Hugging Face & TensorFlow on an AMD GPU

26 January 2024 - Pre-training BERT using Hugging Face & PyTorch on an AMD GPU

26 January 2024 - Accelerating XGBoost with Dask using multiple AMD GPUs

25 January 2024 - LLM distributed supervised fine-tuning with JAX

24 January 2024 - Pre-training a large language model with Megatron-DeepSpeed on multiple AMD GPUs

24 January 2024 - Efficient image generation with Stable Diffusion models and AITemplate using AMD GPUs

24 January 2024 - Efficient deployment of large language models with Text Generation Inference on AMD GPUs

11 September 2023 - Creating a PyTorch/TensorFlow code environment on AMD GPUs

Posts tagged Fine-Tuning

13 November 2024 - Quantized 8-bit LLM training and inference using bitsandbytes on AMD GPUs

23 October 2024 - Inference with Llama 3.2 Vision LLMs on AMD GPUs Using ROCm

15 October 2024 - Multinode Fine-Tuning of Stable Diffusion XL on AMD GPUs with Hugging Face Accelerate and OCI’s Kubernetes Engine (OKE)

26 April 2024 - Table Question-Answering with TaPas

26 April 2024 - Multimodal (Visual and Language) understanding with LLaVA-NeXT

24 April 2024 - Unlocking Vision-Text Dual-Encoding: Multi-GPU Training of a CLIP-Like Model

16 April 2024 - Text Summarization with FLAN-T5

16 April 2024 - Instruction fine-tuning of StarCoder with PEFT on multiple AMD GPUs

15 April 2024 - Enhancing LLM Accessibility: A Deep Dive into QLoRA Through Fine-tuning Llama Model on a single AMD GPU

15 April 2024 - Enhancing LLM Accessibility: A Deep Dive into QLoRA Through Fine-tuning Llama 2 on a single AMD GPU

08 April 2024 - Small language models with Phi-2

01 April 2024 - Scale AI applications with Ray

15 March 2024 - Large language model inference optimizations on AMD GPUs

12 March 2024 - Building a decoder transformer model on AMD GPU(s)

11 March 2024 - Question-answering Chatbot with LangChain on an AMD GPU

08 March 2024 - Music Generation With MusicGen on an AMD GPU

08 February 2024 - Simplifying deep learning: A guide to PyTorch Lightning

05 February 2024 - Using LoRA for efficient fine-tuning: Fundamental principles

01 February 2024 - Fine-tune Llama model with LoRA: Customizing a large language model for question-answering

01 February 2024 - Fine-tune Llama 2 with LoRA: Customizing a large language model for question-answering

29 January 2024 - Pre-training BERT using Hugging Face & TensorFlow on an AMD GPU

26 January 2024 - Pre-training BERT using Hugging Face & PyTorch on an AMD GPU

25 January 2024 - LLM distributed supervised fine-tuning with JAX

24 January 2024 - Pre-training a large language model with Megatron-DeepSpeed on multiple AMD GPUs

Posts tagged GenAI

13 November 2024 - SGLang: Fast Serving Framework for Large Language and Vision-Language Models on AMD GPUs

01 November 2024 - Distributed Data Parallel Training on AMD GPU with ROCm

24 October 2024 - CTranslate2: Efficient Inference with Transformer Models on AMD GPUs

23 October 2024 - Inference with Llama 3.2 Vision LLMs on AMD GPUs Using ROCm

15 October 2024 - Speed Up Text Generation with Speculative Sampling on AMD GPUs

15 October 2024 - Multinode Fine-Tuning of Stable Diffusion XL on AMD GPUs with Hugging Face Accelerate and OCI’s Kubernetes Engine (OKE)

03 October 2024 - Leaner LLM Inference with INT8 Quantization on AMD GPUs using PyTorch

06 September 2024 - Optimize GPT Training: Enabling Mixed Precision Training in JAX using ROCm on AMD GPUs

11 July 2024 - Accelerate PyTorch Models using torch.compile on AMD GPUs with ROCm

03 July 2024 - Accelerating models on ROCm using PyTorch TunableOp

02 July 2024 - A Guide to Implementing and Training Generative Pre-trained Transformers (GPT) in JAX on AMD GPUs

28 June 2024 - Mamba on AMD GPUs with ROCm

24 April 2024 - Unlocking Vision-Text Dual-Encoding: Multi-GPU Training of a CLIP-Like Model

24 April 2024 - Transforming Words into Motion: A Guide to Video Generation with AMD GPU

17 April 2024 - Inferencing with AI2’s OLMo model on AMD GPU

16 April 2024 - Program Synthesis with CodeGen

16 April 2024 - Interacting with Contrastive Language-Image Pre-Training (CLIP) model on AMD GPU

16 April 2024 - Instruction fine-tuning of StarCoder with PEFT on multiple AMD GPUs

15 April 2024 - Enhancing LLM Accessibility: A Deep Dive into QLoRA Through Fine-tuning Llama Model on a single AMD GPU

15 April 2024 - Enhancing LLM Accessibility: A Deep Dive into QLoRA Through Fine-tuning Llama 2 on a single AMD GPU

04 April 2024 - Image classification using Vision Transformer with AMD GPUs

04 April 2024 - Building semantic search with SentenceTransformers on AMD

01 April 2024 - Scale AI applications with Ray

15 March 2024 - Large language model inference optimizations on AMD GPUs

08 March 2024 - Music Generation With MusicGen on an AMD GPU

23 February 2024 - Efficient image generation with Stable Diffusion models and ONNX Runtime using AMD GPUs

07 February 2024 - Two-dimensional images to three-dimensional scene mapping using NeRF on an AMD GPU

05 February 2024 - Using LoRA for efficient fine-tuning: Fundamental principles

01 February 2024 - Fine-tune Llama model with LoRA: Customizing a large language model for question-answering

01 February 2024 - Fine-tune Llama 2 with LoRA: Customizing a large language model for question-answering

29 January 2024 - Pre-training BERT using Hugging Face & TensorFlow on an AMD GPU

26 January 2024 - Pre-training BERT using Hugging Face & PyTorch on an AMD GPU

25 January 2024 - LLM distributed supervised fine-tuning with JAX

24 January 2024 - Efficient image generation with Stable Diffusion models and AITemplate using AMD GPUs

24 January 2024 - Efficient deployment of large language models with Text Generation Inference on AMD GPUs

Posts tagged HPC

13 November 2024 - Introducing AMD’s Next-Gen Fortran Compiler

17 September 2024 - Getting to Know Your GPU: A Deep Dive into AMD SMI

10 September 2024 - Introducing the AMD ROCm™ Offline Installer Creator: Simplifying Deployment for AI and HPC

29 August 2024 - Seismic stencil codes - part 3

29 August 2024 - Seismic stencil codes - part 2

29 August 2024 - Seismic stencil codes - part 1

29 July 2024 - Graph analytics on AMD GPUs using Gunrock

18 June 2024 - TensorFlow Profiler in practice: Optimizing TensorFlow models on AMD GPUs

13 May 2024 - Reading AMD GPU ISA

07 May 2024 - AMD in Action: Unveiling the Power of Application Tracing and Profiling

26 April 2024 - Application portability with HIP

18 April 2024 - C++17 parallel algorithms and HIPSTDPAR

16 April 2024 - Programming AMD GPUs with Julia

16 April 2024 - Affinity part 2 - System topology and controlling affinity

16 April 2024 - Affinity part 1 - Affinity, placement, and order

03 November 2023 - Sparse matrix vector multiplication - part 1

15 September 2023 - Jacobi Solver with HIP and OpenMP offloading

18 July 2023 - Finite difference method - Laplacian part 4

08 June 2023 - GPU-aware MPI with ROCm

17 May 2023 - Register pressure in AMD CDNA™2 GPUs

11 May 2023 - Finite difference method - Laplacian part 3

12 April 2023 - Introduction to profiling tools for AMD hardware

09 March 2023 - AMD Instinct™ MI200 GPU memory space overview

26 January 2023 - AMD ROCm™ installation

04 January 2023 - Finite difference method - Laplacian part 2

14 November 2022 - Finite difference method - Laplacian part 1

14 November 2022 - AMD matrix cores

Posts tagged LLM

13 November 2024 - SGLang: Fast Serving Framework for Large Language and Vision-Language Models on AMD GPUs

13 November 2024 - Quantized 8-bit LLM training and inference using bitsandbytes on AMD GPUs

01 November 2024 - Distributed Data Parallel Training on AMD GPU with ROCm

24 October 2024 - Torchtune on AMD GPUs How-To Guide: Fine-tuning and Scaling LLMs with Multi-GPU Power

24 October 2024 - CTranslate2: Efficient Inference with Transformer Models on AMD GPUs

23 October 2024 - Inference with Llama 3.2 Vision LLMs on AMD GPUs Using ROCm

11 October 2024 - Enhancing vLLM Inference on AMD GPUs

09 October 2024 - Supercharging JAX with Triton Kernels on AMD GPUs

03 October 2024 - Leaner LLM Inference with INT8 Quantization on AMD GPUs using PyTorch

23 September 2024 - Fine-tuning Llama 3 with Axolotl using ROCm on AMD GPUs

19 September 2024 - Inferencing and serving with vLLM on AMD GPUs

06 September 2024 - Optimize GPT Training: Enabling Mixed Precision Training in JAX using ROCm on AMD GPUs

28 August 2024 - Benchmarking Machine Learning using ROCm and AMD GPUs: Reproducing Our MLPerf Inference Submission

21 August 2024 - Performing natural language processing tasks with LLMs on ROCm running on AMD GPUs

19 August 2024 - Using AMD GPUs for Enhanced Time Series Forecasting with Transformers

09 August 2024 - Inferencing with Grok-1 on AMD GPUs

29 July 2024 - Optimizing RoBERTa: Fine-Tuning with Mixed Precision on AMD

11 July 2024 - Accelerate PyTorch Models using torch.compile on AMD GPUs with ROCm

03 July 2024 - Accelerating models on ROCm using PyTorch TunableOp

02 July 2024 - A Guide to Implementing and Training Generative Pre-trained Transformers (GPT) in JAX on AMD GPUs

28 June 2024 - Mamba on AMD GPUs with ROCm

27 June 2024 - Fine-tuning and Testing Cutting-Edge Speech Models using ROCm on AMD GPUs

31 May 2024 - SmoothQuant model inference on AMD Instinct MI300X using Composable Kernel

15 May 2024 - Accelerating Large Language Models with Flash Attention on AMD GPUs

01 May 2024 - Step-by-Step Guide to Use OpenLLM on AMD GPUs

01 May 2024 - Inferencing with Mixtral 8x22B on AMD GPUs

26 April 2024 - Table Question-Answering with TaPas

26 April 2024 - Multimodal (Visual and Language) understanding with LLaVA-NeXT

24 April 2024 - Unlocking Vision-Text Dual-Encoding: Multi-GPU Training of a CLIP-Like Model

17 April 2024 - Inferencing with AI2’s OLMo model on AMD GPU

15 April 2024 - Enhancing LLM Accessibility: A Deep Dive into QLoRA Through Fine-tuning Llama Model on a single AMD GPU

15 April 2024 - Enhancing LLM Accessibility: A Deep Dive into QLoRA Through Fine-tuning Llama 2 on a single AMD GPU

04 April 2024 - Using the ChatGLM-6B bilingual language model with AMD GPUs

04 April 2024 - Retrieval Augmented Generation (RAG) using LlamaIndex

04 April 2024 - Building semantic search with SentenceTransformers on AMD

01 April 2024 - Scale AI applications with Ray

15 March 2024 - Large language model inference optimizations on AMD GPUs

05 February 2024 - Using LoRA for efficient fine-tuning: Fundamental principles

01 February 2024 - Fine-tune Llama model with LoRA: Customizing a large language model for question-answering

01 February 2024 - Fine-tune Llama 2 with LoRA: Customizing a large language model for question-answering

29 January 2024 - Pre-training BERT using Hugging Face & TensorFlow on an AMD GPU

26 January 2024 - Pre-training BERT using Hugging Face & PyTorch on an AMD GPU

26 January 2024 - Accelerating XGBoost with Dask using multiple AMD GPUs

25 January 2024 - LLM distributed supervised fine-tuning with JAX

24 January 2024 - Pre-training a large language model with Megatron-DeepSpeed on multiple AMD GPUs

24 January 2024 - Efficient deployment of large language models with Text Generation Inference on AMD GPUs

Posts tagged PyTorch

13 November 2024 - SGLang: Fast Serving Framework for Large Language and Vision-Language Models on AMD GPUs

13 November 2024 - Quantized 8-bit LLM training and inference using bitsandbytes on AMD GPUs

01 November 2024 - Distributed Data Parallel Training on AMD GPU with ROCm

24 October 2024 - Torchtune on AMD GPUs How-To Guide: Fine-tuning and Scaling LLMs with Multi-GPU Power

24 October 2024 - CTranslate2: Efficient Inference with Transformer Models on AMD GPUs

15 October 2024 - Speed Up Text Generation with Speculative Sampling on AMD GPUs

03 October 2024 - Leaner LLM Inference with INT8 Quantization on AMD GPUs using PyTorch

23 September 2024 - Fine-tuning Llama 3 with Axolotl using ROCm on AMD GPUs

03 September 2024 - Image Classification with BEiT, MobileNet, and EfficientNet using ROCm on AMD GPUs

19 August 2024 - Using AMD GPUs for Enhanced Time Series Forecasting with Transformers

29 July 2024 - Optimizing RoBERTa: Fine-Tuning with Mixed Precision on AMD

11 July 2024 - DBRX Instruct on AMD GPUs

11 July 2024 - Accelerate PyTorch Models using torch.compile on AMD GPUs with ROCm

03 July 2024 - Accelerating models on ROCm using PyTorch TunableOp

02 July 2024 - A Guide to Implementing and Training Generative Pre-trained Transformers (GPT) in JAX on AMD GPUs

28 June 2024 - Mamba on AMD GPUs with ROCm

28 June 2024 - Deep Learning Recommendation Models on AMD GPUs

27 June 2024 - Fine-tuning and Testing Cutting-Edge Speech Models using ROCm on AMD GPUs

29 May 2024 - Unveiling performance insights with PyTorch Profiler on an AMD GPU

23 May 2024 - Panoptic segmentation and instance segmentation with Detectron2 on AMD GPUs

15 May 2024 - Accelerating Large Language Models with Flash Attention on AMD GPUs

26 April 2024 - Table Question-Answering with TaPas

26 April 2024 - Multimodal (Visual and Language) understanding with LLaVA-NeXT

24 April 2024 - Unlocking Vision-Text Dual-Encoding: Multi-GPU Training of a CLIP-Like Model

24 April 2024 - Transforming Words into Motion: A Guide to Video Generation with AMD GPU

17 April 2024 - Inferencing with AI2’s OLMo model on AMD GPU

16 April 2024 - Text Summarization with FLAN-T5

16 April 2024 - PyTorch C++ Extension on AMD GPU

16 April 2024 - Program Synthesis with CodeGen

16 April 2024 - Instruction fine-tuning of StarCoder with PEFT on multiple AMD GPUs

11 April 2024 - GPU Unleashed: Training Reinforcement Learning Agents with Stable Baselines3 on an AMD GPU in Gymnasium Environment

09 April 2024 - ResNet for image classification using AMD GPUs

08 April 2024 - Small language models with Phi-2

04 April 2024 - Using the ChatGLM-6B bilingual language model with AMD GPUs

04 April 2024 - Total body segmentation using MONAI Deploy on an AMD GPU

29 March 2024 - Automatic mixed precision in PyTorch using AMD GPUs

12 March 2024 - Building a decoder transformer model on AMD GPU(s)

11 March 2024 - Question-answering Chatbot with LangChain on an AMD GPU

08 March 2024 - Music Generation With MusicGen on an AMD GPU

23 February 2024 - Efficient image generation with Stable Diffusion models and ONNX Runtime using AMD GPUs

08 February 2024 - Simplifying deep learning: A guide to PyTorch Lightning

07 February 2024 - Two-dimensional images to three-dimensional scene mapping using NeRF on an AMD GPU

05 February 2024 - Using LoRA for efficient fine-tuning: Fundamental principles

26 January 2024 - Pre-training BERT using Hugging Face & PyTorch on an AMD GPU

24 January 2024 - Pre-training a large language model with Megatron-DeepSpeed on multiple AMD GPUs

11 September 2023 - Creating a PyTorch/TensorFlow code environment on AMD GPUs