Posts by Daniel Gustafsson
Efficient GPU Utilization With Workload Pre-Emption in AMD Resource Manager
- 26 June 2026
GPU capacity is in high demand. Production inference services, fine-tuning jobs, and developer workspaces like VS Code or JupyterLab all compete for the same resources. The challenge is not just provisioning enough GPUs; it is keeping them utilized and making sure prioritized work can access capacity when it needs it. Training jobs can drop to near-zero utilization between compute phases; inference services can go quiet between traffic bursts; R&D or experimentation models and development workspaces might be left running idle or after hours. As a result, workloads hold on to GPUs they are no longer using while other work sits queued.
Adapting AIM LLMs For Specific Use Cases Through Fine-Tuning in AMD AI Workbench
- 03 June 2026
In this blog, you will learn how to fine-tune a pre-trained Large Language Model (LLM) with AMD AI Workbench without writing a single line of code and then deploy it using AMD Inference Microservices (AIMs). Rather than training a model from scratch, fine-tuning allows you to adapt a pre-trained model to your specific use case. In addition, AIMs provide standardized, portable inference microservices for serving AI models. AIMs abstract away the complexities involved in model serving by providing an intelligent orchestration layer that automatically configures runtime environments, detects available accelerators, and selects an optimized performance profile (configuration parameters for the inference engine).
Deploy and Customize AMD Solution Blueprints
- 02 April 2026
AMD Solution Blueprints are ready-to-deploy, customizable reference applications built with AMD Inference Microservices (AIMs). They offer a microservice solution for a range of use cases, from standard chat interfaces to agentic frameworks, serving as both starting points for development and example implementations.
Leveraging AMD AI Workbench and Autoscaling to Scale LLM Inference for Optimal Resource Utilization
- 31 March 2026
Explore how autoscaling with AMD Inference Microservices (AIMs) and AMD AI Workbench can automatically scale your resources in response to shifting AI workload demand. AI inference can be computationally intensive, with resource requirements that vary depending on traffic, that is, the number of inference requests your workload receives at any given time. Autoscaling addresses this by scaling resources up during peak traffic to maintain performance, and scaling them back down during quieter periods to reduce cost and resource consumption.
Getting Started with AMD Resource Manager: Efficient Sharing of AMD Instinct™ GPUs for R&D Teams and AI Practitioners
- 24 February 2026
In this blog, you will learn how to use AMD Resource Manager and its components for centralized AI infrastructure governance. It’s part of the AMD Enterprise AI Suite, a full-stack solution for developing, deploying and running AI workloads on a Kubernetes platform designed to support AMD compute. AMD Resource Manager provides a user-friendly graphical user interface (GUI) and command-line interface (CLI) with a unified control plane that simplifies tasks such as managing compute clusters and user access, monitoring resource utilization, and allocating the right compute quotas to the right projects.
Getting Started with AMD AI Workbench: Deploying and Managing AI Workloads
- 19 December 2025
In this blog, you will learn how to use AMD AI Workbench, an easy-to-use graphical user interface (GUI) with command-line interface (CLI) support for running and managing AI workloads. It’s part of the AMD Enterprise AI Suite, a full-stack solution for developing, deploying and running AI workloads on a Kubernetes platform designed to support AMD compute. AMD AI Workbench is designed to offer AI developers accelerated end-to-end AI development, from experimentation to production deployment.