Posts tagged Serving

Enhancing vLLM Inference on AMD GPUs

11 October, 2024 by Clint Greene.

Read more ...


Inferencing and serving with vLLM on AMD GPUs

19 September, 2024 by Clint Greene.

Read more ...


Step-by-Step Guide to Use OpenLLM on AMD GPUs

OpenLLM is an open-source platform for deploying and serving large language models (LLMs), supporting a wide range of models in both cloud and on-premises environments. In this tutorial, we walk through starting an LLM server with OpenLLM and interacting with it from your local machine, with special emphasis on leveraging AMD GPUs.

Read more ...
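As a rough sketch of the interaction described above: once an OpenLLM server is running, it exposes an OpenAI-compatible HTTP API that you can query from your local machine. The base URL, port, and model name below are assumptions for illustration, not values from the tutorial itself.

```python
# Hedged sketch: building a chat-completions request for an
# OpenAI-compatible endpoint, such as one served by OpenLLM.
# The host, port (3000), and model name are assumptions.
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Construct a POST request to the /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Example usage (sending the request requires a running server):
req = build_chat_request("http://localhost:3000", "my-model", "Hello!")
```

Sending the request with `urllib.request.urlopen(req)` would return a JSON chat-completion response, assuming a compatible server is listening at that address.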