Posts by Shaghayegh Roohi

Styled Text Image Generation with Eruku on AMD

Producing text images where text is both readable and controllable while faithfully matching a target visual style is a challenging problem. It has broad applications ranging from synthetic handwritten text generation to graphic design. In these settings, you need more than plausible images; you need precise control over both text content and visual fidelity. This is where Eruku[1] stands out.



3D Scene Reconstruction from the Inside: Explore the Mathematics Behind gsplat

3D Gaussian Splatting (3DGS) reconstructs 3D scenes from multiple 2D images and renders novel views in real time. In this blog, a follow-up to our previous post, Elevating 3D Scene Rendering with GSplat, you will learn the core mathematics and the practical library components behind 3DGS using gsplat.



VLM Fine-Tuning for Robotics on AMD Enterprise AI Suite

Vision-language models (VLMs) power applications from image captioning to robotics instruction following, but full model fine-tuning is resource-intensive and slow. Low-Rank Adaptation (LoRA) offers a faster, more efficient alternative by training only a small set of injected parameters while keeping the base model frozen.
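To make the LoRA idea concrete, here is a minimal pure-Python sketch, not the code from the post. It assumes the common LoRA formulation y = W x + (alpha / r) * B (A x), where W is the frozen base weight and only the low-rank factors A and B are trained. The helper names (`matvec`, `lora_forward`, `lora_param_counts`) and the 4096 x 4096 layer size are illustrative assumptions.

```python
def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def lora_forward(W, A, B, x, alpha, r):
    """Compute y = W x + (alpha / r) * B (A x).

    W (d_out x d_in) is the frozen base weight.
    A (r x d_in) and B (d_out x r) are the injected trainable factors.
    """
    base = matvec(W, x)               # frozen path
    update = matvec(B, matvec(A, x))  # low-rank trainable path
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]


def lora_param_counts(d_in, d_out, r):
    """Return (full fine-tuning params, LoRA params) for one linear layer."""
    full = d_in * d_out
    lora = r * (d_in + d_out)
    return full, lora


# Hypothetical layer size and rank, chosen only for illustration.
full, lora = lora_param_counts(4096, 4096, 8)
print(f"full: {full}, LoRA: {lora}, ratio: {lora / full:.4%}")
```

With B initialized to zeros, as is typical for LoRA, the adapted layer starts out identical to the base layer, and training only ever touches the r * (d_in + d_out) parameters in A and B.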



Elevating 3D Scene Rendering with GSplat

In this blog, we explore how to use GSplat, a GPU-optimized Python library for training and rendering 3DGS models, on AMD devices. This tutorial guides you through training a model of a scene from a set of captured images, which then lets you render novel views of the scene. We use a port of the original GSplat code that has been optimized for AMD GPUs. The examples throughout this blog were trained and rendered on an AMD MI300X GPU.
