Posts by Prakamya Mishra
Building a State-of-the-Art 32 Billion Reasoning Model with Only Synthetic Data on AMD GPUs
- 06 December 2025
Building a state-of-the-art reasoning model is often viewed as a resource-heavy marathon requiring massive compute, months of training, and proprietary datasets. In this blog, we demonstrate how we built a large reasoning model on AMD Instinct™ MI325X GPUs that surpasses the accuracy of the top open models at the 32-billion-parameter scale on mathematics and science benchmarks, using only synthetic data and standard Supervised Fine-Tuning (SFT) on top of older-generation models. In line with AMD’s commitment to open source, we are releasing the model weights, detailed training configurations, datasets, and code, enabling the AI community to collaborate, replicate, and innovate, thereby accelerating progress.
Introducing Instella-Math: Fully Open Language Model with Reasoning Capability
- 09 August 2025
AMD is thrilled to introduce Instella-Math, a reasoning-focused language model that marks a major milestone for AMD: as far as we know, it is the first language model trained with long chain-of-thought reinforcement learning entirely on AMD GPUs. Starting from Instella-3B-Instruct, we extended the model’s capabilities through a multi-stage training pipeline, featuring two stages of supervised fine-tuning and three stages of reinforcement learning with the VERL framework, executed entirely on AMD Instinct™ MI300X GPUs. This blog offers an inside look at the training process and highlights Instella-Math’s performance on challenging reasoning benchmarks, demonstrating the strength of both the model and the hardware behind it.
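The staged recipe above can be sketched as a simple sequential pipeline. This is only an illustrative skeleton: the stage names, datasets, reward labels, and `sft_stage`/`rl_stage` functions are hypothetical placeholders standing in for real SFT and VERL-based RL training runs, not the actual Instella-Math code.

```python
def sft_stage(model: str, dataset: str) -> str:
    """Placeholder for one supervised fine-tuning stage on a given dataset."""
    return f"{model}+sft({dataset})"

def rl_stage(model: str, reward: str) -> str:
    """Placeholder for one RL stage (e.g. run with a VERL-style trainer)."""
    return f"{model}+rl({reward})"

def train_pipeline(base_model: str) -> str:
    """Two SFT stages followed by three RL stages, applied in order."""
    model = base_model
    for data in ["sft_data_stage1", "sft_data_stage2"]:   # hypothetical dataset names
        model = sft_stage(model, data)
    for reward in ["rl_stage1", "rl_stage2", "rl_stage3"]:  # hypothetical reward setups
        model = rl_stage(model, reward)
    return model

final = train_pipeline("Instella-3B-Instruct")
print(final)
```

The point of the sketch is the ordering: each stage consumes the checkpoint produced by the previous one, so the final model carries two rounds of SFT and three rounds of RL on top of the instruct base.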
Introducing Instella-Long: A Fully Open Language Model with Long-Context Capability
- 11 June 2025
AMD is excited to announce Instella-Long, a long-context language model continually trained from Instella-3B-Instruct on AMD Instinct™ MI300X GPUs. To our knowledge, Instella-Long makes the Instella series the first fully open language model trained from scratch that supports long context. Instella-Long supports a 128K context length and achieves competitive performance, outperforming open-weight models such as Phi-3.5-mini [1], Gemma-3-4B [2], and Qwen2.5-3B [3] on long-context benchmarks.
Instella-VL-1B: First AMD Vision Language Model
- 07 March 2025
As part of AMD’s newly released Instella family, we are thrilled to introduce Instella-VL-1B, the first AMD vision language model for image understanding, trained on AMD Instinct™ MI300X GPUs. Our journey with Instella-VL builds upon our previous 1-billion-parameter language model, AMD OLMo SFT. We extend the language model with visual understanding by connecting it to a vision encoder (initialized from CLIP ViT-L/14-336). During training, we jointly fine-tune the vision encoder and the language model with vision-language data in three stages: Alignment, Pretraining, and Supervised Fine-tuning (SFT).
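The encoder-to-LM connection described above can be sketched in a few lines. This is a minimal illustration, not the actual Instella-VL-1B implementation: the feature width (1024 for CLIP ViT-L/14), the language-model hidden size (2048), the 24×24 patch grid from a 336-pixel input, and the single linear projector are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

VISION_DIM = 1024   # assumed CLIP ViT-L/14 patch-feature width
LM_DIM = 2048       # assumed language-model hidden size
NUM_PATCHES = 576   # 336 / 14 = 24 patches per side -> 24 * 24 patch tokens

# Vision encoder output: one feature vector per image patch.
patch_features = rng.standard_normal((NUM_PATCHES, VISION_DIM))

# A learned projector maps vision features into the LM embedding space.
W_proj = rng.standard_normal((VISION_DIM, LM_DIM)) * 0.01
image_tokens = patch_features @ W_proj           # shape (576, 2048)

# Text is embedded as usual; image and text tokens are then concatenated
# into a single sequence that the language model processes jointly.
text_tokens = rng.standard_normal((16, LM_DIM))  # e.g. a 16-token prompt
sequence = np.concatenate([image_tokens, text_tokens], axis=0)

print(sequence.shape)  # (592, 2048)
```

Once image patches live in the same embedding space as text tokens, the three training stages differ mainly in which parameters are unfrozen and which data mixture is used, while this forward path stays the same.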
Introducing Instella: New State-of-the-art Fully Open 3B Language Models
- 05 March 2025
AMD is excited to announce Instella, a family of fully open state-of-the-art 3-billion-parameter language models (LMs) trained from scratch on AMD Instinct™ MI300X GPUs. Instella models outperform existing fully open models of similar sizes and achieve competitive performance compared to state-of-the-art open-weight models such as Llama-3.2-3B, Gemma-2-2B, and Qwen-2.5-3B, including their instruction-tuned counterparts.