Posts by Phillip Dang

DBRX Instruct on AMD GPUs

In this blog, we showcase DBRX Instruct, a mixture-of-experts large language model developed by Databricks, running on a ROCm-capable system with AMD GPUs.

Read more ...


Deep Learning Recommendation Models on AMD GPUs

28 June 2024, by Phillip Dang.

Read more ...


Unveiling performance insights with PyTorch Profiler on an AMD GPU

29 May 2024, by Phillip Dang.

Read more ...


Table Question-Answering with TaPas

26 Apr 2024, by Phillip Dang.

Read more ...


Multimodal (Visual and Language) understanding with LLaVA-NeXT

26 Apr 2024, by Phillip Dang.

Read more ...


Text Summarization with FLAN-T5

In this blog, we showcase the language model FLAN-T5 and how to fine-tune it on a summarization task with Hugging Face on an AMD GPU + ROCm system.
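
A minimal sketch of running FLAN-T5 on a summarization prompt with Hugging Face Transformers; the checkpoint (`google/flan-t5-base`) and generation settings are illustrative assumptions, and the full post also covers fine-tuning:

```python
# Minimal sketch: summarization with FLAN-T5 via Hugging Face Transformers.
# The checkpoint and generation settings are assumptions for illustration.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# ROCm builds of PyTorch expose AMD GPUs through the "cuda" device name.
device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base").to(device)

text = "summarize: The ROCm platform lets PyTorch run on AMD GPUs, so Hugging Face models can be trained and served without code changes."
inputs = tokenizer(text, return_tensors="pt", truncation=True).to(device)
summary_ids = model.generate(**inputs, max_new_tokens=60)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```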

Read more ...


Program Synthesis with CodeGen

CodeGen is a family of standard transformer-based auto-regressive language models for program synthesis, which the authors define as a method for generating computer programs that solve specified problems, using input-output examples or natural language descriptions.
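
A minimal sketch of program synthesis from a natural-language prompt with CodeGen; the checkpoint (`Salesforce/codegen-350M-mono`) and decoding settings are illustrative assumptions, not necessarily those used in the post:

```python
# Minimal sketch: generate code from a natural-language description with CodeGen.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen-350M-mono")
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-350M-mono").to(device)

# Natural-language description plus a function signature as the prompt.
prompt = "# Python function that returns the n-th Fibonacci number\ndef fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt").to(device)
out = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```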

Read more ...


Small language models with Phi-2

Like many other LLMs, Phi-2 is a transformer-based model with a next-word prediction objective that is trained on billions of tokens. At 2.7 billion parameters, Phi-2 is a relatively small language model, but it achieves outstanding performance on a variety of tasks, including common sense reasoning, language understanding, math, and coding. For reference, GPT 3.5 has 175 billion parameters and the smallest version of LLaMA-2 has 7 billion parameters. According to Microsoft, Phi-2 is capable of matching or outperforming models up to 25 times larger due to more carefully curated training data and model scaling.
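
A minimal sketch of next-word-prediction inference with Phi-2; loading in half precision and the "Instruct/Output" prompt format are illustrative assumptions:

```python
# Minimal sketch: text generation with Phi-2 (a 2.7B-parameter causal LM).
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2")
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2", torch_dtype=dtype).to(device)

prompt = "Instruct: Explain what a mixture-of-experts model is.\nOutput:"
inputs = tokenizer(prompt, return_tensors="pt").to(device)
out = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```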

Read more ...


Using the ChatGLM-6B bilingual language model with AMD GPUs

ChatGLM-6B is an open bilingual (Chinese-English) language model with 6.2 billion parameters. It’s optimized for Chinese conversation based on General Language Model (GLM) architecture. GLM is a pretraining framework that seeks to combine the strengths of autoencoder models (like BERT) and autoregressive models (like GPT). The GLM framework randomly blanks out continuous spans of tokens from the input text (autoencoding methodology) and trains the model to sequentially reconstruct the spans (autoregressive pretraining methodology).
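
A toy illustration of the span-blanking idea described above, not the actual GLM implementation: a contiguous span is masked out of the input (the autoencoding side), and the training target is to regenerate that span token by token (the autoregressive side):

```python
# Toy illustration of GLM-style span corruption (not the real GLM code).
import random

tokens = ["ChatGLM", "is", "an", "open", "bilingual", "language", "model"]

# Blank out a random contiguous span, as in the autoencoding-style corruption step.
start = random.randrange(len(tokens) - 2)
length = random.randint(1, 2)
span = tokens[start:start + length]
corrupted = tokens[:start] + ["[MASK]"] + tokens[start + length:]

# Autoregressive targets: reconstruct the span left to right, conditioned on the
# corrupted input plus the span tokens generated so far.
targets = [(corrupted + ["[START]"] + span[:i], span[i]) for i in range(len(span))]

print("corrupted input:", corrupted)
for context, next_token in targets:
    print("predict", repr(next_token), "given", context)
```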

Read more ...


Building a decoder transformer model on AMD GPU(s)

12 Mar 2024, by Phillip Dang.

Read more ...


Question-answering Chatbot with LangChain on an AMD GPU

11 Mar 2024, by Phillip Dang.

Read more ...


Music Generation With MusicGen on an AMD GPU

MusicGen is an autoregressive, transformer-based model that predicts the next segment of a piece of music based on previous segments, an approach similar to how language models predict the next token.
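
A minimal sketch of text-conditioned generation with MusicGen through Transformers; the checkpoint (`facebook/musicgen-small`) and token budget are illustrative assumptions. The model autoregressively predicts audio tokens that a neural codec then decodes back into a waveform:

```python
# Minimal sketch: generate a short clip from a text prompt with MusicGen.
import scipy.io.wavfile
from transformers import AutoProcessor, MusicgenForConditionalGeneration

processor = AutoProcessor.from_pretrained("facebook/musicgen-small")
model = MusicgenForConditionalGeneration.from_pretrained("facebook/musicgen-small")

inputs = processor(text=["lo-fi hip hop beat with warm piano"], padding=True, return_tensors="pt")
audio = model.generate(**inputs, max_new_tokens=256)  # roughly five seconds of audio

# Decode rate comes from the audio codec's config; save the clip as a WAV file.
rate = model.config.audio_encoder.sampling_rate
scipy.io.wavfile.write("musicgen_sample.wav", rate=rate, data=audio[0, 0].numpy())
```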

Read more ...


Simplifying deep learning: A guide to PyTorch Lightning

8 Feb 2024, by Phillip Dang.

Read more ...