Posts by Phillip Dang
Table Question-Answering with TaPas
- 26 April 2024
Conventionally, the question-answering task is framed as a semantic parsing task, where the question is translated into a full logical form that can be executed against the table to retrieve the correct answer. However, this requires a lot of annotated logical forms, which can be expensive to acquire. TaPas sidesteps this step by predicting the answer directly: it selects the relevant table cells and, where needed, an aggregation operator to apply to them.
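As a minimal sketch of this cell-selection approach, the Hugging Face `table-question-answering` pipeline can be used with a public TaPas checkpoint (the checkpoint name below is one fine-tuned on WikiTableQuestions and may differ from the one the post uses):

```python
from transformers import pipeline

# Table QA pipeline with a TaPas checkpoint fine-tuned on WikiTableQuestions.
tqa = pipeline("table-question-answering", model="google/tapas-base-finetuned-wtq")

# Tables are passed as a dict of columns; all cell values must be strings.
table = {
    "City": ["Paris", "London", "Berlin"],
    "Population (millions)": ["2.1", "8.8", "3.6"],
}
result = tqa(table=table, query="Which city has the largest population?")
# The result contains the predicted answer plus the selected cell coordinates
# and the aggregation operator the model chose.
print(result)
```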
Multimodal (Visual and Language) understanding with LLaVA-NeXT
- 26 April 2024
LLaVA (Large Language and Vision Assistant) was introduced in 2023 and became a milestone for multimodal models. It combines a pretrained vision encoder with a pretrained LLM for general-purpose visual and language understanding. In January 2024, LLaVA-NeXT was released, boasting significant enhancements, including higher input image resolution and improved logical reasoning and world knowledge.
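As a rough sketch of the vision-encoder-plus-LLM pipeline, assuming the public `llava-hf/llava-v1.6-mistral-7b-hf` checkpoint (the image URL is a placeholder):

```python
import requests
import torch
from PIL import Image
from transformers import LlavaNextProcessor, LlavaNextForConditionalGeneration

model_id = "llava-hf/llava-v1.6-mistral-7b-hf"  # assumed public checkpoint
processor = LlavaNextProcessor.from_pretrained(model_id)
model = LlavaNextForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Placeholder image URL; substitute your own.
image = Image.open(requests.get("https://example.com/cat.png", stream=True).raw)
# Mistral-style chat template; the <image> token marks where visual features go.
prompt = "[INST] <image>\nWhat is shown in this image? [/INST]"

inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(output[0], skip_special_tokens=True))
```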
Text Summarization with FLAN-T5
- 16 April 2024
In this blog, we showcase the language model FLAN-T5 and how to fine-tune it on a summarization task with Hugging Face on an AMD GPU with ROCm.
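Before any fine-tuning, FLAN-T5 can already be prompted for zero-shot summarization with a T5-style task prefix. A minimal sketch, assuming the `google/flan-t5-base` checkpoint (the post may use a larger variant):

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "google/flan-t5-base"  # assumed checkpoint size
tokenizer = AutoTokenizer.from_pretrained(model_name)
# ROCm builds of PyTorch expose AMD GPUs through the "cuda" device name.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = AutoModelForSeq2SeqLM.from_pretrained(model_name).to(device)

article = (
    "The tower is 324 metres tall, about the same height as an 81-storey "
    "building, and was the tallest man-made structure in the world for 41 years."
)
# T5-family models key on the "summarize:" prefix for this task.
inputs = tokenizer("summarize: " + article, return_tensors="pt").to(device)
summary_ids = model.generate(**inputs, max_new_tokens=40, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```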
Program Synthesis with CodeGen
- 16 April 2024
CodeGen is a family of standard transformer-based autoregressive language models for program synthesis, which the authors define as a method for generating computer programs that solve specified problems, using input-output examples or natural language descriptions.
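In practice, a natural language description is given as a comment prompt and the model completes the program. A minimal sketch, assuming the smallest Python-only checkpoint `Salesforce/codegen-350M-mono`:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "Salesforce/codegen-350M-mono"  # assumed smallest Python variant
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Natural-language spec as a comment, plus the start of the function signature.
prompt = "# Python function that returns the n-th Fibonacci number\ndef fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```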
Small language models with Phi-2
- 08 April 2024
Like many other LLMs, Phi-2 is a transformer-based model with a next-word prediction objective, trained on billions of tokens. At 2.7 billion parameters, Phi-2 is a relatively small language model, yet it achieves outstanding performance on a variety of tasks, including common-sense reasoning, language understanding, math, and coding. For reference, GPT-3.5 has 175 billion parameters and the smallest version of LLaMA-2 has 7 billion. According to Microsoft, Phi-2 can match or outperform models up to 25 times larger thanks to more carefully curated training data and model scaling.
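Thanks to its small size, Phi-2 is easy to try locally. A minimal generation sketch using the `microsoft/phi-2` checkpoint and the instruction format from its model card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# trust_remote_code was required by older transformers releases; recent
# versions support Phi-2 natively and ignore the flag.
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2", torch_dtype=torch.float16, device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2", trust_remote_code=True)

# "Instruct: ... Output:" is the prompt format suggested on the model card.
prompt = "Instruct: Explain why the sky is blue.\nOutput:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```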
Using the ChatGLM-6B bilingual language model with AMD GPUs
- 04 April 2024
ChatGLM-6B is an open bilingual (Chinese-English) language model with 6.2 billion parameters. It's optimized for Chinese conversation based on the General Language Model (GLM) architecture. GLM is a pretraining framework that seeks to combine the strengths of autoencoder models (like BERT) and autoregressive models (like GPT). The GLM framework randomly blanks out continuous spans of tokens from the input text (autoencoding methodology) and trains the model to sequentially reconstruct the spans (autoregressive pretraining methodology).
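A minimal usage sketch, following the pattern on the `THUDM/chatglm-6b` model card (the modeling code ships on the Hub, so `trust_remote_code` is required):

```python
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
model = model.eval()

# The remote code exposes a chat() helper that tracks conversation history.
response, history = model.chat(tokenizer, "你好", history=[])
print(response)
response, history = model.chat(tokenizer, "What is the GLM pretraining objective?", history=history)
print(response)
```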
Building a decoder transformer model on AMD GPU(s)
- 12 March 2024
In this blog, we demonstrate how to run Andrej Karpathy’s beautiful PyTorch re-implementation of GPT on single and multiple AMD GPUs on a single node using PyTorch 2.0 and ROCm. We use the works of Shakespeare to train our model, then run inference to see if our model can generate Shakespeare-like text.
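The heart of such a decoder-only model is causal self-attention, where each position may attend only to earlier positions. A minimal single-head sketch in PyTorch (simplified from the multi-head version in Karpathy's code, not copied from it):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    """Single-head causal self-attention over a (batch, time, channels) tensor."""
    def __init__(self, n_embd, block_size):
        super().__init__()
        self.key = nn.Linear(n_embd, n_embd, bias=False)
        self.query = nn.Linear(n_embd, n_embd, bias=False)
        self.value = nn.Linear(n_embd, n_embd, bias=False)
        # Lower-triangular mask enforces the autoregressive property.
        self.register_buffer("mask", torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):
        B, T, C = x.shape
        k, q, v = self.key(x), self.query(x), self.value(x)
        att = (q @ k.transpose(-2, -1)) / (C ** 0.5)   # scaled dot-product scores
        att = att.masked_fill(self.mask[:T, :T] == 0, float("-inf"))
        att = F.softmax(att, dim=-1)
        return att @ v

x = torch.randn(1, 8, 32)                # (batch, time, channels)
out = CausalSelfAttention(32, 64)(x)
print(out.shape)                         # torch.Size([1, 8, 32])
```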
Question-answering Chatbot with LangChain on an AMD GPU
- 11 March 2024
LangChain is a framework designed to harness the power of language models for building cutting-edge applications. By connecting language models to various contextual sources and providing reasoning abilities based on the given context, LangChain creates context-aware applications that can intelligently reason and respond. In this blog, we demonstrate how to use LangChain and Hugging Face to create a simple question-answering chatbot. We also demonstrate how to augment the knowledge of our large language model (LLM) with additional information using Retrieval-Augmented Generation (RAG), allowing our bot to respond to queries based on the information contained in specified documents.
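As a rough sketch of the RAG pattern, wiring a vector store retriever and a Hugging Face LLM into a question-answering chain. Note that LangChain's module layout shifts between releases (imports below match the pre-0.1 layout; newer versions move them into `langchain_community`), and `my_document.txt` is a hypothetical file:

```python
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.chains import RetrievalQA
from langchain.llms import HuggingFacePipeline

# 1. Split the source document into chunks and embed them into a vector store.
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_text(open("my_document.txt").read())  # hypothetical file
db = FAISS.from_texts(chunks, HuggingFaceEmbeddings())

# 2. Wire a Hugging Face LLM and the retriever into a QA chain: the retriever
#    fetches relevant chunks, which are stuffed into the LLM's prompt.
llm = HuggingFacePipeline.from_model_id(
    model_id="google/flan-t5-large", task="text2text-generation"
)
qa = RetrievalQA.from_chain_type(llm=llm, retriever=db.as_retriever())
print(qa.run("What does the document say about X?"))
```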
Music Generation With MusicGen on an AMD GPU
- 08 March 2024
MusicGen is an autoregressive, transformer-based model that predicts the next segment of a piece of music based on previous segments. This approach is similar to how language models predict the next token.
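A minimal text-to-music sketch, assuming the public `facebook/musicgen-small` checkpoint:

```python
import scipy.io.wavfile
from transformers import AutoProcessor, MusicgenForConditionalGeneration

processor = AutoProcessor.from_pretrained("facebook/musicgen-small")
model = MusicgenForConditionalGeneration.from_pretrained("facebook/musicgen-small")

inputs = processor(text=["upbeat jazz with a saxophone lead"], padding=True, return_tensors="pt")
# Each generated token covers ~20 ms of audio, so 256 tokens ≈ 5 seconds.
audio = model.generate(**inputs, max_new_tokens=256)

rate = model.config.audio_encoder.sampling_rate  # 32 kHz for MusicGen
scipy.io.wavfile.write("sample.wav", rate=rate, data=audio[0, 0].numpy())
```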
Simplifying deep learning: A guide to PyTorch Lightning
- 08 February 2024
PyTorch Lightning is a higher-level wrapper built on top of PyTorch. Its purpose is to simplify and abstract the process of training PyTorch models. It provides a structured and organized approach to machine learning (ML) tasks by abstracting away the repetitive boilerplate code, allowing you to focus more on model development and experimentation. PyTorch Lightning works out of the box with AMD GPUs and ROCm.
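As a minimal sketch of the pattern, a `LightningModule` only needs a `training_step` and `configure_optimizers`; the `Trainer` supplies the loop, device placement, and logging (the toy regression data here is purely illustrative):

```python
import torch
import torch.nn.functional as F
import pytorch_lightning as pl
from torch.utils.data import DataLoader, TensorDataset

class LitRegressor(pl.LightningModule):
    """Toy linear regressor: Lightning handles the rest of the training loop."""
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(10, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = F.mse_loss(self.layer(x), y)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)

# Random toy data; accelerator="auto" picks up an AMD GPU under ROCm.
data = TensorDataset(torch.randn(256, 10), torch.randn(256, 1))
trainer = pl.Trainer(max_epochs=2, accelerator="auto")
trainer.fit(LitRegressor(), DataLoader(data, batch_size=32))
```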