Posts tagged Speech to Text

Speech-to-Text on an AMD GPU with Whisper

16 April 2024

Whisper is an automatic speech recognition (ASR) system developed by OpenAI. It uses an encoder-decoder Transformer architecture: incoming audio is split into 30-second segments, which are then fed into the encoder. Special tokens prompt the decoder to perform tasks such as language identification, transcription, and translation.
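The two ideas in this summary, fixed 30-second windows and task selection through special tokens, can be illustrated with a minimal sketch. It is not the Whisper library's API: `split_into_segments` and `build_decoder_prompt` are hypothetical helper names, the 16 kHz sample rate and the zero-padding of the final window are assumptions based on how the released models are typically used, and the token strings follow the format of Whisper's published tokenizer. A real pipeline would also convert each window to a log-Mel spectrogram before the encoder sees it.

```python
SAMPLE_RATE = 16_000   # assumption: Whisper models are fed 16 kHz mono audio
SEGMENT_SECONDS = 30   # window length described in the text


def split_into_segments(samples, sample_rate=SAMPLE_RATE, seconds=SEGMENT_SECONDS):
    """Split a sequence of samples into fixed-length windows.

    The last window is zero-padded so every segment has the same length.
    """
    size = sample_rate * seconds
    segments = []
    for start in range(0, len(samples), size):
        chunk = list(samples[start:start + size])
        chunk.extend([0.0] * (size - len(chunk)))  # pad the final window
        segments.append(chunk)
    return segments


def build_decoder_prompt(language="en", task="transcribe", timestamps=False):
    """Assemble the special tokens that tell the decoder what to do (hypothetical helper)."""
    if task not in ("transcribe", "translate"):
        raise ValueError(f"unsupported task: {task}")
    prompt = ["<|startoftranscript|>", f"<|{language}|>", f"<|{task}|>"]
    if not timestamps:
        prompt.append("<|notimestamps|>")
    return prompt


if __name__ == "__main__":
    audio = [0.0] * (SAMPLE_RATE * 70)           # 70 seconds of silence
    print(len(split_into_segments(audio)))       # number of 30-second windows
    print(build_decoder_prompt("fr", "translate"))
```

Keeping the prompt as a list of token strings makes it easy to see that switching from transcription to translation changes only one token; the audio windows passed to the encoder stay the same.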
