Posts by David Limpus

Productionizing TurboQuant on AMD GPUs for KV-Cache-Bound LLM Inference

*The first three authors (Chakrabarti, Limpus, Rana) contributed equally to this work.