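I'm not "in the field", though I like to read about and use LLMs. This video, "How DeepSeek Rewrote the Transformer [MLA]"[0], does a really good job of explaining MHA (multi-head attention), MQA (multi-query attention), GQA (grouped-query attention), and MLA (multi-head latent attention) with clear visuals/animations, and of showing how DeepSeek's MLA is 57x more efficient in terms of KV cache size.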

[0] https://www.youtube.com/watch?v=0VLAoVGf_74&t=960s
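
For anyone wondering where a number like 57x could come from, here is a rough sketch of the per-token KV cache arithmetic for the four schemes. The dimensions are my assumptions, taken from the published DeepSeek-V3 config rather than from anything stated above, and the GQA group count is made up purely for illustration. Treat it as a back-of-the-envelope count, not a description of the actual implementation.

```python
# Back-of-the-envelope KV cache size per token, per layer, in cached values.
# Dimensions are assumptions based on the published DeepSeek-V3 config,
# except N_KV_GROUPS, which is a hypothetical value for illustration only.

N_HEADS = 128      # attention heads (assumed)
HEAD_DIM = 128     # per-head key/value dimension (assumed)
N_KV_GROUPS = 8    # hypothetical GQA group count, not from any real model
LATENT_DIM = 512   # MLA compressed KV latent size (assumed)
ROPE_DIM = 64      # MLA decoupled RoPE key size, shared across heads (assumed)


def kv_values_per_token() -> dict[str, int]:
    """Return the number of values cached per token per layer for each scheme."""
    return {
        "MHA": 2 * N_HEADS * HEAD_DIM,      # one K and one V vector per head
        "GQA": 2 * N_KV_GROUPS * HEAD_DIM,  # one K and one V vector per group
        "MQA": 2 * HEAD_DIM,                # a single K and V shared by all heads
        "MLA": LATENT_DIM + ROPE_DIM,       # one latent vector plus one RoPE key
    }


sizes = kv_values_per_token()
for name, size in sizes.items():
    print(f"{name}: {size} values, {sizes['MHA'] / size:.1f}x smaller than MHA")
```

Under these assumptions MHA caches 32768 values per token per layer and MLA caches 576, a ratio of about 56.9. MQA caches even less in this count, but the count says nothing about model quality, which is the other half of the trade-off.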





