Fair point, thanks for the clarification. It seems this was first proposed in https://arxiv.org/pdf/2405.04434? I was confused because your title mentions DeepSeek, but then the first paragraph reverts to "...language models like ChatGPT and DeepSeek faster at generating text".
Right, that's a good point. I'll adjust the intro a bit.
We wanted to provide a more holistic overview of what MLA is, what came before it, and why it matters :) Hope it was useful!
As far as I know, they are the only ones using it so far.