Hacker News
Can you explain Transformers in one sentence? (attention mechanisms) [video] (youtube.com)
13 points by chii on Sept 11, 2023 | 3 comments



I always thought the query/key/value analogy was confusing and unnecessary. That tired analogy is why I don't think "Attention Is All You Need" is a particularly good paper. The BERT paper is much more readable.

If you actually look at what a self-attention head does, it's much easier to understand and really not that complicated.

Once you get self-attention, multi-head attention is just doing that N times in parallel over the same sequence.
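
To make that concrete, here's a minimal NumPy sketch of one self-attention head, plus multi-head attention built by running N heads in parallel and concatenating them. The shapes, weight names, and sizes are illustrative assumptions, not taken from either paper.

    import numpy as np

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def self_attention_head(X, W_q, W_k, W_v):
        """One self-attention head over a sequence X of shape (seq_len, d_model)."""
        Q = X @ W_q                                # (seq_len, d_head) "queries"
        K = X @ W_k                                # (seq_len, d_head) "keys"
        V = X @ W_v                                # (seq_len, d_head) "values"
        scores = Q @ K.T / np.sqrt(K.shape[-1])   # pairwise similarity between positions
        weights = softmax(scores, axis=-1)         # each row sums to 1
        return weights @ V                         # each output is a weighted mix of value vectors

    def multi_head_attention(X, heads, W_o):
        """Run N independent heads over the same sequence, concatenate, and project back."""
        outputs = [self_attention_head(X, W_q, W_k, W_v) for (W_q, W_k, W_v) in heads]
        return np.concatenate(outputs, axis=-1) @ W_o

    # Toy example (hypothetical sizes): 5 tokens, model width 16, 4 heads of width 4.
    rng = np.random.default_rng(0)
    seq_len, d_model, n_heads, d_head = 5, 16, 4, 4
    X = rng.normal(size=(seq_len, d_model))
    heads = [tuple(rng.normal(size=(d_model, d_head)) for _ in range(3)) for _ in range(n_heads)]
    W_o = rng.normal(size=(n_heads * d_head, d_model))
    print(multi_head_attention(X, heads, W_o).shape)   # (5, 16)

The whole mechanism is a few matrix multiplies and a softmax; "multi-head" just means the same operation repeated with different learned projections.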


Transformers are robots that turn into cars and vice versa.


What about Megatron and Soundwave… and in fact most of the Decepticons?

Transformers: Robots in disguise!



