Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Leading up to ChatGPT, virtually no model innovation was needed. Just some minor tweaks to activation functions and the order in which operations were completed in the feedfoward layers. So they were essentially trying to trademark the innovation published openly in Attention is All You Need?


Consider applying for YC's Fall 2025 batch! Applications are open till Aug 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: