
Transfer Learning during LLM training tends to be 'broader' than that.

Like how:

- Training LLMs on code makes them better at solving reasoning problems.

- Training on language Y alongside language X makes them much better at Y than if they were trained on Y alone.

And so on.

Probably because gradient descent is a dumb optimizer, and training is more like evolution than like a human reading a book.

Also, there is something genuinely weird going on with LLM chess, and it's possible that base models are better at it: https://dynomight.net/more-chess/
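
For what it's worth, the way these chess evaluations usually work is by giving the model a game transcript in PGN-style notation and asking it to complete the next move, which is roughly how the linked post probes completion models. A minimal sketch of that loop, assuming python-chess for legality checks; complete() here is a hypothetical stand-in for whatever model call you use:

    import chess

    def ask_for_move(moves_so_far, complete):
        """Ask a text-completion model for the next chess move.

        moves_so_far: moves in SAN order, e.g. ["e4", "e5", "Nf3"]
        complete: hypothetical stand-in for your model's completion call
        """
        board = chess.Board()
        parts = []
        for i, san in enumerate(moves_so_far):
            if i % 2 == 0:                      # white's move gets a move number
                parts.append(f"{i // 2 + 1}.")
            parts.append(san)
            board.push_san(san)                 # track the current position
        if len(moves_so_far) % 2 == 0:          # white to move: prompt ends "N."
            parts.append(f"{len(moves_so_far) // 2 + 1}.")
        prompt = " ".join(parts)                # e.g. "1. e4 e5 2. Nf3"

        tokens = complete(prompt).strip().split()
        if not tokens:
            return None
        candidate = tokens[0]
        try:
            board.push_san(candidate)           # only accept a legal move
            return candidate
        except ValueError:
            return None                         # illegal or unparseable completion

How you score the None case (retry, or count it as a loss) changes the numbers quite a bit, so it's worth stating that choice whenever models are compared this way.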




How abilities transfer seems to be fairly nuanced: https://arxiv.org/html/2310.16937v2

Very hard for me to wrap my head around the idea that an LLM being able to discuss, or even teach, high-level chess strategy wouldn't transfer at all to its playing performance.



