
Transfer Learning during LLM training tends to be 'broader' than that.

Like how:

- Training LLMs on code makes them better at solving reasoning problems.

- Training on language Y alongside language X makes them much better at Y than if they were trained on Y alone.

And so on.

Probably because gradient descent is a dumb optimizer, and training is more like evolution than like a human reading a book.

Also, there is something genuinely weird going on with LLM chess, and it's possible that base models are better at it: https://dynomight.net/more-chess/
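
For what it's worth, the way these chess evaluations usually work is by giving the model a game transcript in PGN-style notation and asking it to complete the next move, which is roughly how the linked post probes completion models. A minimal sketch of that loop, assuming python-chess for legality checks; complete() here is a hypothetical stand-in for whatever model call you use:

    import chess

    def ask_for_move(moves_so_far, complete):
        """Ask a text-completion model for the next chess move.

        moves_so_far: moves in SAN order, e.g. ["e4", "e5", "Nf3"]
        complete: hypothetical stand-in for your model's completion call
        """
        board = chess.Board()
        parts = []
        for i, san in enumerate(moves_so_far):
            if i % 2 == 0:                      # white's move gets a move number
                parts.append(f"{i // 2 + 1}.")
            parts.append(san)
            board.push_san(san)                 # track the current position
        if len(moves_so_far) % 2 == 0:          # white to move: prompt ends "N."
            parts.append(f"{len(moves_so_far) // 2 + 1}.")
        prompt = " ".join(parts)                # e.g. "1. e4 e5 2. Nf3"

        tokens = complete(prompt).strip().split()
        if not tokens:
            return None
        candidate = tokens[0]
        try:
            board.push_san(candidate)           # only accept a legal move
            return candidate
        except ValueError:
            return None                         # illegal or unparseable completion

How you score the None case (retry, or count it as a loss) changes the numbers quite a bit, so it's worth stating that choice whenever models are compared this way.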




How abilities transfer seems to be fairly nuanced: https://arxiv.org/html/2310.16937v2

Very hard for me to wrap my head around the idea that an LLM being able to discuss, or even teach, high-level chess strategy wouldn't transfer at all to its playing performance.



