
AlphaGo and AlphaStar both started out trained on human play and then played against versions of themselves, going on to create new strategies in their games. As far as I know, modern LLMs can't learn or experiment in exactly the same way, but that may not always be true.


Yeah, but they had a limited set of rules to work within (they were just hyper-efficient at calculating the possible outcomes relative to those rules). Humans, in theory, only have the rules they believe in, since technically there are no rules (it's all make-believe). For example, what was the "rule" that told people to make a wheel? There wasn't one. A human had to think about it and conceive it, which AI can't do (and I'd argue never will be able to) without rules.


Reinforcement learning is a completely different strategy from how most LLMs are trained: the model improves from a reward signal rather than from labeled next-token targets.
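To make the distinction concrete, here's a minimal, purely illustrative sketch (not AlphaGo's actual algorithm): a softmax policy over a two-armed bandit trained with a REINFORCE-style update. The payoff probabilities and learning rate are made up for the example. Note there is no "correct answer" in the training data anywhere; the policy shifts toward the better arm only because that arm gets rewarded more often, which is the essential difference from supervised next-token prediction.

```python
import math
import random

random.seed(0)

# Hypothetical bandit: arm 1 pays reward 1.0 with prob 0.8, arm 0 with prob 0.2.
PAYOFF = [0.2, 0.8]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train(steps=5000, lr=0.1):
    logits = [0.0, 0.0]  # policy parameters, start indifferent
    for _ in range(steps):
        probs = softmax(logits)
        # Sample an action from the current policy (explore by chance).
        action = 0 if random.random() < probs[0] else 1
        reward = 1.0 if random.random() < PAYOFF[action] else 0.0
        # REINFORCE update: for a softmax policy,
        # d/d logits[i] of log pi(action) = 1[i == action] - probs[i].
        for i in range(2):
            grad = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * reward * grad
    return softmax(logits)

final_probs = train()
```

After training, `final_probs` heavily favors arm 1, even though nothing ever told the policy which arm was "right". Self-play systems like AlphaGo layer far more machinery on top (search, value networks, curricula), but the learn-from-reward loop is the same in spirit.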




