After all these iterations of Alpha-[blank], [blank]-Zero, and now MuZero, I'm wondering:

If I'm interested in building a toy version following the DeepMind spec, one that can be trained to reach superhuman strength at a particular board game (Reversi, chess, checkers, possibly even Go given enough compute), which of these "versions" of the project would be the easiest for me to understand and implement? (Assume I have a basic understanding of the high-level concepts and lots of enthusiasm, but I'm not an expert.)

My understanding is that AlphaZero is not just stronger than AlphaGo but also architecturally simpler and more efficient. That's what I'm looking for: the implementation with the highest result-to-difficulty ratio.



AlphaGo Master, unsurprisingly, was significantly stronger than AlphaGo Zero, and AlphaZero, although it can play multiple games, was weaker still. In both cases, the comparison was between the 40 block version of the newer program and the 20 block version of its predecessor (they had to double the network size to approach the predecessor's level).
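For anyone who hasn't read the papers: a "block" is one residual block of the convolutional tower, and network depth is counted in these. A rough PyTorch sketch of the structure described in the AlphaGo Zero paper (the 256-filter width is from the paper; everything else here is illustrative):

    import torch.nn as nn

    class ResidualBlock(nn.Module):
        # One "block": conv-BN-ReLU-conv-BN, plus a skip connection.
        def __init__(self, channels=256):  # 256 filters, per the paper
            super().__init__()
            self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
            self.bn1 = nn.BatchNorm2d(channels)
            self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
            self.bn2 = nn.BatchNorm2d(channels)
            self.relu = nn.ReLU()

        def forward(self, x):
            out = self.relu(self.bn1(self.conv1(x)))
            out = self.bn2(self.conv2(out))
            return self.relu(out + x)  # the residual (skip) connection

    # A "40 block" network just stacks 40 of these between the input
    # convolution and the policy/value heads.
    tower = nn.Sequential(*[ResidualBlock() for _ in range(40)])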

Recently, KataGo has reached similar levels of strength using a small fraction of the resources: https://arxiv.org/abs/1902.10565

It depends on what you mean by "more efficient." The significance of AlphaZero was that you can reach good results in a variety of domains even without human expert knowledge to provide supervised training data or hand-engineered features. It's efficient in terms of engineering effort.

A precisely tailored approach can always get better results.


Has it been improved? AlphaZero previously overtook AlphaGo Master: https://en.wikipedia.org/wiki/AlphaGo_Zero#Comparison_with_p...


The 40 block version of AlphaGo Zero is stronger than the 20 block version of AlphaGo Master.


This is a bit outside my comfort zone, so I'm not sure I quite get what these blocks are. Has any version of AlphaGo Master bested AlphaGo Zero?


> which of these "versions" of the project would be the easiest for me to understand/implement?

I have the same question. Not sure I have an answer yet, but this paper includes some pseudocode that implements the algorithm: https://arxiv.org/src/1911.08265v1/anc/pseudocode.py

I'm planning to try training something simple like tic-tac-toe, both to see if it works and to understand how it works.
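To make that concrete, here's the skeleton I have in mind: plain UCT-style MCTS with random rollouts on tic-tac-toe. In AlphaZero/MuZero the random rollout is replaced by the value network and move selection is biased by the policy priors, but the search structure is the same. All names and numbers below are my own placeholders, not DeepMind's:

    import math, random

    # Board: tuple of 9 ints, +1 / -1 for the players, 0 for empty.
    WIN_LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

    def legal_moves(board):
        return [i for i, v in enumerate(board) if v == 0]

    def play(board, move, player):
        b = list(board); b[move] = player
        return tuple(b)

    def winner(board):
        for a, b, c in WIN_LINES:
            if board[a] != 0 and board[a] == board[b] == board[c]:
                return board[a]
        return 0  # no winner: game ongoing, or a draw

    class Node:
        def __init__(self, board, player):
            self.board, self.player = board, player  # player = side to move
            self.children = {}                       # move -> Node
            self.visits, self.total = 0, 0.0         # total: results for `player`

    def rollout(board, player):
        # Random playout to the end; this is what a trained value net replaces.
        while winner(board) == 0 and legal_moves(board):
            board = play(board, random.choice(legal_moves(board)), player)
            player = -player
        return winner(board)  # +1 / -1 / 0, from player +1's point of view

    def ucb(parent, child, c=1.4):
        if child.visits == 0:
            return float("inf")
        # child.total is from the child's side-to-move view; negate for parent.
        return (-child.total / child.visits
                + c * math.sqrt(math.log(parent.visits) / child.visits))

    def search(root, simulations=500):
        for _ in range(simulations):
            node, path = root, [root]
            while winner(node.board) == 0 and legal_moves(node.board):
                untried = [m for m in legal_moves(node.board)
                           if m not in node.children]
                if untried:  # expand one untried move, then stop descending
                    m = random.choice(untried)
                    node.children[m] = Node(play(node.board, m, node.player),
                                            -node.player)
                    node = node.children[m]
                    path.append(node)
                    break
                node = max(node.children.values(), key=lambda ch: ucb(node, ch))
                path.append(node)
            result = rollout(node.board, node.player)
            for n in path:  # back up, flipping to each node's point of view
                n.visits += 1
                n.total += result * n.player

    # Self-play one game; a real trainer would record (state, visit counts,
    # outcome) from many such games as training data for the network.
    board, player = (0,) * 9, 1
    while winner(board) == 0 and legal_moves(board):
        root = Node(board, player)
        search(root)
        move = max(root.children, key=lambda m: root.children[m].visits)
        board, player = play(board, move, player), -player
    print(board, "winner:", winner(board))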


Pick a simple game so your search space is smaller and you won't need 10,000 GPUs to get anything done. (Tic-tac-toe has only a few thousand reachable positions; Go has on the order of 10^170 legal ones.)



