
> They do measure and report on this, both in summary in the blog post and in more detail in the paper.

I didn't see this in the blog post; where is it? Presumably they omitted it from the blog post because the results are bad, as I describe below, which is precisely why I cited it as a red flag.

> If you can perfectly accurately extract the state the result would be pretty boring to show right? It'd just be a picture of a board state and next to it the same board state with "these are the same".

But they don't! The model trained on human game data has nearly 10% tile-level error, which compounds to nearly 100% board-level error: with roughly 10% error on each of the 64 tiles, about 6 or 7 tiles are wrong on average, and the chance of decoding a fully correct board is well under 1%. It's hard to appreciate how bad this is until you visualize it, for example by sampling random boards and comparing them to their reconstructions. With nearly 100% probability you get an incorrect board.
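A back-of-the-envelope check of that arithmetic, assuming (as a simplification; real probe errors are surely correlated) that each of the 64 tiles is decoded wrong independently with probability 0.1:

  import numpy as np

  # Per-tile error rate and board size are from the discussion above;
  # the independence assumption is mine, for illustration only.
  p_tile_error = 0.10
  n_tiles = 64

  print(n_tiles * p_tile_error)             # expected wrong tiles: 6.4
  print(1 - (1 - p_tile_error) ** n_tiles)  # P(board has any error): ~0.9988

  # Monte Carlo check: sample boards and count fully correct decodes.
  rng = np.random.default_rng(0)
  wrong = rng.random((100_000, n_tiles)) < p_tile_error
  print(wrong.any(axis=1).mean())           # ~0.999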

> If you can extract them, they are encoded in the activations. That's pretty much by definition surely.

No, that's silly. You could cycle through every algorithm/transformation imaginable until you hit one that "extracts" the Satanic Verses from the Bible. As I said in another comment, train/validation splits mitigate this somewhat in theory, but in practice you keep trying different probe architectures and hyperparameters until the validation performance looks good, which quietly overfits the validation set.
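To make that concrete, here is a toy sketch (entirely hypothetical data, not the paper's setup) of how a search over probe hyperparameters can produce respectable validation accuracy on pure noise:

  import numpy as np

  rng = np.random.default_rng(0)

  # Pure-noise "activations" and random labels: there is no signal here.
  n_train, n_val, n_feat = 200, 50, 512
  X_train = rng.normal(size=(n_train, n_feat))
  X_val = rng.normal(size=(n_val, n_feat))
  y_train = rng.integers(0, 2, size=n_train)
  y_val = rng.integers(0, 2, size=n_val)

  def linear_probe(X, y, l2):
      # Ridge-regularized least squares as a crude linear probe.
      return np.linalg.solve(X.T @ X + l2 * np.eye(X.shape[1]),
                             X.T @ (2 * y - 1))

  best = 0.0
  for l2 in (0.01, 0.1, 1.0, 10.0, 100.0):
      for seed in range(50):
          # "Hyperparameter search": regularization strength plus a
          # random feature subset, 250 configurations in total.
          idx = np.random.default_rng(seed).choice(n_feat, 64, replace=False)
          w = linear_probe(X_train[:, idx], y_train, l2)
          acc = np.mean((X_val[:, idx] @ w > 0) == y_val)
          best = max(best, acc)

  # Chance level is 0.5, but the best-of-250 score on 50 validation
  # points drifts well above it despite zero real signal.
  print(f"best validation accuracy on noise: {best:.2f}")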

> How so? Given a sequence of moves, they can accurately identify which state most of the positions of the board are in just by looking at the network. In order for that to work, the network must be turning a sequence of moves into some representation of a current board state. Assuming for the moment they can accurately identify them, do you agree with that conclusion?

What conclusion? I believe you can probably train a neural network to take in board moves and output the board state, with some level of board error. So what?
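For what it's worth, here is the kind of directly supervised baseline I mean; everything in it (names, architecture) is hypothetical, and the point is only that moves-to-board-state is an ordinary supervised learning task:

  import torch
  import torch.nn as nn

  # Hypothetical baseline: a model trained directly on
  # (move sequence -> board state) supervision. If this kind of model
  # can reach some tile accuracy, probe accuracy alone says little.
  class MovesToBoard(nn.Module):
      def __init__(self, n_squares=64, d_model=128):
          super().__init__()
          self.embed = nn.Embedding(n_squares, d_model)  # one token per square
          self.rnn = nn.GRU(d_model, d_model, batch_first=True)
          # 3 classes per tile: empty / black / white.
          self.head = nn.Linear(d_model, n_squares * 3)

      def forward(self, moves):  # moves: (batch, seq) of square indices
          h, _ = self.rnn(self.embed(moves))
          return self.head(h[:, -1]).view(-1, 64, 3)  # per-tile logits

  model = MovesToBoard()
  logits = model(torch.randint(0, 64, (8, 20)))  # 8 games, 20 moves each
  print(logits.shape)                            # torch.Size([8, 64, 3])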


