Surprisingly, storing a game (with all moves) can take less space than encoding a single board. This is because you can effectively encode a move in a single byte (as there are less than 255 moves possible in each position). Applying compression to the resulting binary string will allow you to reduce the space even more.
And shameless plug: Using this encoding, I'm storing millions of games on https://www.chessmonitor.com/ There you can link your Chess.com or Lichess account and view all kind of statistics of your games.
That is a beautifully designed website. I don't think I've ever used an OAuth authentication flow as smooth as the one that your site uses to access my Lichess credentials (same as my HN name, btw, in case you want to beat me at a correspondence game).
Heh, I guess there are board states that are not possible to reach through a valid sequence of moves, but I guess otherwise it's not possible that games are more compressable by definition, since any valid board state could be represented as a sequence of moves.
This does raise the question of the efficiency of reverse engineering a series of minimal moves for some board state.
Are you saying that, to encode a valid position, ignoring the cost to encode both the starting position and the move definitions, you can encode the move sequence using less information than encoding a single position?
I'd expect that ordering moves by popularity at each half move index, using say the above dataset, would allow you to select lower indexed values at each step, allowing a nice context based arithmetic compression to really shrink them well.
Yes, in contrast to storing the board (which can be a fixed size), this depends on the length of the game of course.
That said, there are some optimizations you can apply that will compress moves even further (for example the order of the moves as explained in the Lichess blog post is important). In the end it's a tradeoff between (de)serializing the game fast and reducing the size (even more).
It seems, then, if you wanted to efficiently store lots of positions, it would be smart to store a large list of openings, a large list of following sequences, and then your positions can be stored as a list of sequence ids.
Check out this great blog post of Lichess for more information: https://lichess.org/blog/Wqa7GiAAAOIpBLoY/developer-update-2...
And shameless plug: Using this encoding, I'm storing millions of games on https://www.chessmonitor.com/ There you can link your Chess.com or Lichess account and view all kind of statistics of your games.