I like the bitboard inversion idea. (It also requires the flipping logic from https://news.ycombinator.com/item?id=37526484, since 32 is a special case.) Note that the en-passantable pawns and castle-able rooks came from options that were unused in my first pass (linked comment). I could use 0x1 as "castle-able rook or en-passantable pawn, determined by where it is" and then store 32 integers with 13 options each in 119 bits[1], saving 9 bits. But I'm kinda attached to the simplicity here.
(The 16->32 was edited as you wrote this comment.)
[1] log(13^32)/log(2) = 118.4
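To make the footnote arithmetic concrete, here is a minimal Python sketch of packing 32 values with 13 options each into one integer as a base-13 number. The names (`OPTIONS`, `SLOTS`, `bits_needed`, `pack`, `unpack`) are made up for illustration, and the 128-bit baseline assumes 4 bits per slot, which is what the 9-bit saving implies. It is not meant as the actual encoding, just a check on the numbers.

```python
import math

OPTIONS = 13  # states per slot, per the comment (hypothetical name)
SLOTS = 32    # number of stored integers (hypothetical name)


def bits_needed(options: int, slots: int) -> int:
    """Exact bit count for the largest packed value, options**slots - 1."""
    return (options ** slots - 1).bit_length()


def pack(values):
    """Pack the values into one integer, treating them as base-13 digits."""
    n = 0
    for v in values:
        if not 0 <= v < OPTIONS:
            raise ValueError(f"value {v} out of range")
        n = n * OPTIONS + v
    return n


def unpack(n, slots=SLOTS):
    """Inverse of pack: peel off base-13 digits, least significant first."""
    out = []
    for _ in range(slots):
        n, v = divmod(n, OPTIONS)
        out.append(v)
    return out[::-1]


if __name__ == "__main__":
    print(round(SLOTS * math.log2(OPTIONS), 1))      # footnote value
    print(bits_needed(OPTIONS, SLOTS))               # bits after rounding up
    print(SLOTS * 4 - bits_needed(OPTIONS, SLOTS))   # saving vs 4 bits per slot
```

The catch is the one already mentioned: reading a single slot now takes repeated division instead of a shift and mask, which is the simplicity being traded for those 9 bits.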