Do we know whether the 3B model shown in the Twitter thread is saturated, meaning we would need to train a bigger one, or whether it is still converging? 3B parameters seems light for this, but I don’t have a good intuition!
(Nit: “Zelda Ocarina of Time” is definitely showing Zelda: A Link to the Past sprites, which would make more sense, as that is a top-down 2D SNES game and Ocarina was a 3D N64 game.)