I would wager that doesn't make up for how much slower the hardware is, though.
The 4MHz value is the clock speed, but every instruction takes a multiple of 4 clock cycles, so people often talk about it as a 1MHz "instruction cycle" machine.
The fastest memcpy implementation I've seen for it takes 4.5 cycles per byte (one 16-bit word every 9 cycles). That's not counting the in-hardware DMA feature, which on the original Game Boy can only copy to OAM (sprite) RAM.
That means that if you tried to update every pixel on the screen, which would be 160 * 144 * 2 bits/pixel = 5760 bytes, it would take 25920 instruction cycles ~= 24.72ms, which comes out to approx 40.45 FPS. (That doesn't actually work: the GB's screen pixels aren't memory mapped directly, but are generally specified via a combination of up to 384 unique 8x8 tiles, which are then arranged on a 20x18 tile grid.)
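To make that arithmetic easy to check, here's a quick Python sketch. The exact clock value of 4,194,304 Hz is my assumption (I rounded it to "4MHz" above), and `full_screen_copy` is just a name made up for this example.

```python
CLOCK_HZ = 4_194_304        # assumption: exact clock; rounded to "4MHz" in the text
CLOCKS_PER_CYCLE = 4        # every instruction takes a multiple of 4 clocks
CYCLES_PER_BYTE = 4.5       # fastest memcpy quoted: one 16-bit word per 9 cycles


def full_screen_copy(width=160, height=144, bits_per_pixel=2):
    """Return (bytes, cycles, milliseconds, fps) for copying one full frame."""
    total_bytes = width * height * bits_per_pixel // 8
    cycles = total_bytes * CYCLES_PER_BYTE
    cycles_per_second = CLOCK_HZ / CLOCKS_PER_CYCLE
    seconds = cycles / cycles_per_second
    return total_bytes, cycles, seconds * 1000, 1 / seconds


total_bytes, cycles, ms, fps = full_screen_copy()
print(f"{total_bytes} bytes, {cycles:.0f} cycles, {ms:.2f} ms, {fps:.2f} FPS")
# prints: 5760 bytes, 25920 cycles, 24.72 ms, 40.45 FPS
```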
Of course, as I mentioned, the actual GB doesn't work like that; it has a Pixel Processing Unit (PPU) that handles writing to the screen every frame. While the PPU is writing to the screen, the CPU cannot access VRAM, so the CPU really only gets what's called the v-blank period. That is only 1140 instruction cycles, which is enough time to copy roughly 253 bytes. (I'm skipping over some detail here, like how you could squeeze a few extra bytes out by loading the first 4 or so bytes into registers before v-blank begins, or how you can also, with very careful timing, write to VRAM during a ~60 cycle window on each line known as the h-blank and OAM search periods.)
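As a rough sketch of that v-blank budget in Python: this counts whole 16-bit words only and ignores loop setup and the register and h-blank tricks, so it lands slightly under the ballpark figure. The function names are made up for illustration, and the 5760-byte "full screen" is the same hypothetical as before.

```python
import math

VBLANK_CYCLES = 1140                     # v-blank length quoted in the text
CYCLES_PER_WORD = 9                      # memcpy speed quoted in the text
FULL_SCREEN_BYTES = 160 * 144 * 2 // 8   # hypothetical full-screen payload


def vblank_budget_bytes(vblank_cycles=VBLANK_CYCLES,
                        cycles_per_word=CYCLES_PER_WORD):
    """Bytes copyable in one v-blank, counting whole 16-bit words only."""
    return (vblank_cycles // cycles_per_word) * 2


def frames_needed(total_bytes, budget_bytes):
    """How many v-blanks it takes to move total_bytes at budget_bytes each."""
    return math.ceil(total_bytes / budget_bytes)


budget = vblank_budget_bytes()
print(budget, frames_needed(FULL_SCREEN_BYTES, budget))
# prints: 252 23
```

So even in this idealized model, pushing a full screen's worth of pixel data through v-blank alone would take over 20 frames.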
In practice, games work despite these limitations because they generally only have to modify a small amount of the screen in any one frame. Scrolling is implemented in hardware, and you can write new data off screen then scroll into it so you can take multiple frames to write it if needed.
This plus other tricks (for example, animating 50 copies of a common tile by editing the tile's pixel data, which affects every copy at once) let these games accomplish so much with so little.
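Here's a toy Python model of the shared-tile trick, just to show why it's cheap. It is not how real VRAM is laid out; `tile_data`, `tile_map`, and the helper are hypothetical names, and the only hardware facts assumed are the 20x18 grid and that an 8x8 tile at 2 bits/pixel is 16 bytes.

```python
TILE_BYTES = 16                 # 8x8 pixels * 2 bits/pixel = 16 bytes
GRID_W, GRID_H = 20, 18         # visible tile grid

frame_a = bytes([0x00] * TILE_BYTES)   # made-up pixel data, animation frame A
frame_b = bytes([0xFF] * TILE_BYTES)   # made-up pixel data, animation frame B

# Tile pixel data lives in one place, keyed by tile index.
tile_data = {0: bytes([0x55] * TILE_BYTES), 1: frame_a}

# The map stores only indices. Place 50 copies of tile 1 on it.
tile_map = [[0] * GRID_W for _ in range(GRID_H)]
for i in range(50):
    row, col = divmod(i, GRID_W)
    tile_map[row][col] = 1


def cells_showing(tile_map, tile_data, pixels):
    """Count map cells whose tile currently has the given pixel data."""
    return sum(tile_data[idx] == pixels for row in tile_map for idx in row)


# Animate: one 16-byte write changes every cell that references tile 1.
tile_data[1] = frame_b
bytes_written = TILE_BYTES
bytes_if_unshared = 50 * TILE_BYTES

print(cells_showing(tile_map, tile_data, frame_b), bytes_written, bytes_if_unshared)
# prints: 50 16 800
```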
It's impossible to compare to a modern system with a polygon-based GPU; they're entirely different things.