It's interesting that the standard "K" (number of elements with a shared scale) is 32. That seems to imply that the neural network will somehow learn to group weights at those 32-element boundaries.
Does anybody understand how that works? I mean, what is the mechanism that naturally causes the model to group weight scales into those K-element clusters?
There is no mechanism per se, it's more of a bit-budget vs. quality trade-off. If the block size were one, you could think of MX4 with an 8-bit exponent scale as a 12-bit number, an "MX12" with E10M1. Sharing the scale across a block introduces some error per element, and that error grows as the block gets larger. As the block size increases, the effective bits per element go down and the hardware implementation gets smaller and cheaper.
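To make the trade-off concrete, here is a minimal sketch (my own toy version, not the OCP MX spec's exact rounding rules; quantize_mx4 and FP4_VALUES are just names I picked) that quantizes a weight vector to FP4 (E2M1) elements with one shared power-of-two scale per block, then prints the effective bits per element and the reconstruction error as the block size grows:

```python
import numpy as np

# Positive magnitudes representable by an FP4 (E2M1) element.
FP4_VALUES = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_mx4(x, block_size):
    """Quantize x to FP4 elements with one shared power-of-two scale per block.
    Simplified illustration; real MX rounding/scale rules differ in details."""
    x = np.asarray(x, dtype=np.float64)
    out = np.empty_like(x)
    for start in range(0, len(x), block_size):
        block = x[start:start + block_size]
        amax = np.max(np.abs(block))
        # Shared scale: a power of two chosen so the block's max magnitude
        # lands near FP4's top representable value (6.0).
        scale = 2.0 ** (np.floor(np.log2(amax)) - 2) if amax > 0 else 1.0
        scaled = block / scale
        # Round each element's magnitude to the nearest FP4 value, keep the sign.
        idx = np.abs(np.abs(scaled)[:, None] - FP4_VALUES[None, :]).argmin(axis=1)
        out[start:start + block_size] = np.sign(scaled) * FP4_VALUES[idx] * scale
    return out

rng = np.random.default_rng(0)
w = rng.normal(size=4096)
for k in (1, 8, 32, 128):
    rmse = np.sqrt(np.mean((w - quantize_mx4(w, k)) ** 2))
    bits = 4 + 8 / k  # 4 element bits plus the amortized 8-bit shared scale
    print(f"block={k:4d}  bits/elem={bits:6.2f}  rmse={rmse:.4f}")
```

With block size 1 each element effectively costs 4 + 8 = 12 bits, which is the "MX12" framing above; at the standard block size of 32 the shared scale amortizes to 8/32 = 0.25 extra bits, i.e. about 4.25 bits per element, and the per-element error goes up accordingly.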