The optical pathway is pretty much locked in. LEEP was invented in the '80s, and that's still the optical system used today. Compare the size of today's headsets to NASA's VR system from the early '90s: https://images.nasa.gov/details/ARC-1992-AC89-0437-6
It's been 30 years of massive improvements to every other part of computing, and VR headsets have only shrunk a couple of inches. There's not much else we can do to make them smaller.
There’s an insane amount of tech in the Vision Pro. EyeSight probably occupies a big chunk. Then there are more sensors than they need. Also, the CPU is on the headset, so 100% of the processing is happening literally strapped to your face.
This is like having two 5K displays powered by a mobile device*.
* 2 x 5K (5120 x 2880 each) = about 29.5 million pixels, compared to Vision Pro’s 23 million pixels.
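For anyone who wants to check the pixel math, here is a quick sketch. It assumes "5K" means 5120 x 2880 (the resolution usually sold as 5K) and takes the 23 million figure for the Vision Pro as quoted above rather than from per-eye panel specs; the names are just for illustration.

```python
# Assumption: a "5K" display is 5120 x 2880 pixels.
FIVE_K_WIDTH = 5120
FIVE_K_HEIGHT = 2880

# Figure quoted in the post for both Vision Pro displays combined.
VISION_PRO_PIXELS = 23_000_000


def total_pixels(width: int, height: int, count: int = 1) -> int:
    """Total pixel count for `count` identical displays."""
    return width * height * count


two_5k = total_pixels(FIVE_K_WIDTH, FIVE_K_HEIGHT, count=2)
ratio = VISION_PRO_PIXELS / two_5k

print(f"Two 5K displays: {two_5k:,} pixels")
print(f"Vision Pro vs. two 5K displays: {ratio:.0%}")
```

So under that assumption the Vision Pro is pushing a bit over three quarters of the pixels of a dual-5K setup, which is still a lot for a mobile-class chip.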