For one, the positioning of the display and camera. The display is elevated so you don't see it directly in front of you. You have to actively look at it. The camera isn't centered in your face so it doesn't line up with your vision.
I agree the display only covers part of your field of view. That's suboptimal, but I'm not sure it's a showstopper.
The camera not being centered on your face is not insurmountable. (In fact, you'd want it to be centered on your right eye, not your face.) The project I linked to above aligns the camera with your vision. It doesn't currently handle objects at different distances, but I'm working on that. It relies on your image recognition being able to determine the depth of the recognized object, e.g. by knowing its physical size.
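To make the depth part concrete, here's a rough sketch of the idea under a simple pinhole-camera assumption. It is not the code from the linked project. The function names, the focal length in pixels, and the camera-to-eye offset are all made-up values for illustration; real numbers would come from calibrating the actual device.

```python
def estimate_depth_m(focal_length_px: float, real_width_m: float, width_px: float) -> float:
    """Pinhole estimate of distance to an object of known physical width.

    depth = focal_length * real_width / apparent_width_in_pixels
    """
    if width_px <= 0:
        raise ValueError("width_px must be positive")
    return focal_length_px * real_width_m / width_px


def parallax_shift_px(focal_length_px: float, offset_m: float, depth_m: float) -> float:
    """Magnitude of the pixel shift between the camera's view and the eye's view
    for a point at depth_m, given the sideways camera-to-eye offset.

    The shift shrinks as depth grows, which is why a single fixed
    alignment only works at one distance.
    """
    if depth_m <= 0:
        raise ValueError("depth_m must be positive")
    return focal_length_px * offset_m / depth_m


# Hypothetical numbers: a recognized object known to be 0.2 m wide
# appears 100 px wide in a camera with a 1000 px focal length.
depth = estimate_depth_m(focal_length_px=1000.0, real_width_m=0.2, width_px=100.0)

# Assumed 3 cm sideways offset between camera and right eye.
shift = parallax_shift_px(focal_length_px=1000.0, offset_m=0.03, depth_m=depth)
```

The recognizer would have to supply both the object's identity (to look up its real size) and its bounding box width. If either is off, the depth estimate and the resulting correction are off by the same proportion.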