Putting it together is not as simple as it seems. I think it was an immense engineering and design effort from Apple to get it to the point where it feels effortless and obvious.
Not only did they have to build two cameras per eye and all the hardware for wide-angle, out-of-view hand tracking, they also had to consider:
Privacy: the user’s gaze is never delivered to your process when your native UI reacts to their gaze. Building this infrastructure to be performant, bug-free and secure is a lot of work, not to mention making it completely transparent for developers to use (the developer-facing side is sketched after this list).
Design: they reconsidered every single iOS control in the context of gaze and pinch, and invented whole new UI paradigms that work really well with the existing SDK. You can insert 3D models into a SwiftUI scroll view and scroll them, and it just works (they even fade at the cutoff point; see the sketch below).
Accessibility: a great deal of thought went into alternative navigation methods for users who cannot maintain a consistent gaze.
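Both the privacy and design points show up in ordinary SwiftUI code. Here is a minimal sketch, assuming the visionOS SDK; the view and asset names (CatalogRow, the USDZ model names) are illustrative, not from any real project. The system draws the gaze highlight for the hover-enabled control without ever telling the app where the user is looking, and the Model3D views scroll like any other SwiftUI content.

```swift
import SwiftUI
import RealityKit

// Illustrative view and asset names; .hoverEffect, Model3D and ScrollView
// are standard visionOS SDK APIs.
struct CatalogRow: View {
    // USDZ models assumed to be bundled with the app.
    let modelNames = ["ToyCar", "ToyDrone", "ToyRobot"]

    var body: some View {
        VStack(spacing: 24) {
            // The system draws the gaze highlight for this control out of process;
            // the app's code only runs when the user pinches (delivered as a tap).
            Text("Open Catalog")
                .padding()
                .background(.regularMaterial, in: Capsule())
                .hoverEffect()
                .onTapGesture {
                    print("Pinch received; the first moment the app learns anything happened")
                }

            // 3D models inside a plain SwiftUI scroll view; clipping and the
            // fade at the cutoff point are handled by the system.
            ScrollView(.horizontal) {
                HStack(spacing: 40) {
                    ForEach(modelNames, id: \.self) { name in
                        Model3D(named: name) { model in
                            model
                                .resizable()
                                .scaledToFit()
                        } placeholder: {
                            ProgressView()
                        }
                        .frame(width: 200, height: 200)
                    }
                }
            }
        }
        .padding()
    }
}
```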
In addition to this, they clearly thought about how to maintain “gazeable” targets in the UI. When you drag a window closer or farther, it scales up and down to maintain exactly the same visual size, ensuring nothing gets too small or too large to gaze at effectively.
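The geometry behind that behaviour is simple to sketch (this is my reading of it, not Apple’s published implementation): a flat window of width w at distance d subtends an angle of roughly 2·atan(w / 2d), so keeping that angle constant means scaling the window linearly with its distance from the viewer.

```swift
// Hedged sketch of the constant-visual-size scaling described above,
// not Apple's actual implementation. Keeping the subtended angle
// atan(width / (2 * distance)) constant implies width scales linearly
// with distance.
func widthPreservingVisualSize(originalWidth: Double,
                               originalDistance: Double,
                               newDistance: Double) -> Double {
    originalWidth * (newDistance / originalDistance)
}

// Example: a 1 m wide window dragged from 2 m to 3 m away becomes 1.5 m wide,
// so it keeps occupying the same portion of the user's field of view.
```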
So many thousands of design and engineering decisions went into making gaze-and-pinch-based navigation work this simply that I can understand why it hasn’t been done this effectively until now.