I use both Sonnet 4.5 and Opus 4.5 to edit lisp (emacs lisp, to be precise) and run into this issue extremely infrequently. Not sure if they have some special handling for this, but it seems to work ok. I do have this problem with Gemini and, less frequently, with Qwen.
I think the problem is that our traditional notions of "understanding" and "intelligence" fail us. I don't think we understand what we mean by "understanding". Whatever the LLM is doing inside, it's far removed from what a human would do. But on the face of it, from an external perspective, it has many of the same useful properties as if done by a human. And the LLM's outputs seem to be converging closer and closer to what a human would produce, even though there is still a large gap. I suggest the focus here shouldn't be so much on what the LLM can't do but on the speed at which it is getting better at doing things.
I think there is only one thing we should focus on: measurable capability on tasks. Understanding, memorization, reasoning, etc. are all just shorthands we use to quickly convey an idea of capability on a kind of task. One can also attempt to describe mechanistically how the model achieves that capability, but that is very difficult. This is where you would try to describe your sense of "understanding" rigorously. To keep it simple, for example: I think when you say that the LLM does not understand, what you must really mean is that you reckon its performance will quickly decay as the task gets more difficult along various dimensions (depth/complexity, verifiability of the result, length/duration/context size), to the degree that it is still far from being able to act as a labor-delivering agent.
Pretty cool! Having used immersed [1] for a little while, I really enjoyed having my desktop inside VR. However, the weight of the Oculus was just impractical; I could not do an 8-hour shift with it. This sounds much more promising. Rather than a laptop, though, this should be a standalone device one could plug into existing PCs / laptops, methinks...