It fails at _some_ arithmetic. Humans also fail at arithmetic...
In any case, is that the defining characteristic of having a good enough "world model"? What distinguishes your ability to understand the world from an LLM's? From my perspective, you would prove it by explaining it to me, in much the same way an LLM could.