Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

In what sense is your "experience" (mediated through your senses) more valid than a language model's "experience" of being fed tokens? Token input is just a type of sense, surely?



It's not that I think multimodal input is important. It's that I think goals and experimentation are important. GPT does not try to do things, observe what happened, and draw inferences about how the world works.


> In what sense

In the sense that the chatbox itself behaves as a sensory input to chatgpt.

Chatgpt does not have eyes, tongue, ears, but it does have this "mono-sense" which is its chatbox over which it receives and parses inputs


I would say it's not a question of validity, but of the additional immediate, unambiguous, and visceral (multi sensory) feedback mechanisms to draw from.

If someone is starving and hunting for food, they will learn fast to associate cause and effect of certain actions/situations.

A language model that only works with text may yet have an unambiguous overall loss function to minimize, but as it is a simple scalar, the way it minimizes this loss may be such that it works for the large majority of the training corpus, but falls apart in ambiguous/tricky scenarios.

This may be why LLMs have difficulty in spatial reasoning/navigation for example.

Whatever "reasoning ability" that emerged may have learned _some_ aspects to physicality that it can understand some of these puzzles, but the fact it still makes obvious mistakes sometimes is a curious failure condition.

So it may be that having "more" senses would allow for an LLM to build better models of reality.

For instance, perhaps the LLM has reached a local minima with the probabilistic modelling of text, which is why it still fails probabilistically in answering these sorts of questions.

Introducing unambiguous physical feedback into its "world model" maybe would provide the necessary feedback it needs to help it anchor its reasoning abilities, and stop failing in a probabilistic way LLMs tend to currently do.




Consider applying for YC's Fall 2025 batch! Applications are open till Aug 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: