>Models don't need to have been trained on every single possibility - it's possible for them to generalize and interpolate/extrapolate.

They do have some in-distribution generalisation capabilities, but human intentions are not a generalisation of visual information.



"human intentions are not a generalisation of visual information" is a bit confusing category-wise. Question would be to what extent you can predict someone's next action, like running out to retrieve a ball, given just what a human driver can sense.

Clearly that's possible to some extent, and in principle some system receiving the same inputs should be able to reach human-level performance on the task, but it seems very challenging given those constraints.

Also, for clarity, note that these limitations don't require that the model be trained only on driver-view data. It may be, for instance, that reasoning capability is better learned through text pretraining.



