
My point was that it appears that Tesla doesn't have a large and varied dataset. It has a small and pre-selected data set, since the cars only transmit data when pre-determined triggers are fired. Thus, it doesn't matter how many thousands of drivers Tesla "has" or how many "situations" they're in, since it isn't actually collecting data from most of those situations.

And Autopilot's performance (including its numerous regressions) suggests very strongly either that it doesn't have a very large data set, or else that it has a large data set of everyone doing roughly the same thing almost all of the time. These are the two most logical explanations for Autopilot's tendency to veer toward freeway dividers even (especially?) after updates.



> My point was that it appears that Tesla doesn't have a large and varied dataset. It has a small and pre-selected data set,

I don't think that is a fair characterization. First, the notion of "large" is fairly subjective. More importantly, the fact that they collect data on pre-determined triggers is exactly what keeps the dataset from being dominated by a few routes (let's say the 280 and 101 in the Bay Area and Elon's commute in LA), so that it has good coverage of the world instead.

Their ability to trigger on specific situations allows them to grow the dataset quickly in a supervised way. It is all about the granularity of these triggers: imagine that you can express "collect situations in tunnels with jerk higher than X m/s³" or "collect all aborted lane changes in snowy conditions". Within the next 24 hours you get data from all over the world, and this data is automatically tagged and classified by the neural net.
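
To make the idea of trigger granularity concrete, here is a rough sketch of what such triggers could look like if each one were a named predicate over a snapshot of vehicle state. To be clear, this is purely illustrative: `Snapshot`, `Trigger`, `collect`, and all field names are made up for this example and say nothing about how Tesla actually implements it.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Snapshot:
    # Hypothetical per-moment vehicle state; every field name is an assumption.
    in_tunnel: bool
    jerk_m_s3: float  # rate of change of acceleration, in m/s^3
    weather: str      # e.g. "clear", "snow"
    event: str        # e.g. "none", "lane_change_abort"


@dataclass(frozen=True)
class Trigger:
    name: str
    predicate: Callable[[Snapshot], bool]


def tunnel_jerk_trigger(threshold_m_s3: float) -> Trigger:
    """'Collect situations in tunnels with jerk higher than X m/s^3'."""
    return Trigger(
        name=f"tunnel_jerk_gt_{threshold_m_s3}",
        predicate=lambda s: s.in_tunnel and abs(s.jerk_m_s3) > threshold_m_s3,
    )


# 'Collect all aborted lane changes in snowy conditions'.
SNOW_LANE_CHANGE_ABORT = Trigger(
    name="snow_lane_change_abort",
    predicate=lambda s: s.weather == "snow" and s.event == "lane_change_abort",
)


def collect(
    snapshots: List[Snapshot], triggers: List[Trigger]
) -> List[Tuple[str, Snapshot]]:
    """Return (trigger name, snapshot) pairs for every trigger that fires.

    The trigger name doubles as a label for the collected sample.
    """
    return [(t.name, s) for s in snapshots for t in triggers if t.predicate(s)]
```

In this sketch the finer the predicates, the more targeted the collected set, and anything that matches no trigger is simply never uploaded, which is the point both sides of this thread are arguing about.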


Shadow mode would also transmit data, no?



