> My point was that it appears that Tesla doesn't have a large and varied dataset. It has a small and pre-selected data set,
I don't think that is a fair characterization. First, the notion of "large" is fairly subjective. More importantly, the fact that they collect data on pre-determined triggers is precisely what keeps the dataset from being over-fitted (let's say to the 280 and 101 in the Bay Area and to Elon's commute in LA) and gives it good coverage of the world instead.
Their ability to trigger on specific situations lets them grow the dataset quickly in a supervised way. It is all about the granularity of these triggers: imagine that you can express "collect situations in tunnels with jerk higher than X m/s³" or "collect all aborted lane changes in snow conditions", ... Within the next 24h you get data from all over the world, and this data is automatically tagged and classified by the neural net.
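
To make the idea of trigger granularity concrete, here is a minimal sketch of how the two example triggers above could be written as predicates over a snapshot of vehicle state. This is purely illustrative and does not describe Tesla's actual system: `Snapshot`, `Trigger`, `make_triggers`, `fired_triggers` and all field names are hypothetical, and the jerk threshold stays a parameter because the original "X" is unspecified.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical summary of what the car observed at one moment."""
    scene: str                 # e.g. "tunnel", "highway" (assumed labels)
    weather: str               # e.g. "clear", "snow" (assumed labels)
    jerk_m_s3: float           # rate of change of acceleration, in m/s^3
    lane_change_aborted: bool  # True if a lane change was started and abandoned


@dataclass(frozen=True)
class Trigger:
    """A named condition; a matching snapshot would be uploaded with this tag."""
    name: str
    predicate: Callable[[Snapshot], bool]


def make_triggers(jerk_threshold: float) -> List[Trigger]:
    """Build the two example triggers; the threshold is the unspecified 'X'."""
    return [
        Trigger(
            "tunnel_high_jerk",
            lambda s: s.scene == "tunnel" and abs(s.jerk_m_s3) > jerk_threshold,
        ),
        Trigger(
            "lane_change_abort_snow",
            lambda s: s.lane_change_aborted and s.weather == "snow",
        ),
    ]


def fired_triggers(snapshot: Snapshot, triggers: List[Trigger]) -> List[str]:
    """Return the names of all triggers whose condition matches the snapshot."""
    return [t.name for t in triggers if t.predicate(snapshot)]
```

In this sketch, a snapshot taken in a tunnel with a jerk above the chosen threshold would come back tagged `tunnel_high_jerk`, so the label arrives with the data rather than being added by hand afterwards. Finer-grained triggers are just more predicates in the list.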