Engineers will move around. You can't lock up knowledge that's in people's heads...

_ugfj · on Aug 28, 2018

The moat is the dataset. Waymo could hand out their hardware-software like candy and still be way, way ahead.

tedsanders · on Aug 28, 2018

I'm not so sure. How strong of a moat is the data, really?

Waymo leads right now with 7 million miles driven.

If you wanted to drive that many miles in the next year, how much would it cost?

I estimate it would cost maybe $60M or so:

- $30M on cars (assuming a fleet of 200 cars driving 100 miles a day, costing $150K each)

- $5M on safety drivers (assuming $20/hr, 30 mph)

- $1M on fuel (assuming 30 mpg, $4/gal)

- $5M on insurance (no idea)

- $5M for a garage

- $5M for a couple dozen techs/engineers working in the garage

- $10M for overhead, supplies, other costs?

And this total of ~$60M includes a bunch of upfront fixed costs (mostly cars, but also the garage). In year two the cost would drop by about half, to ~$30M.
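For anyone who wants to check the arithmetic, here is a rough Python sketch of the estimate above. The numbers come straight from the list; the function and constant names are made up for illustration, and treating insurance, staff, and overhead as costs that recur every year (while cars and the garage are paid once) is my assumption.

```python
# Back-of-the-envelope model of the estimate above.
# All figures are rough guesses from the list, not real Waymo data.

FLEET_SIZE = 200                # cars
MILES_PER_CAR_PER_DAY = 100
DAYS_PER_YEAR = 365
CAR_COST = 150_000              # dollars per car
DRIVER_WAGE = 20                # dollars per hour
AVG_SPEED_MPH = 30
FUEL_MPG = 30
FUEL_PRICE = 4                  # dollars per gallon
INSURANCE = 5_000_000           # per year (assumed recurring)
GARAGE = 5_000_000              # one-time (assumed upfront)
STAFF = 5_000_000               # per year (assumed recurring)
OVERHEAD = 10_000_000           # per year (assumed recurring)


def annual_miles():
    """Miles the whole fleet drives in one year."""
    return FLEET_SIZE * MILES_PER_CAR_PER_DAY * DAYS_PER_YEAR


def fixed_costs():
    """Upfront costs paid only in year one."""
    return FLEET_SIZE * CAR_COST + GARAGE


def recurring_costs():
    """Costs paid every year the fleet is running."""
    miles = annual_miles()
    drivers = miles / AVG_SPEED_MPH * DRIVER_WAGE
    fuel = miles / FUEL_MPG * FUEL_PRICE
    return drivers + fuel + INSURANCE + STAFF + OVERHEAD


def year_one_cost():
    return fixed_costs() + recurring_costs()


def year_two_cost():
    return recurring_costs()


if __name__ == "__main__":
    miles = annual_miles()
    print(f"Miles per year:     {miles:,}")
    print(f"Year one cost:      ${year_one_cost():,.0f}")
    print(f"Year two cost:      ${year_two_cost():,.0f}")
    print(f"Year one $ per mile: {year_one_cost() / miles:.2f}")
```

With these inputs the fleet covers a bit over 7 million miles a year, so the totals land close to the rounded figures above. Changing any constant shows how sensitive the estimate is to that guess.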

By my math, if you had a working system and were bottlenecked only by data, you could catch up to Waymo's 7M miles for just $60M. Maybe my estimates are way off and it's actually $100M. Or even $200M. That's expensive, but I'm not so sure it's a moat when we're talking about companies with tens of billions in cash on hand chasing a market that could eventually be worth a trillion dollars.

At this stage, I think if you have a working system, then by my math the cost per training mile is under $10 (roughly $60M over 7M miles in the first year, and less after that).

I wouldn't call that a moat, but maybe you do.

dzdt · on Aug 28, 2018

I agree with you: data is not so much of a moat as a fixed expense for anyone wanting to join in. And for the big automakers or Uber, it is not such a big cost to pay.

Your cost estimate did miss the compute costs of scaling ML training to ingest that data, as well as test courses and simulation to amplify and elaborate on tricky cases found in the data. Adding those in adds tens of millions to the estimate but doesn't change the fundamental analysis.

I think the real moat is in fleet networked data. If you have a lot of cars on the road and they are sharing info about strange situations to expect, it could be a big advantage for how "smart" the driving seems. I am thinking of things like broken stoplights or badly placed traffic cones. A networked solution can alert other cars to the puzzling condition and its interpretation. Then later cars passing that location can proceed more confidently than if each vehicle had to work out an interpretation itself.

svantana · on Aug 28, 2018

Arguably it's not a pure ML problem where you can just learn what to do from human drivers. You need to know what the alternatives are, which actions are potentially dangerous, etc. So you run your "beta" system and record when the safety driver takes over. Then you improve the system from that and drive some more. These iterations will take a lot of time, and you can't just scale it with more engineers (see The Mythical Man-Month etc.).

_ugfj · on Aug 30, 2018

Maybe. It took them a long time to build up the current dataset, so I am not sure it was only $60M. Also, what about the machine learning model they built from it?