This is why I personally feel like Tesla's approach is more likely to "win". The fundamental blocker to self-driving cars is not sensing / sensor fusion, it is intelligence. And the Tesla approach seems much more likely to achieve functional intelligence than Waymo's.

I like both approaches. The fact that both exist is a clear win for consumers.

Tesla's approach seems like a bet that A) AI will reach human-level driving intelligence before lidar becomes cost-efficient, in which case their current sensors will be sufficient to achieve at least human-level performance; and B) ~human-level performance will be sufficient to achieve large-scale consumer and regulatory acceptance. Waymo seems to be taking the other side of that bet.

If Tesla is right, their solution should scale faster, and they can worry about adding superhuman sensory capabilities later. If Waymo is right, all the Cybercabs that Tesla is pumping out right now are destined for the scrapyard, or at best will spin their wheels in beta testing for years while Waymo speeds ahead.

Tesla is putting its money on the bull case for self-driving as a whole. If Tesla wins that bet, it means we all get access to a useful version of the tech years earlier. If Waymo wins, that's great too, but it means that for better or worse lidar will be a bottleneck to scaling the tech.

The whole thing is basically a rehash of Intel vs TSMC on EUV in the 2010s.

While I agree with basically all of this, and find the FSD on my Tesla to be quite useful, a question pops into my mind.

Why can't Waymo ALSO develop the same smarts and just also solve the sensor fusion issue such that they can use the right set of sensors in the right environmental conditions, and then leapfrog Tesla's capabilities?

I thought about this and I think it boils down to how the model is trained.

Tesla trains its models on actual drivers, purely from vision (input) to actuators (output): brake, steering, accelerator.

Human output is based on what they and the camera see. So, it's a 1:1 match.
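To make the vision-in, controls-out idea concrete, here's a minimal behavior-cloning sketch. It's a toy linear model fit on synthetic data, nothing like a production network, and the function names and array shapes are made up for the example.

```python
import numpy as np

def fit_behavior_clone(frames, controls):
    """Least-squares map from flattened camera frames to driver controls.

    frames:   (n_samples, height, width) array of grayscale images
    controls: (n_samples, 3) array of [steering, brake, accelerator]
    (Both layouts are assumptions for this illustration.)
    """
    X = frames.reshape(len(frames), -1)
    X = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    weights, *_ = np.linalg.lstsq(X, controls, rcond=None)
    return weights

def predict_controls(weights, frame):
    """Predict [steering, brake, accelerator] for a single frame."""
    x = np.append(frame.ravel(), 1.0)
    return x @ weights

# Synthetic "driving log": random 4x4 frames and controls that depend
# linearly on the pixels, so the toy model can recover them.
rng = np.random.default_rng(0)
frames = rng.random((200, 4, 4))
true_map = rng.normal(size=(17, 3))
controls = np.hstack([frames.reshape(200, -1), np.ones((200, 1))]) @ true_map

weights = fit_behavior_clone(frames, controls)
predicted = predict_controls(weights, frames[0])
```

The point is only that the recorded human action is paired with exactly the input the model sees. Adding a second sensor would widen `X` with columns the human driver never had access to.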

If Waymo were to do that, it would muddle the training set. The lidar input may override the camera input.

I always struggled when Musk mentioned that lidar will make things ambiguous. It didn't make any sense to me why having a secondary fallback sensor messes things up. But if you put it in the training-data context, it absolutely makes sense.

This is an interesting viewpoint, but isn't it also solvable?

Just because the human in the scenario only took vision as input, why does that matter to the training data and the model? The actions are the same.

To put it another way, what about all the cultural context the human had, or the sounds, smells, past experiences at the same intersection, etc? Even Tesla can't record this, but I'm not sure that matters.

The biggest issue with using both camera and lidar is how to properly resolve conflicting returns from different sensor types.
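For a sense of what "resolving" could look like in the simplest case, here's a sketch that fuses two range estimates by inverse-variance weighting and flags when they disagree too much to be explained by noise. The function name, the sigma values, and the 3-sigma gate are all assumptions for illustration, not anyone's actual pipeline.

```python
import math

def fuse_ranges(cam_m, cam_sigma, lidar_m, lidar_sigma, gate=3.0):
    """Inverse-variance fusion of two range estimates (in meters).

    Returns (fused_range_m, conflict). `conflict` is True when the two
    estimates differ by more than `gate` combined standard deviations.
    """
    combined_sigma = math.sqrt(cam_sigma**2 + lidar_sigma**2)
    conflict = abs(cam_m - lidar_m) > gate * combined_sigma

    # Weight each sensor by how precise it claims to be.
    w_cam = 1.0 / cam_sigma**2
    w_lidar = 1.0 / lidar_sigma**2
    fused = (w_cam * cam_m + w_lidar * lidar_m) / (w_cam + w_lidar)
    return fused, conflict

agree = fuse_ranges(20.0, 2.0, 21.0, 0.1)     # sensors roughly agree
disagree = fuse_ranges(20.0, 2.0, 40.0, 0.1)  # sensors clearly conflict
```

Note that the sketch only detects the conflict. Deciding which sensor to believe once `conflict` is True is the hard part.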

> such that they can use the right set of sensors in the right environmental conditions

Because this part is really hard, and that's why Tesla abandoned the fusion approach. You cannot possibly foresee all the conditions in which LIDAR or any active sensor will malfunction/return wrong data/return data that's only slightly off for that ONE specific time. And even if it doesn't, you need to trust it to not return noise. And when it does return noise, how do you classify it as noise?

Cameras are passive sensors - they get whatever light comes in and turn it into an image. Camera is capturing shapes that make sense to the neural nets: it's working. See all black/white/red/cannot see any shapes? Camera is not working, exclude it from the currently used set of sensors or weigh it less when applying decisions, because it's returning no signal (and yes, neural nets have their own set of problems).
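A crude sketch of that "exclude it or weigh it less" logic, assuming 8-bit grayscale frames and using pixel contrast as a stand-in for "can the net see any shapes". The names and the `min_std` threshold are invented for the example; a real system would use something far more robust.

```python
import numpy as np

def camera_weight(frame, min_std=5.0):
    """Return 0.0 for a frame with almost no contrast, else 1.0.

    frame: 2-D array of 8-bit grayscale pixel values (assumed layout).
    A uniformly black, white, or single-color frame has near-zero
    standard deviation and is treated as "no signal".
    """
    return 0.0 if float(np.std(frame)) < min_std else 1.0

def weighted_estimate(estimates, weights):
    """Weighted average of per-sensor estimates; None if all are excluded."""
    total = sum(weights)
    if total == 0:
        return None
    return sum(e * w for e, w in zip(estimates, weights)) / total

blinded = np.zeros((8, 8))                        # all-black frame
healthy = np.tile(np.arange(0, 256, 32), (8, 1))  # frame with a gradient

weights = [camera_weight(blinded), camera_weight(healthy)]
distance = weighted_estimate([10.0, 30.0], weights)  # only the healthy camera counts
```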

EDIT: cameras also provide more continuous context: if 1 pixel is off, say clearly bright red in a mostly green scene where no poles can be identified, the neural net will average it out and discard it as noise. If 1 pixel says "object" in LIDAR, do you trust it to be correct? Perhaps the ray just hit a bird or a fly, but you only see a point: a lossy summary of the information you need.

But why can't you apply all that same logic and processing to LIDAR as well? Maybe we're not there yet, but what about in 5-10 years when we are?

There is noise on LIDAR returns too. No one considers a single LIDAR point to be a collision hazard.
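One common way to handle stray returns is to drop points that have too few neighbors nearby. Here's a minimal sketch of that idea; the `radius` and `min_neighbors` values are arbitrary assumptions, and the brute-force distance matrix is only reasonable for small point counts.

```python
import numpy as np

def drop_isolated_points(points, radius=0.5, min_neighbors=2):
    """Keep only points with at least `min_neighbors` others within `radius`.

    points: (n, 3) array of x, y, z coordinates in meters (assumed layout).
    Uses an O(n^2) pairwise distance matrix, so it is a toy, not a
    replacement for a spatial index on a full scan.
    """
    points = np.asarray(points, dtype=float)
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    neighbor_counts = (dists <= radius).sum(axis=1) - 1  # exclude the point itself
    return points[neighbor_counts >= min_neighbors]

cloud = [
    [10.0, 0.0, 0.0],
    [10.1, 0.0, 0.0],
    [10.0, 0.1, 0.0],
    [10.1, 0.1, 0.0],
    [50.0, 5.0, 1.0],  # lone return, e.g. a bird or an insect
]
filtered = drop_isolated_points(cloud)
```

Here the four clustered points survive and the lone return is discarded.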

Because they don't have a fleet of millions of people labeling the data for them and paying for the privilege of doing so. Waymo has about 3700 vehicles. Tesla has millions. Waymo only operates in known environments and collects a very limited range of data. Tesla collects data everywhere that people drive their cars.

They could in theory. If they put at least as much emphasis on the AI side as Tesla does. Or if someone else cracked vehicle AI wide open and left it open for them to copy, and then they did exactly that, and found a way to bolt on their extra sensors in a useful fashion while at it.

As is, Waymo's playing it smarter than Cruise did, but they're not all in on AI yet. So I don't expect them to "leapfrog Tesla" in that dimension - and it's the key dimension to self-driving.

The main reason Teslas don't have LIDAR is hardware cost and maintenance cost, not improved safety.

Maybe also that cars with a LIDAR rig on the roof are appallingly ugly.

Tesla wants to make EVs that look like normal cars (Cybertruck being the oddball here, admittedly).

I got downvoted for saying this last time the topic came up but constraints focus a project. It’s best to start work with as few variables as possible, and only add new ones when absolutely necessary.

I'm working on a similar problem in computer vision and we're quickly approaching the point where our pure vision work is better than our Lidar supported track because we've had to deal with the constraints instead of having a crutch to lean on.

I agree, but these are also the exact constraints that lead to an early leader getting overtaken by a longer-term but better set of plans. Not saying that's the case here, but the success Waymo has had so far, compared with really everything Tesla has produced, says quite a bit about the likelihood of the approach working, even if it's not there yet.

You can have intelligence with lidar.

You can have even more intelligence with both.

Naaah, Tesla has no edge in intelligence either. It's just a PR piece to sell to investors.