Since lidar has distance information and cameras do not, it was always a ridiculous idea by a certain company to use cameras only. Cars that use lidar are going to replace at least the ones that don't take advantage of this obvious answer to obstacle-detection challenges.

Karpathy provided additional context on the removal of LiDAR during his Lex Fridman Podcast appearance. This article condenses what he said:

https://archive.is/PPiVG

And here's one of Elon's mentions (he also has talked about it quite a bit in various spots).

https://xcancel.com/elonmusk/status/1959831831668228450?s=20

Edit: My personal view is that LiDAR and other sensors are extremely useful, but I worked on aircraft, not cars.

Based on that list, it seems to boil down to two things:

- cost (no longer a problem)

- too much code needed and it bloats the data pipelines. Does anyone have any actual evidence of this being the case? Like yes, code would be needed, but why is that innately a bad thing? "Bloated data pipelines" feels like another hand-wave; I think if you do it right it's fine, as Waymo has proven.

Really curious if any Tesla engineers feel like this is still the best way forward, or if it's just a matter of having to listen to the big guy, Musk.

I’ve always felt that relying on vision only would be a detriment, because even humans with good vision get into circumstances where they get hurt due to temporary vision hindrances. Think heavy snow, heavy rain, heavy fog, or even just cresting a hill at a certain time of day when the sun flashes you.

Just for the record though, Musk isn't blindly anti-LIDAR. He has said (and I think this is an objective fact) that all existing roads and driving are based on vision (which is what all humans do). So that should technically be sufficient. SpaceX uses LIDAR for their docking systems.

I would argue that yes, we do use vision but we get that "lidar depth" from our stereo vision. And that used to be why I thought cameras weren't enough.

But then look at all the work with gaussian splatting (where you can take multiple 2d samples and build a 3d world out of it). So you could probably get 80% there with just that.

The ethos of many Musk companies (you'll hear this from many engineers that work there) is simplify, simplify, simplify. If something isn't needed, take it out. Question everything that might be needed.

To me, LIDAR is just one of those things in that general pattern of "if it isn't absolutely needed, take it out" – and the fact that FSD works so well without it proves that it isn't required. It's probably a nice to have, but maybe not required.

Humans aren't using only fixed vision for driving. This is such a tiresome thing to see repeated in every discussion about self driving.

You're listening to the road and the car sounds around you. You're feeling vibration from the road. You're feeling feedback through the steering wheel. You're using a combination of monocular and binocular depth perception; plus, your eyes are not fixed-focal-length "cameras". You're moving your head to change the perspective you see the road from. Your inner ear is telling you about your acceleration and orientation.

And also, even with the suite of sensors that humans have, their vision perception is frequently inadequate and leads to crashes. If vision was good enough, "SMIDSY" wouldn't be such an infamous acronym in vehicle injury cases.

For those of us not aware of Australian cycling jargon, "SMIDSY" means "Sorry, Mate, I Didn't See You".

The issue is clearly attention, not vision, when it comes to humans. If we could actually process 100% of the visual information in our field of view, accidents would probably go down a shitload.

Attention is perhaps the limiting factor, but being able to look in two directions at once would help, and would help greatly if we had more attention capacity. E.g. anytime you change lanes you have to alternate between looking behind, beside, and in front and that greatly reduces reaction time should something unexpected happen in the direction you aren't currently looking...

Humans have both issues. There are many human failures which are distinctly a vision issue and not attention related, e.g. misestimation of depth/speed, obscured or obstructed vision, optical focus issues, insufficient contrast or exposure, etc.

But how many of those crashes not caused by inattention could have been avoided with less idiocy and more defensive driving? I mean, yes, we can’t see as well in fog, but that’s why you should slow down.

Again, I'm still not saying that humans don't make bad decisions. I'm saying that, unequivocally, they also get into accidents while paying attention and being careful, as a result of misinterpretation or failure of their senses. These accidents are also common, for example:

* someone parking carefully, misjudges depth perception, bumps an object

* person driving at night, their eyes failed to perceive a poorly lit feature of the road/markings/obstacles

* person driving and suddenly blinded by bright object (the sun, bright lights at night)

* person pulling out in traffic who misinterprets their depth perception and therefore misjudges the speed of approaching traffic

* people can only focus their eyes at one distance at a time, and it takes time to refocus at a different distance. It is neither unsafe nor unexpected for humans to check their instruments while driving, but it can take the human eye hundreds of milliseconds to refocus under normal circumstances. If you look down, focus, look back up, and refocus as quickly as you can at highway speeds, you will have travelled quite a long distance.

These types of failures can happen not as a result of poor decision making, but of poor perception.
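To put rough numbers on that last bullet, here's a back-of-envelope sketch. The highway speed and refocus time below are assumed illustrative values, not measurements:

```python
# Back-of-envelope: distance travelled while the eyes refocus.
MPH_TO_MPS = 0.44704

speed_mps = 70 * MPH_TO_MPS      # highway speed, ~31.3 m/s (assumed)
refocus_time_s = 0.5             # look down, focus, look up, refocus (assumed)

distance_m = speed_mps * refocus_time_s
print(f"{distance_m:.1f} m travelled while refocusing")   # 15.6 m
```

That's several car lengths covered while effectively blind to the road.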

In theory, a computer should be able to do the same. It could do sensor fusion with even more sense modalities than we have. It could have an array of cameras and potentially out-do our stereo vision, or perhaps even use some lightfield magic to (virtually) analyze the same scene with multiple optical paths.

However, there is also a lot of interaction between our perceptual system and cognition. Just for depth perception, we're doing a lot of temporal analysis. We track moving objects and infer distance from assumptions about scale and object permanence. We don't just repeatedly make depth maps from 2D imagery.

The brute-force approach is something like training vision-language models (VLMs). E.g. you could train on lots of movies and learn to predict "what happens next" in the image domain.

But, compared to LLMs, there is a bigger gap between the model and the application domain with VLMs. It may seem like LLMs are being applied to lots of domains, but most are just tiny variations on the same task of "writing what comes next", which is exactly what they were trained on. Unfortunately, driving is not "painting what comes next" in the same way as all these LLM writing hacks. There is still a big gap between that predictive layer, planning, and executing. Our giant corpus of movies does not really provide the ready-made training data to go after those bigger problems.

Putting your point another way, in order to replicate an average human driver’s competence you would need to make several strong advancements in the state of the art in computer vision _and_ digital optics.

In India (among others), honking is essential to reducing crashes.

We often greatly underestimate / undervalue the role of our ears relative to vision. As my film director friend says, 80% of the impact in a movie is in the sound.

The day a Waymo can functionally navigate the streets of Mumbai is when we really have achieved L5.

Most of what you said has nothing to do with lidar vs camera

Beyond 20 meters, motion-based depth perception is more accurate than stereoscopic vision. What is lidar helping to solve here?

Waymo claims its system, which uses a combination of LIDAR and vision, resolves objects up to 500 meters away.

https://waymo.com/blog/2024/08/meet-the-6th-generation-waymo...

This company claims their LIDAR works conservatively at 250 m, and up to 750 m depending on reflectivity.

https://www.cepton.com/driving-lidar/reading-lidar-specs-par...

> So that should technically be sufficient

Sufficient to build something close to human performance. But self driving cars will be held to a much higher standard by society. A standard only achievable by having sensors like LiDAR.

If a self-driving car had the exact vision of humans, it would still be better because it has better reaction times. Never mind the fact that humans can't actually process all the visual information in our field of view, because we don't have the broad attention to do that. It's very obvious that you can get superhuman performance with just cameras.

Whether that's worth completely throwing away LiDAR is a different question, but your argument is just obviously false.

This reminds me of the time I was distantly following a Waymo car at speed on 101 in Mountain View during rush hour. The Waymo brake lights came on first followed a second or two later by the rest of the traffic.

Better reaction times only matter if the decisions are the same / better in every case. Clearly we are not there on that aspect of it yet.

Deciding to crash faster, or "tell human to take over" really fast is NOT better.

Even if they weren’t going to be held to a higher standard for widespread acceptance, tens of thousands of people a year in the US die due to humans driving badly. Why would we not try to do better than that?

Because that's an acceptable loss and better costs more!

LIDAR also struggles in heavy rain, snow, fog, and dust. Check how Waymo handles such conditions.

It's not only failing, it's causing false positives.

Sufficient if all else were equal. But the human brain and artificial neural networks are clearly not equal. This is setting aside the whole question of whether we hope to equal human performance or exceed it.

Teslas have at least 3 forward facing cameras giving them plenty of depth vision data.

They also have several cameras all around providing constant 360° vision.

To do gaussian splatting anywhere near real time, you need good depth data to initialize the gaussian positions. This can of course come from monocular depth, but then you are back to monocular depth vs lidar.

Mentioning gaussian splatting as a reason we don't need lidar depth is a great example of Musk-esque technobabble: seemingly correct at a surface level, but nonsense to any practitioner. One of the biggest problems of all SfM techniques is that the results are scale-ambiguous, so they do not in fact recover the crucial real-world depth measurement you get from lidar.

Now you might say "use a depth model to estimate metric depth". But spend 5 minutes thinking about why a magic math box that pretends to recover real depth from a single 2D image is a very, very sketchy proposition when it needs to be correct for emergency braking rather than for some TikTok bokeh filter, and you will see that doesn't get you far either.

This is not really true if you have multiple cameras with a known baseline, or well-known motion characteristics like you get from an accelerometer plus wheel speed.
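For rectified stereo with a known baseline, metric depth does fall out of disparity directly. A minimal sketch; the focal length, baseline, and disparity below are made-up illustrative numbers, not any real car's camera specs:

```python
import numpy as np

def stereo_depth_m(disparity_px, focal_px, baseline_m):
    """Metric depth from disparity in a rectified stereo pair.

    depth = f * B / d: a known physical baseline B pins down the
    real-world scale that monocular structure-from-motion leaves ambiguous.
    """
    d = np.asarray(disparity_px, dtype=float)
    return np.where(d > 0, focal_px * baseline_m / np.maximum(d, 1e-9), np.inf)

# Made-up numbers: 1000 px focal length, 30 cm baseline.
# A feature with 15 px of disparity sits 20 m away.
print(stereo_depth_m(15.0, focal_px=1000.0, baseline_m=0.3))   # 20.0
```

The catch in practice is that depth error grows quadratically with distance for a fixed baseline, which is part of why the stereo-vs-lidar debate doesn't end here.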

Why is this getting downvoted? It's good faith and probably more accurate than not.

> and the fact that FSD works so well without it proves that it isn't required

The reports that Tesla submits on Austin Robotaxis include several of them hitting fixed objects. This is the same behavior that has been reported on for prior versions of their software of Teslas not seeing objects, including for the incident for which they had a $250M verdict against them reaffirmed this past week. That this is occurring in an extensively mapped environment and with a safety driver on board leads me to the opposite conclusion that you have reached.

If Waymo has proven their model works, why is the silly automaker doing several orders of magnitude more autonomous miles?

They aren't. Tesla has logged some 800k total miles with their robotaxi vehicles, including miles with safety drivers. Waymo has logged 200M driverless miles. That's 0.4% of the mileage, with the most generous possible framing.

My understanding is that there's more data processing required with cameras because you need to estimate distance from stereoscopic vision. And as it happens, the required chips for that have shot up in price because of the AI boom.

But I think costs were just part of the reason why Elon decided against Lidar. Apparently, they interfere with each other once the market saturates and you have many such cars on the same streets at the same time. Haven't heard yet how the Lidar proponents are planning to address that.

How does Waymo handle it now? There are many videos of Waymo depots with dozens of cars not running into each other.

Lidar critics like to pretend that anti-collision is not a well-studied branch of Computer Science and telecoms. Wifi, Ethernet and cellphones all work well simultaneously, despite participants all sharing the same physical medium.

[deleted]

The points linked repeatedly focus on cost and complexity as justification, even explicitly stating Musk's desire to minimise components in Karpathy's list.

They don’t focus on safety or effectiveness except to say that vision should be ‘sufficient’. Which is damning with faint praise imho.

If that link was meant to argue that the removal of sensors makes perfect sense, I have to point out that anyone who reads it would likely have their negative viewpoint hardened. It was done to reduce cost (back when the sensors cost thousands) and out of a ridiculous desire by Musk for minimalism. It’s the same desire that removed the indicator stalk, I might add.

To be clear, from a personal standpoint, I am pro-more sensors and sensor fusion.

I assume Musk, et al are acting in best faith in trying to find the right compromises.

Why would you assume Musk is acting in good faith? That’s very much not his thing.

Oh, you sweet summer child..

Instead of betting on RADAR and LIDAR hardware getting better and costs going down, they went with a vision-only approach. Everybody in this field knows the strengths and weaknesses of each system. Multi-modal sensor fusion is the way to go for L4 autonomy. There is no other way to reduce the risk. Vision-only will never be able to achieve L4 in all weather conditions. Tesla may try to demonstrate L4 in limited geography and in good weather conditions, but it won't scale.

The reasoning is cynical but sound. If the system uses only the sensing modes people have, it will make the mistakes people do. If a jury thinks "well, I could have done that too!" you win. It doesn't matter if your system has fewer accidents if some of the failure modes are different from human ones, because the jury will think "how could it not figure that out?"

I don't think that's the reasoning.

The reasoning was simply that LIDAR was (and incorrectly predicted to always be) significantly more expensive than cameras, and hypothetically that should be fine because, well, humans drive with only two eyes.

Musk miscalculated on 1) cost reduction in LIDAR and 2) how incredible the human brain is compared to computers.

Having similar sensors certainly doesn't guarantee your accidents look the same, so I don't think your logic is even internally sound.

Sensor fusion is also hard to get right: since you still need cameras, you have to fuse the two information streams. That's mainly a software problem, and companies like Waymo have solved it, but Tesla was having trouble with it earlier. If you don't do it right, your self-driving system can be less reliable.

Sensor fusion seems like it'd be a big problem when you're handcoding lots of C++, and way less of a problem when all the sensors are just feeding into one big neural network, as Tesla and probably others are doing now. The training process takes care of it from there.

One of Udacity's first courses was on self-driving, taught by Sebastian Thrun who later cofounded Waymo. He went through some Bayesian math that takes a collection of lidar points, where each point contributes to a probabilistic assessment of what's really going on. It's fine if different points seem to contradict each other, because you're looking for the most likely scenario that could produce that combined sensor data. Transformers can do the same sort of thing, and even with different sensor types it's still the same sort of problem.

> Sensor fusion is also hard to get right, since you still need cameras you have to fuse the two information streams

The response to the challenge shouldn't be whittling down your sensor-suite to a single type, but to get good at sensor fusion.

I think this is the key. In theory, more information streams, when fused together properly, should reduce error. If their stumbling block is the "properly" part, then the rest of those justifications come off as a pretty weak way to sidestep their own inability to deliver this properly.

We have lots of evidence of similar strategies being used in other domains, this seems like an especially life-critical domain that ought to have high rigor and standards applied.

> how incredible the human brain is compared to computers.

It is pretty incredible but people will (rightly so?) hold automated drivers to an ultra high standard. If automated driving systems cause accidents at anywhere near the human rate, it'll be outlawed pretty quickly.

> If automated driving systems cause accidents at anywhere near the human rate, it'll be outlawed pretty quickly.

This is evidently false. Robotaxi crash rates exceed human drivers', but there's not an effective regulatory agency to outlaw them!

https://futurism.com/advanced-transport/tesla-robotaxis-cras...

According to that article, Waymo crashes 2.3x more often than human drivers (every 98k miles vs 229k miles), which is clearly false. I think it's far more likely that humans don't report most minor collisions to insurance, and that both Robotaxis and Waymo are safer than human drivers on average.

> According to that article, Waymo crashes 2.3x more often than human drivers (every 98k miles vs 229k miles), which is clearly false.

Why is it clearly false? It might be false, but clearly? I would definitely like to see evidence either way.

> I think it's far more likely that humans don't report most minor collisions to insurance, and that both Robotaxis and Waymo are safer than human drivers on average.

That sounds like you are trying to find reasons to get the conclusion you want.

The NHTSA requires a report when any automated driving system hits any object at any speed, or if anything else hits the ADS vehicle resulting in damage that is reasonably expected to exceed $1,000.[1] In practice, this means that everyone reports any ADS collision, since trading paint between two vehicles can result in >$1k in damage total.

If you go to the NHTSA's page regarding their Standing General Order[2] and download the CSV of all ADS incidents[3], you can filter where the reporting entity is Waymo and find 520 rows. If you filter where the vehicle was stopped or parked, you'll find 318 crashes. If you scan through the narrative column, you'll see things like a Waymo yielding to pedestrians in a crosswalk and getting rear-ended, or waiting for a red light to change and getting rear-ended, or yielding to a pickup truck that then shifted into reverse and backed into the Waymo. In other words: the majority of Waymo collisions are due to human drivers.

So either Waymos are ridiculously unlucky, or when these sorts of things happen between two human driven cars, it's rarely reported to insurance. In my experience, if there's only minor damage, both parties exchange contact info and don't involve the authorities. Maybe one compensates the other for damage, or maybe neither party cares enough about a minor dent or scrape to deal with it. I've done this when someone rear-ended me, and I know my parents have done it when they've had collisions.

If human driven vehicles really did average 229k miles between any collision of any kind, we'd see many more pristine older vehicles. But if you pay attention to other cars on the road or in parking lots, you'll see far more dents and scratches than would be expected from that statistic. And that's not even counting the damage that gets repaired!

1. See page 13 of https://www.nhtsa.gov/sites/nhtsa.gov/files/2025-04/third-am...

2. https://www.nhtsa.gov/laws-regulations/standing-general-orde...

3. https://static.nhtsa.gov/odi/ffdd/sgo-2021-01/SGO-2021-01_In...

Definitely. I looked at Tesla's source for these numbers, looks like they primarily used data sourced from police reports, which most people only file if the incident is serious enough to turn into insurance.

Tesla notes:

> These assumptions may contain limitations with respect to reporting criteria, unreported incident estimations (e.g., NHTSA estimates that 60% of property damage-only crashes and 32% of injury crashes are not reported to police)

https://www.tesla.com/fsd/safety
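Taking that NHTSA underreporting estimate at face value, a quick illustrative calculation (using the rounded figures from this thread, not an official analysis) shows how much it moves the comparison:

```python
# Illustrative only: how underreporting skews miles-per-crash comparisons.
human_reported_interval_mi = 229_000  # miles between *reported* human crashes
reported_fraction = 0.40              # NHTSA: ~60% of property-damage crashes go unreported
waymo_interval_mi = 98_000            # ADS operators must report essentially every crash

# If only 40% of human crashes are reported, the true interval shrinks:
human_true_interval_mi = human_reported_interval_mi * reported_fraction
print(int(human_true_interval_mi))    # 91600 -- roughly comparable to Waymo's rate
```

Under that assumption, the apparent 2.3x gap mostly evaporates, which is the point being argued above.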

> Musk miscalculated on 1) cost reduction in LIDAR

Given that Musk has a history of driving lower costs, it's unlikely he overestimated the long-term cost floor. He just thought we were close to self-driving in 2014.

Another factor is Andrej Karpathy, who was the primary architect for the vision-only approach. Musk wanted fewer parts, and Karpathy believed he could deliver that. Karpathy is still an advocate of vision-only.

Right, for the reasons that I just mentioned

Musk has never been scared of vertically integrating something that's too expensive initially.

> Musk miscalculated on 1) cost reduction in LIDAR and 2) how incredible the human brain is compared to computers.

And, less excusably, ignorant of how incredible human eyes are compared to small-sensor cameras. In particular, high dynamic range in low light, with fast motion. Every photographer knows this.

And also ignorant about how those two eyes have binocular vision, adjustable positions, and can look in multiple mirrors for full spatial awareness.

There are good arguments but this isn’t one. Many humans (like me!) drive fine without binocular vision. And the cars have many cameras all around, with wide angle lenses that are watching everything all the time, when a human can only focus in one direction at a time.

I thought only the front view has binocular vision on the cars. The others are single, with no depth perception. How does it know how close objects are outside this forward cone?

https://www.researchgate.net/publication/378671275/figure/fi...

So your eye does not have an adjustable position and you cannot use mirrors?

Both are easily compensated for by having many cameras.

Binocular vision is not only relevant for driving (well, maybe for the steering wheel, but that's not the point).

It gives us depth perception. And moving the eyes and/or head gives the depth perception over a wide field of view.

Eh, I think ‘miscalculation’ might be giving too much credit about good intentions.

He wanted (needed?) to get on the self-driving hype train to pump up the stock price. He knew there was zero chance at the time they could sell it at the price point lidar (or even other effective sensors, like radar) required, so he sold it anyway at the price point people would buy it at, even though it was never plausibly going to work at the level being promised.

There is a word for that. But I’m sure there are many lawyers that will say it was ‘mere fluffery’ or the like. And I’m sure he’ll get away with it, because more than enough people are complicit in the mess.

Miscalculation assumes there was a mistake somewhere, but as near as I can tell, it is playing out as any reasonable person expected it to, given what was known at the time.

I think Musk is really not as smart as he thinks he is and this specific thing was probably an earnest mistake. Lots of other fraudulent stuff going on though of course!

IMHO not using lidars sounds like a premature optimisation and a complication, with a level of hubris.

This is a difficult problem to solve and perhaps a pragmatic approach was/is to make your life as simple as possible to help get to a fully working solution, even if more expensive, then you can improve cost and optimise.

Considering he also runs a company that puts computer chips inside brains to augment them you’d think he ought to have a more sound understanding as to the limits of both.

There certainly is a pretty ongoing miscalculation regarding human intelligence and, consequently, empathy.

Seeing the SOTA in FSD tech, it is not obvious that Musk has miscalculated so far.

Nah

If the data were positive for Tesla, Tesla would publish it

They do not, so one can infer it is not flattering

(Before you post the "Miles driven with FSD" chart, you should know upfront (as Tesla must) that chart doesn't normalize by age of vehicle or driving conditions and is therefore meaningless/presumably designed to deceive)

Until a lawyer points out that other cars see that. My car already has various sensors, and in manual driving it sounds alarms if there is a danger I seem not to have noticed. (There are false alarms, but most of the time I did notice and probably should have left more safety margin, even though I wouldn't have hit it.)

Also, regulators gather statistics, and if cars with a given feature do better, they will mandate it.

Very recent issue with Waymo: https://dmnews.co.uk/waymo-robotaxi-spotted-unable-to-cross-.... This is 17 years after they bet the farm on LIDAR, with no signs it's ever going to be cost-effective, or better than multiple cameras with millisecond reactions and 360-degree coverage that never get tired, drunk, or distracted, backed by other cheaper sensors and a NN trained on billions of real-world data points.

Tesla does not handle rain well either. This is not a LIDAR problem, it is a problem with self driving cars in general.

My Tesla can't even tell if it should turn the wipers on consistently or correctly. Let alone drive in the rain.

A feature that is bulletproof in other cars with a very boring and industry standard sensor (it's not even expensive), while Tesla insisted they could do it with just normal cameras.

Seriously. Why do people think a company that can't do automatic wipers could possibly do automatic driving?

The same people that seriously thought we’d have a mars base by now.

People also don't handle rain well.

That's an example of it failing safe. I'd rather it did that than drive me into a sinkhole because it thought it was a puddle.

Ok, so Waymo is useless in the rain then; kind of limiting. But at least on the 0.000000000001% of occasions it actually is a sinkhole, you won't damage the bumper.

I'd rather a Waymo be useless in the rain rather than a Tesla be actively dangerous and likely to kill me.

Tesla ""autopilot"" fatalities: 65

Waymo fatalities: 0

Autopilot isn’t Full Self-Driving (FSD); most cars these days ship with smart cruise control (which is basically what Autopilot is). Do you have fatality statistics for FSD?

If we are just talking about smart cruise control, most cars are using cameras and radar, not lidar yet. But Tesla is special since it doesn’t even use radar for its smart cruise control implementation, so that could make it less safe than other new cars with smart cruise control, but Autopilot was never competing with Waymo.

> Waymo fatalities: 0

By some measures Waymo is actually at -1 fatalities. There has been one confirmed birth of a child in a Waymo. https://apnews.com/article/baby-born-waymo-san-francisco-6bd...

I think the car would have to be more actively involved in the process for that to count. :)

There is also a report from the same flooding in LA of a Waymo driving into a flooded road and getting stuck.

They might have flipped a switch after that, causing this.

Dude, that's not a 'puddle' as the article claims; that's a body of water where it's not even visually obvious whether it's safe to drive through. Maybe I'm a bad driver, but I'd hesitate to drive through that in a small car myself.

I think the difference is the prior knowledge a commuter has of that section of road. Does it always flood shallowly in heavy rain?

Even without prior knowledge, seeing others safely navigate the same section will lower your estimated risk.

The amount of water will depend on the rain, so we don't know how shallow it is even with prior knowledge.

If you drive the road every day, you probably do. If you can see someone drive through it (perhaps someone who knows the area well and knows how deep it is based on puddle width), you definitely do.

[deleted]

>A vehicle got stuck trying to figure out an obstacle so sensors with less information are better than sensors with more information.

It is sound to think that cameras, plus an accelerometer, plus data about the car and environment (the kind you get from your ears), ought to be able to mimic and improve on human driving. However, humans' general-purpose spatial awareness and ability to integrate all kinds of general information is probably really hard to replicate. A human would realize that an orange fluid spilling across the road might be slippery, or guess where a person might go from the way their eyes are pointing...

It may just be faster to make lidar cheap. And lidar can do things humans can't.

IIUC, the cameras in a Tesla have worse vision (resolution) at far distances than a human. So while in the abstract your argument sounds fine, it'll crumble in court when a lawyer points out that a similarly sighted driver would've needed corrective lenses.

Most accidents happen because people are human, aren't paying attention, are inebriated, not experienced enough drivers, or reckless.

It's not fair to say that vision based models will "make the same mistakes people do" as >99% of the mistakes people make are avoidable if these issues were addressed. And a computer can easily address all those issues

Which means the mistakes vision-based models make today are unique to them.

This is a new and flawed rationale that I haven't heard before. Tesla cameras are worse (lower resolution, sensitivity, and dynamic range) than human eyes and don't have "ears" (microphones).

The cars do have at least one microphone.

Inside the car though, right? With multiple exterior microphones they could do spatialization like Waymo.

Pretty hard to do if your whole selling point is ‘better and safer than human’ however?

> Since lidar has distance information and cameras do not, it was always a ridiculous idea by a certain company to use cameras only

Human eyes do not have distance information, either, but derive it well enough from spatial (by ‘comparing’ inputs from 2 eyes) or temporal parallax (by ‘comparing’ inputs from one eye at different points in time) to drive cars.

One can also argue that detecting absolute distance isn’t necessary to drive a car. Time-to-contact may be more useful. Even just detecting "change in bearing" can be sufficient to avoid collision (https://eoceanic.com/sailing/tips/27/179/how_to_tell_if_you_...)
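The constant-bearing collision rule is easy to demonstrate with a toy example (made-up trajectories, both vehicles converging on the same point):

```python
import math

def bearing_deg(own, other):
    """Bearing from point `own` to point `other`, in degrees."""
    return math.degrees(math.atan2(other[1] - own[1], other[0] - own[0]))

# We head east at 10 m/s; the other car heads south at 10 m/s.
# Both reach (100, 0) at t = 10 s, i.e. a true collision course.
for t in (0.0, 1.0, 2.0):
    own = (10.0 * t, 0.0)
    other = (100.0, 100.0 - 10.0 * t)
    print(round(bearing_deg(own, other), 1))   # stays at 45.0 every step
```

The range shrinks but the bearing never changes, which is exactly the cue the sailing link describes: constant bearing plus decreasing range means collision, and no absolute distance measurement is needed to detect it.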

Having said that, LiDAR works better than vision in mild fog, and if it’s possible to add a decent absolute distance sensor for little extra cost, why wouldn’t you?

Human/animal vision uses way more than parallax to judge distances and bearings - it uses a world model that evolved over millions of years to model the environment. That's why we can get excellent 3D images from a 2D screen, and also why our depth perception can be easily tricked with objects of unexpected size. Put a human or animal in an abstract environment with no shadows and no familiar objects, and you'll see that depth perception based solely on parallax is actually very bad.

> Human eyes do not have distance information

Single human eyes do resolve depth. Not as well as binocular vision, but you don't lose all depth perception if you lose an eye.

https://en.wikipedia.org/wiki/Monocular_vision

Human eyes are much better than cameras at dealing with dynamic range. They’re also attached to a super-computer which has been continuously trained for many years to determine distances and classify objects.

I don’t like the comparison between machines and humans. Humans don’t travel around at 100mph in packs of other humans. Why not use every sensor type at our disposal if it gives us more info to make decisions? Yes, I understand it’s more complicated, but we figure stuff out.

Let me know when you have a camera package with human eye equivalency.

As I understand, lidars don't work well in rain/snow/fog. So in the real world, where you have limited resources (research and production investment, people talent, AI training time and dataset breadth, power consumption) that you could redistribute between two systems (vision and lidar), but one of the systems would contradict the other in dangerous driving conditions — it's smarter to just max out vision and ignore lidar altogether.

> lidars don't work well in rain/snow/fog.

Neither do cameras, or eyeballs.

When it's not safe to drive, it's not safe to drive.

I've been in zero-road-speed whiteout conditions several times. The only move to make is to the side of the road without getting stuck, and turning on your flashers.

Low-light cameras would not have worked. Sonar would not have worked. Infrared would not have worked.

I think the weather where cameras/sensors start having problems is much better than zero-vis whiteout.

If we could make sensors that let an autonomous vehicle drive reliably in any snow/rain where a human could drive (albeit carefully), then we'd be good. But we are a long way from that. Especially since a lot of sensor tech like cameras tends to fail in two ways: performance degrades in adverse conditions, and the sensor can simply stop functioning at all if it's covered in ice/snow/water.

Radar might still have worked

If you have multi-return lidar, you can see through certain occlusions. If the fog/rain isn't that bad, you can filter for the last return and get the hard surface behind the occlusion. The bigger problem with rain is that you get specular reflection and your laser light just flies off into space instead of coming back to you. Lidar not work good on shiny.
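
The last-return trick above is simple enough to sketch (a toy example, my own function name; real lidar drivers expose this per-point as a return index and intensity): droplets and fog tend to produce earlier, weaker returns, so keeping only the farthest return per beam recovers the hard surface behind them.

```python
def last_returns(beams):
    """Given per-beam lists of return ranges (meters) from a multi-return
    lidar, keep only the farthest (last) return for each beam that saw
    anything. Early returns from fog/rain droplets are discarded; the
    last return is usually the hard surface behind the occlusion."""
    return [max(ranges) for ranges in beams if ranges]

# Beam 1 hits a fog droplet at 3.1 m and a wall at 42.0 m;
# beam 2 hits only the wall; beam 3 gets no return at all.
surfaces = last_returns([[3.1, 42.0], [41.8], []])
```

In practice you would also gate on return intensity, since a weak last return can itself be noise, but the filtering principle is the same.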

No, it isn't "smarter." Camera-only driving is the product of a stubborn dogmatic boss who can't admit a fundamental error. "Just make it work" is a terrible approach to engineering.

[deleted]

Can hatred of Musk not derail this entire thread please? I have a camera-only ADAS that I think works quite well, but having both would be better.

Criticism of Musk isn't hate of Musk. The point is completely valid and the results of this management style infuses all of his businesses albeit with differing results.

It's significant that a truly hard problem like autonomous driving doesn't respond to a "brute force" management style. Rockets aren't in this category because the required knowledge and theory is fairly complete, whereas real autonomous driving is completely novel.

Shoe, meet foot.

I don't know what that means

https://en.wiktionary.org/wiki/if_the_shoe_fits,_wear_it

Oh, that's silly. I don't own a Tesla. I just wanna talk about LIDAR without people ragebaiting about Elon.

> without people ragebaiting about Elon

Hmm. Is it ragebaiting to respond to a tired and wrong statement by saying that it's tired and wrong and that the situation is merely the product of piss poor management decisions? People get understandably frustrated seeing the same wrong talking point that people with domain knowledge in computer vision and robotics have repeatedly explained is wrong in extremely fundamental ways.

> I don't own a Tesla.

n.b. The shoe/foot comment was not about you. It was about Musk. It wouldn't make any idiomatic sense for the expression to be about you given what you said and what you were responding to. If they'd said "pot, meet kettle", then it would have been about you. In that context, saying that you don't own a Tesla feels like a weird thing for you to insert in your comment. It potentially comes across as suspiciously defensive.

suspiciously defensive??? you got me. Or maybe I just didn't understand their comment.

I'm just trying to help you out here, friend.

Why does this matter? You have to slow down in rain/snow/fog anyway, so only having cameras available doesn't hurt you all that much. But then in clear weather lidar can only help.

If your vision is good enough to drive in rain/snow/fog, you don't need lidar in clear conditions. If you planned to spend $10B on vision and $10B on lidar — you would be better off spending $20B on better vision.

We have actual proof this isn’t true. Waymo is light years ahead of Tesla despite spending less.

Tesla is spending upwards of $6B/year to Waymo’s $1.5B. Only one of these companies makes an autonomous robotaxi that’s actually autonomous.

Yes, but how much of that is due to the lidar vs camera choice?

It still infuriates me that Tesla went so long being able to call their feature “Autopilot.” Then they had the audacity to call it user error when people thought the car would automatically pilot itself.

> If yo[u can] drive in rain/snow/fog, you don't need lidar in clear conditions

Of course you do, you're driving at much higher speeds and so is the surrounding traffic. You can't just guess what you might be looking at, you have to make clear decisions promptly. Lidar is excellent in that case.

Nothing works perfectly in all conditions and scenarios. Sensor fusion is the most logical approach now and for the foreseeable future.

Computer vision does not work exactly like human vision, closely equating the two has tended to work out poorly in extreme circumstances.

High performance fully automated driving that relies solely on vision is a losing bet.

Why does that strategy absolutely require the lidar to be absent from the car? When was less technology the solution to a software problem?

People who don't understand that sensor fusion is an entire field of study with tons of existing work and lots of expertise have been fooled by a fake argument of "If the camera and lidar disagree, what do you do?"

It's frustrating to still see it repeated over a decade later. It was always bullshit. It was always a lie.

Limited resources? Billions per year are being thrown at the base technology. We have the capital deployed to exhaust every path ten times over.

Even if so, it doesn't mean that capital deployment efficiency and expected payoff make equal sense in all directions.

Then again, it's good that we have self-driving companies with lidar and without — we will find out which approach wins.

We have already found out, Waymo is SAE Level 4, Tesla is SAE Level 2

The Swiss cheese model would like to disagree.

When you have sensor ambiguity, that sounds like the perfect time to fail safely and slow to a halt unless the human takes over.

Evidence clearly shows otherwise.

Also, military sensor use shows the best answer is to have as many different types of sensors as possible and then do sensor fusion. So machine vision, lidar, radar, etc.

That way you pick up things that are missed by one or more sensor types, catch problems and errors from any of them, and end up with the most accurate ‘view’ of the world - even better than a normal human would have.

It’s what Waymo is doing, and they also unsurprisingly, have the best self driving right now.

Do cameras work well in those conditions? Nope. Cameras also don't deal well with certain angles of glare, so as a consumer I'd rather have something over-engineered for my safety to cover all edge cases...

This is silly. Cameras are cheap. Have both. Sensors that behave differently in different conditions are not an exotic new problem. The Kalman filter has existed for about a billion years, and machine-learning filters do an even better job.
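
For anyone who hasn't seen it, the core of that fusion is a few lines. Here's a toy 1-D sketch (names and numbers are mine) of the Kalman measurement update, combining two noisy range estimates by inverse-variance weighting:

```python
def fuse(z_cam, var_cam, z_lidar, var_lidar):
    """Fuse two noisy range estimates (meters, with variances) via the
    measurement-update step of a 1-D Kalman filter. The fused variance
    is always <= the smaller input variance: a second sensor never
    makes the estimate worse, even when the sensors disagree."""
    k = var_cam / (var_cam + var_lidar)   # Kalman gain: trust ratio
    z = z_cam + k * (z_lidar - z_cam)     # fused estimate
    var = (1 - k) * var_cam               # fused uncertainty
    return z, var

# Camera says 50 m (variance 9), lidar says 48 m (variance 1):
# the fused estimate leans toward the more certain lidar reading.
z, var = fuse(50.0, 9.0, 48.0, 1.0)
```

This is the standard answer to "if the camera and lidar disagree, what do you do?": you weight each sensor by how much you trust it in the current conditions, and the combined estimate is provably no worse than the better sensor alone.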

Cameras are cheap, but, as I understand:

1) it's not cheap to produce lidars at a stable predictable quality in millions;

2) car driving training data sets for lidars are much scarcer (and will always be much scarcer due to cameras' higher prevalence) and at a much lower quality;

3) combined camera+lidar data sets are even scarcer.

> 1) it's not cheap to produce lidars at a stable predictable quality in millions;

It wasn't cheap to produce accelerometers at a stable predictable quality in millions before smart phones either. Mass production shakes things up somewhat. See the headline for reference.

Doesn’t that make it a sensible long term play to equip your car with $200 LIDAR and start gathering that data as a competitive advantage?

Yeah, this is all about Musk not wanting to admit he was wrong.

1. Automotive LiDAR is down to $350 in China already. BYD is starting to put LiDAR in even entry level cars. (It's been in their mid and high end cars for a while).

2+3. BYD collects extensive training data from customers, much like Tesla does. They will have no trouble with training.

I'm not an expert on ML vision, but I do have a Tesla and it seems to be able to tell how far away things are just fine. I'm not sure what would be wrong with the vision system that lidar needs to fix.

The phantom braking issue with auto pilot tells me it can’t. A shadow from a tree doesn’t trigger your brakes locking up at 70+ mph when there’s a lidar sensor to tell you it’s not a physical object.

“Just buy FSD” isn’t a reasonable answer to a problem literally no other automaker suffers from.

Stopped using autopilot because of the phantom braking.

It's also recently gotten much worse at lane departure sensing, often confused by snow or slightly faded road markers. Not pleasant to have the alarms go off while calmly and safely driving.

How do you explain the reports of Robotaxis running into fixed objects? If what you are saying is true that shouldn't be able to happen.

https://electrek.co/2026/02/17/tesla-robotaxi-adds-5-more-cr...

Luckily everyone else in the comments is an expert. And also doesn't recognize that Teslas already drive themselves and did not need lidar. They also mischaracterize the reasoning.

> I'm not an expert on ML vision, but I do have a Tesla

Well, you did get a chuckle out of me, so that's something!

Yeah it's BS. Tesla uses lidar where it makes sense: They have a small lidar fleet to collect ground truth depth data for better vision estimation. This part is long solved.

> I'm not sure what would be wrong with the vision system that lidar needs to fix.

This conversational disconnect is as old as the hills:

1. Person 1 asks "what's wrong" (if it ain't broke don't fix it)

2. Person 2 wants to make something better

My meta-goal here on HN (and many places where people converse) is for people to step back and recognize the conversational context and not fall into the predictable patterns that prevent us from making sense of the world as best as we can.

Yeah, even if they could match human-level stereo depth perception with AI, why would they say "no" to superhuman lidar capabilities? Cost could be a somewhat acceptable answer if there weren't problems with the camera-only approach, but there are still examples of silly failures. And if I remember correctly, they also removed their other superhuman sensor, the radar, in newer models: the one that in certain conditions could sense multiple cars ahead by bouncing the signal beneath other cars.

Because they don't have superhuman LIDAR. They never did. Nobody ever did. LIDAR input is not completely reliable so what do you do then?

There are more practical difficulties than just cost. If you have lidar, it must be calibrated relative to all other sensors. Bumps in the road, weather, thermals: this all causes drift, which is non-trivial. Waymos are constantly brought in and recalibrated. The advantage of camera-only is you have fewer moving parts, which is not insignificant.

But cost isn't as much of an issue.

It's not that simple. Cameras don't report 3D depth, but these AI models can and do pick up on pictorial depth cues. LiDAR is incredibly valuable for collecting training and validation data, but may also make only an insignificant difference in production inference.

Stereo cameras? My 2015 Subaru has them to detect obstacles and it works great.

Just say Tesla, why censor yourself.

I have a suspicion here on HN. When criticizing big tech, especially Google and FB, at a certain time of day a specific cohort comes online and downvotes. Suspiciously, that is the time when one could conclude people in the US start working or come online. Either fanboys, employees, or an organized group of users trying to silence big-tech criticism.

I have no proof of course and it might be coincidence, or just difference of mindset between US citizens and Europe citizens. It happened a few times already and to me looks sus.

But if they actually read (and don't just ctrl+F <company name>), then not writing the company name but hinting at it in an obvious way doesn't help either.

I have seen this happening multiple times, some to fairly reasonable comments with a just tiny negative tone.

There is also flagging abuse which effectively kills the comment /post.

I know for a fact at least one bigger US company has a Slack bot that brings up any mention of $companyname on Hacker News...

It's been my experience that hn and reddit have a very high overlap in audience these days. The jerrybreakseverything crowd. Anything anti-tesla, anti-grok, is applauded.

Yeah, I agree with GP, pretty much anything that isn't effusively praising tesla or elmu etc will tend to get reflexively downvoted.

Considering cameras can create reliable-enough distance measurements AND also handle all the color perception needed to legally drive on roads, it was always a ridiculous idea by a certain set of people that lidar is necessary.

No, cameras cannot create reliable distance measurements in real-world conditions. Parallax is not a great way to measure distance for fast, unpredictably moving objects (such as cars on the road). And dirt or misalignment can significantly reduce accuracy compared to lab conditions.

Note that humans do not rely strictly on our eyes as cameras to measure distances. There is a huge amount of inference about the world based on our internal world models that goes into vision. For example, if you put us in a false-perspective or otherwise highly artificial environment, our visual acuity goes down significantly; conversely, people with a single eye (so no parallax-based measurement ability) still have quite decent depth perception compared to what you'd naively expect. Not to mention, our eyes are kept very clean, and maintain their alignment to a very high degree of precision.

I don't think they meant literally cameras only can create reliable distance measures. At the risk of putting words in their mouth, I would guess they meant "cameras as the only input to a distance model". the "model" doing all the heavy lifting, covering the points that you quite rightly point out are needed

Several companies, most notably Tesla, have done this well enough to drive in all manner of traffic. I'm not going to comment about if lidar is strictly needed or not to achieve better-than-human safety, that's yet to be proven one way or another by anyone. The point is that cameras + local inference can do a pretty good job at distance estimation

Stereo cameras are useless against repeating patterns. They easily match neighboring copies. And there are lots of repeating or repeating-like patterns that computers aren't smart enough to handle.

You can solve this by adding an emitter next to the camera that does something useful, be it just beaconing lights or noise patterns or phase-synced laser pulses. And those "active cameras" are what everyone calls LIDARs.

There is plenty of evidence showing that cameras alone are not safe enough, and even Tesla has realized that removing radar to save cost was a mistake.

'cameras can see in color, therefore lidar is unnecessary for self driving' is unconvincing

> ridiculous idea by a certain set of people that lidar is necessary.

"Necessary"? Seems like a straw man, don't you think? I strive to argue against the strongest reasonable claim someone is making.

Lots of reasonable people suggest LIDAR is helpful to fill in gaps when vision is compromised, degraded, or less capable.

People running businesses, of course, will make economic trade-offs. That's fine. But don't confuse, say, Elon's economic tradeoff with the full explanation of reality which must include an awareness that different sensors have different strengths in different contexts.

So, when one thinks about what sensor mix is best for a given application, one would be wise to ask (and answer) such questions as:

- What is the quality bar?

- What sensors are available?

- How well do various combinations of sensors work across the range of conditions that matter for the quality bar?

- WRT "quality bar": who gets to decide what matters? The company making the cars? The people who drive them? Regulators who care about public safety? The answer: it is a complex combination.

It is time to dismiss any claim (or implication) that "technology good, regulation bad". That might be the dumbest excuse for a philosophy I've ever heard. It is the modern-day analogue of "Brawndo's got what plants crave." Smart people won't make this argument outright, but unfortunately, their claims sometimes reduce to this level of absurdity. Neither innovation nor regulation are inherently good nor bad. There are deeper principles in play.

Yes, some individuals would use their self-proclaimed freedom to e.g. drive without seatbelts at 100 mph at night with headlights off. An extreme example, but it is the logical extension of pure individualism run amok. Regulators and anyone who cares about public safety will draw a line somewhere and say "No. Individual stupidity has a limit." Even those same people would eventually come to their senses after they kill someone, but by then it is too late.

Humans don't have explicit distance sensors either. When LIDAR sensors were $20k+ I think it made a lot of sense to avoid them.

It's not complicated. LIDAR hardware was in short supply during COVID. Elon obviously couldn't slow down production and sink the inflated stock price.

April 2019: https://www.youtube.com/live/Ucp0TTmvqOE?t=9220s

There are probably even earlier statements from him against lidar...

WTF was their calculus on the break-even liability point? The "if we do this, we save X amount of money, but stand to lose Y in lawsuits over crashes that LIDAR could otherwise have prevented" calculation.

All of driving is designed for visual.

TIL roads don't have rumble strips

I wouldn’t take too much issue with the “cameras are enough” claim if cameras actually performed like eyes. Human eyes have high dynamic range and continuous autofocus performance that no camera can match. They also have lids with eyelashes that can dynamically block light and assist with aperture adjustment.

The appeal to human biology and argument against fusion between disparate sensors kinda falls flat when you’re building a world model by fusing feeds from cameras all around the car. Humans don’t have 8 eyes in a 360 array around their head. What they do have is two eyes (super cameras) on ~180 degree swiveling and ~180 degree tilting gimbal. With mics attached that help sense other vehicles via road noise. And equilibrioception, vibration detection, and more all in the same system, all fused. If someone were actually building this system to drive the car, the argument based on “how did you drive here today?” gets a lot stronger. One time I had some water blocking my ear and I drove myself to the hospital to get it fixed. That was a shockingly scary drive — your hearing is doing a lot of sensing while driving that you don’t value until it’s gone.

Certain company has 300k subscribers that rely on that ridiculous service.

My father lost vision in one eye and 50% in the other something like 20 years ago. He struggles with parking but is otherwise doing OK without lidar. Turns out motion vision is more accurate than stereoscopic vision beyond 10-20 meters.

One camera can't really produce depth/distance information, but two cameras sure can. The eyes in your head don't capture distance information individually, but with two eyes you can infer distance.

You're forgetting the nervous system and the brain connected to those eyes (and vestibular system).

Why would you assume I "forgot" about any of that? It's implied. That's what "infer" means in that sentence. Of course it requires a brain and nervous system. Maybe you don't know what the word "infer" means?

That fake indignation doesn't change the fact that you equated two cameras with two eyes, and if we're going that hard on semantics, you used the word produce for the cameras.

> One camera can't really produce depth/distance information, but two cameras sure can.

I'll preface by saying lidar should be used with autonomous vehicles.

Individual cameras don't have distance information, but you can easily calibrate a system of cameras to give you distance information. Your eyes do this already, albeit not quantitatively. The quantitative part comes from math our brains aren't setup to do in real time.
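
The quantitative part the commenter mentions is a one-line formula once the rig is calibrated. A toy sketch (the function and numbers are mine, for illustration) of depth from a rectified stereo pair:

```python
def stereo_depth(focal_px, baseline_m, disparity_px):
    """Depth from a calibrated, rectified stereo pair: Z = f * B / d,
    where f is the focal length in pixels, B the baseline between the
    cameras in meters, and d the disparity in pixels. Depth error grows
    quadratically as disparity shrinks, i.e. for distant objects."""
    if disparity_px <= 0:
        raise ValueError("zero/negative disparity: object at infinity or bad match")
    return focal_px * baseline_m / disparity_px

# f = 1000 px, 12 cm baseline, 4 px disparity -> 30 m
depth = stereo_depth(1000.0, 0.12, 4.0)
```

The formula also shows why calibration drift matters so much in a car: at 30 m the disparity here is only 4 pixels, so a sub-pixel misalignment between the cameras translates into meters of depth error.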

It was cost, wasn't it?

If this lowers lidar costs, and Tesla has spent all this time refining the camera technology, now they can have both.

Use both.

It was a great decision to drop LiDAR. The cars are running excellently without it

[dead]

I find it comical that people continue to go back to this rage well against "a certain company" for their vision-only approach when the truth is they have the best automatic driving system an individual can buy, rivaling Waymo and beating the Chinese brands.

Why are the commenters not pissed at the dozens of other car companies who have done absolutely nothing in this space? Answer: because it's not nearly as fun to be pissed at Kia or Mercedes or whoever. Clearly they are just enjoying the shared anger, regardless of whether it is justified.

Because other car companies don't have CEOs who've been super confident about predicting actual full self driving either "this year" or "next year" for the past decade. If Ford had been swearing up and down they'd have full self driving cracked any day now for ten years, and been charging people for the hardware along the way, everyone would be pissed at them too.

Surely you already know this, so why pretend otherwise?

1. Tesla is not competitive with Waymo, they're not even in the same class. Waymo is 10 years ahead at least. I understand you can't buy a Waymo, but still.

2. Other car companies are properly valued, Tesla is overinflated.

3. Other cars, even basic Hondas, have the same level of self driving as Teslas.

4. Other car companies don't lie to their customers about their capabilities or what they're buying.

> Other cars, even basic Hondas, have the same level of self driving as Teslas.

This is not true at all. Don't confuse lane assist with self driving. And yes I'm aware people are upset by the "Autopilot" product name they chose for lane assist.

You're way off if you think that Waymo and FSD are anywhere close.

There is certainly some truth that "some company" overpromised and underdelivered. They advertise "full self driving" but then hide in the fine-print that "oh jk, not really, but its still full self driving if anyone asks ;) ;) ;)"

I think the frustration stems from the obvious falsehoods in the advertising, and the doubling-down on the tech, despite the well-documented weaknesses of the implementation.

Have you driven in Tesla FSD recently? If anything it’s undersold. It’s an absolute miracle. I use it everyday.

Please be courteous to other drivers on the road; we all share it. Just make sure you’re the one in charge, not the software. This isn’t to put your argument down, but to offer the perspective of people involved in accidents. Loss of life is bad, but surviving a serious accident can be nearly as devastating.

Why make things more complicated than they need to be? Humans don't have lidar and we are the only intelligence that can reliably drive. Lidar just seems like feature engineering, which has proven to be a dead end in most other AI applications (bitter lesson).

https://www.cs.utexas.edu/~eunsol/courses/data/bitter_lesson...

> Why make things more complicated than they need to be? Humans don't have lidar and we are the only intelligence that can reliably drive.

Because we want self driving cars to be safer than human driven cars.

If humans had built-in lidar, we would use it when driving.

Read the comment again. It's not that vision is "good enough", it's that feature engineering doesn't work

Self driving cars are not equipped with human brains so this doesn’t really make sense.

“We should achieve self driving cars via replicating the human brain” strikes me as an incredibly inefficient and difficult way to solve the problem.

Then you deeply underestimate how difficult the problem is, and deeply misunderstand where all the effort has been spent in developing autonomous vehicles.

If all the effort has been spent in trying to replicate the human brain then I am comfortable saying that is a mistake.

We have a tool that can tell with great accuracy how far away an object is. The suggestion that we should ignore it and rely on cameras that have to guess it because “that’s how humans work” is absurd, frankly.

> we are the only intelligence that can reliably drive.

Science would like to point out that rats also can learn to drive

https://theconversation.com/im-a-neuroscientist-who-taught-r...

yeah but not reliably, they often totally space on their commitments to pick you up from the airport, etc

If you had to choose between picking someone up at the airport or dragging a slice of pizza twice your size down the NYC subway stairs, what would YOU do?

The bitter lesson I think is a great way of explaining the logic behind Tesla's strategy. People aren't getting it.

Whether or not it'll actually work remains to be seen, but it's a perfectly reasonable strategy. One counterargument would be that the bitter lesson can be applied to LIDAR too; you don't have to use that data for feature engineering just because it seems well suited for it.

Humans can drive with eyes only, but we are better drivers when we can also use other senses like hearing. If humans had lidar, we would use it when driving.

Don't cars already use a ton of sensors that don't reproduce human senses and ways of doing things?

This knee-jerk reply is old and tired, and the counterarguments are well-trod at this point. Even if cameras-only can build a car that’s as good as humans, why should we settle for “as good as“ humans, who cause 40,000 fatalities a year in the US? If we can do better than humans with more advanced sensors, we are practically morally obligated to do that.

Yes! The smart and nuanced panoply of replies to the GP are a wonderful counterbalance to people "just saying things that pop into their head" -- which is unfortunately how I view a lot of human speech nowadays :/