The purpose of lidar is to prove error correction when you need it most in terms of camera accuracy loss.

Humans do this, just in the sense of depth perception with both eyes.

Human depth perception uses stereo out to only about 2 or 3 meters, after which the distance between your eyes is not a useful baseline. Beyond 3m we use context clues and depth from motion when available.

Thats going off just the focal point.

We do a lot more internal image processing. For example, relative motion as seen by either eye helps improve accuracy by a whole lot, in the "medium" distance range.

Thanks, saved some work.

And I'll add that it in practice it is not even that much unless you're doing some serious training, like a professional athlete. For most tasks, the accurate depth perception from this fades around the length of the arms.

ok, but a care is a few meters wide, isn't that enough for driving depth perception similar to humans

The depths you are trying to estimate are to the other cars, people, turnings, obstacles, etc. Could be 100m away or more on the highway.

ok, but the point trying to be made is based on human's depth perception, but a car's basic limitation is the width of the vehicle, so there's missing information if you're trying to figure out if a car can use cameras to do what human eyes/brains do.

Humans are very good at processing the images that come into our brain. Each eye has a “blind spot” but we don’t notice. Our eyes adjust color (fluorescent lights are weird) and the amount of light coming in. When we look through a screen door or rain and just ignore it, or if you look outside a moving vehicle to the side you can ignore the foreground.

If you increase the distance of stereo cameras you probably can increase depth perception.

But a lidar or radar sensor is just sensing distance.

Radar has a cool property that it can sense the relative velocity of objects along the beam axis too, from Doppler frequency shifting. It’s one sense that cars have that humans don’t.

To this point, one of the coolest features Teslas _used_ to have was the ability for it to determine and integrate the speed of the car in front of you AND the speed of the car in front of THAT car, even if the second car was entirely visually occluded. They did this by bouncing the radar beam under the car in front and determining that there were multiple targets. It could even act on this: I had my car AEB when the second ahead car slammed on THEIR brakes before the car ahead even reacted. Absolutely wild. Completely gone in vision-only.

The width of your own vehicle is (pretty much) a constant, and trivial to know. Ford F150 is ~79.9 inches. Done. No sensors needed.

All the shit out there in the world is another story.

You misundestood the assignment.

Write a sonnet about Elon musk.

The company I used to work for was developing a self driving car with stereo depth on a wide baseline.

It's not all sunshine and roses to be honest - it was one of the weakest links in the perception system. The video had to run at way higher resolutions than it would otherwise and it was incredibly sensitive to calibration accuracy.

(Always worth noting, human depth perception is not just based on stereoscopic vision, but also with focal distance, which is why so many people get simulator sickness from stereoscopic 3d VR)

> Always worth noting, human depth perception is not just based on stereoscopic vision, but also with focal distance

Also subtle head and eye movements, which is something a lot of people like to ignore when discussing camera-based autonomy. Your eyes are always moving around which changes the perspective and gives a much better view of depth as we observe parallax effects. If you need a better view in a given direction you can turn or move your head. Fixed cameras mounted to a car's windshield can't do either of those things, so you need many more of them at higher resolutions to even come close to the amount of data the human eye can gather.

Easiest example I always give of this is pulling out of the alley behind my house: there is a large bush that occludes my view left to oncoming traffic, badly. I do what every human does:

1. Crane my neck forward, see if I can see around it.

2. Inch forward a bit more, keep craning my neck.

3. Recognize, no, I'm still occluded.

4. Count on the heuristic analysis of the light filtering through the bush and determine if the change in light is likely movement associated with an oncoming car.

My Tesla's perpendicular camera is... mounted behind my head on the B-pillar... fixed... and sure as hell can't read the tea leaves, so to speak, to determine if that slight shadow change increases the likelihood that a car is about to hit us.

I honestly don't trust it to pull out of the alley. I don't know how I can. I'd basically have to be nose-into-right-lane for it to be far enough ahead to see conclusively.

Waymo can beam the LIDAR above and around the bush, owing to its height and the distance it can receive from, and its camera coverage to the perpendicular is far better. Vision only misses so many weird edge cases, and I hate that Elon just keeps saying "well, humans have only TWO cameras and THEY drive fine every day! h'yuck!"

> owing to its height and the distance it can receive from,

And, importantly, the fender-mount LIDARs. It doesn't just have the one on the roof, it has one on each corner too.

I first took a Waymo as a curiosity on a recent SF trip, just a few blocks from my hotel east on Lombard to Hyde and over to the Buena Vista to try it out, and I was immediately impressed when we pulled up the hill to Larkin and it saw a pedestrian that was out of view behind a building from my perspective. Those real-time displays went a long way to allowing me to quickly trust that the vehicle's systems were aware of what's going on around it and the relevant traffic signals. Plenty of sensors plus a detailed map of a specific environment work well.

Compare that to my Ioniq5 which combines one camera with a radar and a few ultrasonic sensors and thinks a semi truck is a series of cars constantly merging in to each other. I trust it to hold a lane on the highway and not much else, which is basically what they sell it as being able to do. I haven't seen anything that would make me trust a Tesla any further than my own car and yet they sell it as if it is on the verge of being able to drive you anywhere you want on its own.

In fact there are even more depth perception clues. Maybe the most obvious is size (retinal versus assumed real world size). Further examples include motion parallax, linear perspective, occlusion, shadows, and light gradients

Here is a study on how these effects rank when it’s comes to (hand) reaching tasks in VR: https://pubmed.ncbi.nlm.nih.gov/29293512/

Actually the reason people experience vection in VR is not focal depth but the dissonance between what their eyes are telling them and what their inner ear and tactile senses are telling them.

It's possible they get headaches from the focal length issues but that's different.

I keep wondering about the focal depth problem. It feels potentially solvable, but I have no idea how. I keep wondering if it could be as simple as a Magic Eye Autostereogram sort of thing, but I don't think that's it.

There have been a few attempts at solving this, but I assume that for some optical reason actual lenses need to be adjusted and it can't just be a change in the image? Meta had "Varifocal HMDs" being shown off for a bit, which I think literally moved the screen back and forth. There were a couple of "Multifocal" attempts with multiple stacked displays, but that seemed crazy. Computer Generated Holography sounded very promising, but I don't know if a good one has ever been built. A startup called Creal claimed to be able to use "digital light fields", which basically project stuff right onto the retina, which sounds kinda hogwashy to me but maybe it works?

My understanding is that contextual clues are a big part of it too. We see a the pitcher wind up and throw a baseball as us more than we stereoscopically track its progress from the mound to the plate.

More subtly, a lot of depth information comes from how big we expect things to be, since everyday life is full of things we intuitively know the sizes of, frames of reference in the form of people, vehicles, furniture, etc . This is why the forced perspective of theme park castles is so effective— our brains want to see those upper windows as full sized, so we see the thing as 2-3x bigger than it actually is. And in the other direction, a lot of buildings in Las Vegas are further away than they look because hotels like the Bellagio have large black boxes on them that group a 2x2 block of the actual room windows.

> Humans do this, just in the sense of depth perception with both eyes.

Humans do this with vibes and instincts, not just depth perception. When I can't see the lines on the road because there's too much slow, I can still interpret where they would be based on my familiarity with the roads and my implicit knowledge of how roads work, e.g. We do similar things for heavy rain or fog, although, sometimes those situations truly necessitate pulling over or slowing down and turning on your 4s - lidar might genuinely given an advantage there.

That’s the purpose of the neural networks

Yes and no - vibes and instincts isn't just thought, it's real senses. Humans have a lot of senses; dozens of them. Including balance, pain, sense of passage of time, and body orientation. Not all of these senses are represented in autonomous vehicles, and it's not really clear how the brain mashes together all these senses to make decisions.

[deleted]

Another way humans perceive depth is by moving our heads and perceiving parallax.

How expensive is their lidar system?

Hesai has driven the cost into the $200 to 400 range now. That said I don't know what they cost for the ones needed for driving. Either way we've gone from thousands or tens of thousands into the hundreds dollar range now.

Looking at prices, I think you are wrong and automotive Lidar is still in the 4 to 5 figure range. HESAI might ship Lidar units that cheap, but automotive grade still seems quite expensive: https://www.cratustech.com/shop/lidar/

Those are single unit prices. The AT128 for instance, which is listed at $6250 there and widely used by several Chinese car companies was around $900 per unit in high volume and over time they lowered that to around $400.

The next generation of that, the ATX, is the one they have said would be half that cost. According to regulator filings in China BYD will be using this on entry level $10k cars.

Hesai got the price down for their new generation by several optimizations. They are using their own designs for lasers, receivers, and driver chips which reduced component counts and material costs. They have stepped up production to 1.5 million units a year giving them mass production efficiencies.

That model only has a 120 degree field of view so you'd need 3-4 of them per car (plus others for blind spots, they sell units for that too). That puts the total system cost in the low thousands, not the 200 to 400 stated by GP. I'm not saying it hasn't gotten cheaper or won't keep getting cheaper, it just doesn't seem that cheap yet.

[dead]

Waymo does their LiDAR in-house, so unfortunately we don’t know the specs or the cost

We know Waymo reduced their LiDAR price from $75,000 to ~$7500 back in 2017 when they started designing them in-house: https://arstechnica.com/cars/2017/01/googles-waymo-invests-i...

That was 2 generations of hardware ago (4th gen Chrysler Pacificas). They are about to introduce 6th gen hardware. It's a safe bet that it's much cheaper now, given how mass produced LiDARs cost ~$200.

Otto and Uber and the CEO of https://pronto.ai do though (tongue-in-cheek)

> Then, in December 2016, Waymo received evidence suggesting that Otto and Uber were actually using Waymo’s trade secrets and patented LiDAR designs. On December 13, Waymo received an email from one of its LiDAR-component vendors. The email, which a Waymo employee was copied on, was titled OTTO FILES and its recipients included an email alias indicating that the thread was a discussion among members of the vendor’s “Uber” team. Attached to the email was a machine drawing of what purported to be an Otto circuit board (the “Replicated Board”) that bore a striking resemblance to – and shared several unique characteristics with – Waymo’s highly confidential current-generation LiDAR circuit board, the design of which had been downloaded by Mr. Levandowski before his resignation.

The presiding judge, Alsup, said, "this is the biggest trade secret crime I have ever seen. This was not small. This was massive in scale."

(Pronto connection: Levandowski got pardoned by Trump and is CEO of Pronto autonomous vehicles.)

https://arstechnica.com/tech-policy/2017/02/waymo-googles-se...

Less than the lives it saves.

Cheaper every year.

Exactly.

Tesla told us their strategy was vertical integration and scale to drive down all input costs in manufacturing these vehicles...

...oh, except lidar, that's going to be expensive forever, for some reason?