What is the actual use of this?

This research team used Google's first-party location data to identify San Jose's Interstate 880/US 101 interchange as a site with statistically extreme amounts of hard braking by Android Auto users.

But you don't need machine learning to know that: San Jose Mercury News readers voted that exact location the worst interchange in the entire Bay Area in a 2018 poll. [1]

It's not a lack of knowledge by Caltrans or Santa Clara County's congestion management agency that is keeping that interchange as-is. Rather, it's the physical constraints of a nearby airport (so no room for flyovers), a nearby river (so probably no tunneling), and surrounding private landowners and train tracks.

Leaving aside the specifics of the 880/101 interchange, the Google blog post suggests that they'll use this worst-case scenario on a limited access freeway to inform their future machine-learning analyses of other roads around the country, including ones where presumably there are also pedestrians and cyclists.

No doubt some state departments of transportation will line up to buy these new "insights" from Google (forgetting that they actually already buy similar products from TomTom, Inrix, StreetLight, et al.) [2]

While I genuinely see the value in data-informed decision making for transportation and urban planning, it's not a lack of data that's causing problems at this particular freeway interchange. This blog post is an underbaked advertisement.

[1] https://www.mercurynews.com/2018/04/13/101-880-ranks-as-bay-...

[2] https://www.tomtom.com/products/traffic-stats/
    https://inrix.com/products/ai-traffic/
    https://www.streetlightdata.com/traffic-planning/

> What is the actual use of this?

From the article:

"Our analysis of road segments in California and Virginia revealed that the number of segments with observed HBEs was 18 times greater than those with reported crashes. While crash data is notoriously sparse — requiring years to observe a single event on some local roads — HBEs provide a continuous stream of data, effectively filling the gaps in the safety map."

So we don't have to wait until an accident actually occurs before we can identify unsafe roads and improve them.
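
To make the quoted comparison concrete, here's a minimal sketch of how that coverage ratio could be computed. The segment IDs, counts, and column names below are invented for illustration, not taken from the article.

    # Sketch of the coverage comparison the article describes: how many road
    # segments have observed hard-braking events (HBEs) vs. how many have any
    # reported crash. All data below is made up.
    import pandas as pd

    hbe = pd.DataFrame({"segment_id": [1, 2, 3, 4, 5, 6],
                        "hbe_count":  [12, 3, 0, 7, 1, 9]})
    crashes = pd.DataFrame({"segment_id":  [1, 2, 3, 4, 5, 6],
                            "crash_count": [1, 0, 0, 0, 0, 0]})

    segments_with_hbe = set(hbe.loc[hbe["hbe_count"] > 0, "segment_id"])
    segments_with_crash = set(crashes.loc[crashes["crash_count"] > 0, "segment_id"])

    ratio = len(segments_with_hbe) / max(len(segments_with_crash), 1)
    print(f"{len(segments_with_hbe)} segments with HBEs vs "
          f"{len(segments_with_crash)} with reported crashes ({ratio:.0f}x)")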

Actual use: autonomous vehicles know to be "more careful" here, perhaps by doing "Jersey left" turns (a right turn plus a U-turn) or other risk-compensation strategies.

I'd love to see them incorporate visual detection of vehicle crash debris as well. There are two intersections in my area that consistently have crash debris: broken window glass, broken plastic parts, license plates. I know they are dangerous, but I don't know whether autonomous vehicles also know they are dangerous.

>No doubt some state departments of transportation will line up to buy these new "insights" from Google (forgetting that they actually already buy similar products from TomTom, Inrix, StreetLight, et al.) [2]

Google/Apple probably collect a massively larger amount of data than those other companies, putting those other companies at risk of losing future revenue.

Between Google and Apple, pretty much every car in the US is monitored.

Yeah, Google and Apple probably do have much more first-party probe data from passenger vehicles. But it really depends on the type of traffic data product. For some use cases, it's more than sufficient for the vendor to buy probe data from specific types of fleet vehicles (like work trucks).

Where Google/Apple's coverage is quite valuable is near-real-time speeds for atypical events -- say, yesterday's Super Bowl. But that's not what this blog post is about -- this post is about a well-established pattern that can be identified from historical datasets.

All that to say that vendors sell a wide variety of data products to transportation planners, but just because Google is now entering this niche market doesn't mean they'll be "the best" or even realize what their strengths are.

It does look much more like a revenue play. The data already exists, just not from the conglomerates and not in a uniform format.

Caltrans could lower the speed limit at that interchange and use traffic calming to actually get people to drive slower. Good traffic engineering can still make a difference even with the existing physical limitations.

Absolutely nothing in this research suggests machine learning. All it's saying is that hard braking events are associated with dangerous road segments that are already well known by other measures (in this case, reported crash rates).
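
For what it's worth, that kind of association check is ordinary statistics. Here's a minimal sketch with synthetic numbers (not Google's data): a latent per-segment risk drives both a dense proxy and a sparse label, and a plain rank correlation recovers the association.

    # Do per-segment hard-braking counts track per-segment reported-crash counts?
    # Synthetic data for illustration only.
    import numpy as np
    from scipy.stats import spearmanr

    rng = np.random.default_rng(0)
    n_segments = 200
    risk = rng.gamma(shape=2.0, scale=1.0, size=n_segments)  # latent "dangerousness"
    hbe_counts = rng.poisson(50 * risk)      # dense proxy signal
    crash_counts = rng.poisson(0.05 * risk)  # sparse reported crashes

    rho, p = spearmanr(hbe_counts, crash_counts)
    print(f"Spearman rho = {rho:.2f}, p = {p:.3g}")
    # A rank correlation like this is statistics, not machine learning.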

As for the interchange in question, they can always redesign how the merge works within the same footprint. There's no excuse not to.

Please see:

> It's not a lack of knowledge by Caltrans or Santa Clara County's congestion management agency that is keeping that interchange as-is. Rather, it's the physical constraints of a nearby airport (so no room for flyovers), a nearby river (so probably no tunneling), and surrounding private landowners and train tracks.

The most recent budget estimate is $1bn for any changes to this interchange.

Indeed, why would you even need this, or a poll? The crash statistics already exist. What's the purpose of a proxy predictor, unless the label is too low-signal to detect yet but may become a big issue later? The only such case is a newly opened road.
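
To put rough numbers on that one case: the proxy only buys you something when the label would take too long to accumulate. A back-of-the-envelope with assumed, illustrative rates:

    # Assumed, illustrative rates only: how long until enough events accumulate
    # to flag a segment?
    crash_rate_per_year = 0.2   # sparse label: ~one crash every 5 years on a quiet road
    hbe_rate_per_year = 500.0   # dense proxy: hard-braking events on the same segment
    events_needed = 10          # arbitrary threshold for "enough signal to act on"

    print(f"Crashes: ~{events_needed / crash_rate_per_year:.0f} years to see {events_needed} events")
    print(f"HBEs:    ~{365 * events_needed / hbe_rate_per_year:.0f} days to see {events_needed} events")

With rates like these, the crash record takes decades while the proxy saturates within days -- but on an interchange that's been notorious for years, that advantage is moot.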