I used to joke that I worked for the last DotCom startup, a company that got a funding round after the shit hit the fan.
They were working on an idea that looked a bit like an RSS feed for an entire website, where you would run your own spider and then our search engine could hit an endpoint to get a delta instead of having to scan your entire site.
If they’d made the protocol open instead of proprietary, we maybe could have gotten spiders to play nicer since each spider after the first would be cheaper, and eventually maybe someone could build pub sub hooks into common web frameworks to potentially skip the scan entirely for read-mostly websites, generating delta data when your data changed.
But of course when the next round of funding came due nobody was buying.
I thought about this a lot on my last project, where spiders were our customers’ biggest users. One of those apps where customer interactions were intense but brief and the rank in Google mattered equally with all other concerns. Nobody had architected for the actual read/write workflow of the system of course, and that company sold to a competitor after I left. Who migrated all customers to their solution and EOLed ours for being too fat in a down economy.