Nilay Patel has been talking about "Google Zero" - the moment when Google effectively stops sending any traffic to other sites - for a few years now: https://www.theverge.com/24167865/google-zero-search-crash-h...

Which as some running a website raises a fascinating question. If Google is just going to crawl my sites and present information as an AI summary on their site, then what exactly do I gain by allowing Googlebot to crawl my sites?

A couple of years back I worked with a company which maintained specific data which was the main traffic driver on that page. Google approached them and wanted to pay for the rights to get the data and display it on top of the search results, a feature which was fairly new back then.

This was an interesting dilemma because it was very clear that the money was way less than the loss in ad revenue due to the traffic drop, but it was also clear that if we didn’t take the deal, a more desperate competitor would, which would result in the same traffic loss but without the extra google money. So the company took the deal.

History repeats itself here, with the difference that instead of paying for the data, the AI crawlers simply take it for free.

A similar dilemma presents itself when blocking AI spiders.

You’re free to block them, but the websites cloning your content won’t. So either way they’ll get the content they’re after.

Worse, when/if the time comes that LLMs source their claims they’ll refer you to the websites that cloned your content.

This reminds me of Walmart's squeezing strategy with all the manufacturers. Do business with us at the price we say or go out of business.

But ultimately that strategy is good for the consumer right?

In this context, if Google is going to give me the recipe without having to scroll through the story, that seems like a win to me.

The ad-revenue driven Internet of web 2.0 is finally dead and I'm not sure I'm all that sad.

But google won’t give you the recipe. It’ll give you a pretty piece of text that resembles a recipe. You’ll only find out it’s not a recipe when it fails to produce a cake.

But then, the sites its training on are starting to do the same thing, so maybe it won’t matter. Just last night, I pulled up four sites with “gluten free almond cake” recipes. One specified less than 1/4 the flour it would have needed, and another didn’t have any butter in the ingredients list. I had to eyeball the median and tweak from experience to actually get a bakable cake out.

You can buy recipe books at brick and mortar stores, and they're mostly not AI slop yet.

Without some way to generate revenue, people aren't going to publish recipes (for Google to scrape into their AI.) Maybe we could live without more recipes being fed into the machine, but there are many other types of content that will suffer the same fate.

It would be nice to find something better than an ad-revenue driven web, but I'm not sure this is it. We'll find out I guess...

> people aren't going to publish recipes

Sure they are. I can attest that musicians will gladly publish their music even when no recompense is offered. Surely culinary artists are the same.

Some people will. The vast majority won’t. You are not playing for an audience, only for crawlers, so what is the point?

We just won't get countless recipe websites where you have to scroll, scroll, scroll through slop about someone's day to read a scraped recipe that every other website has.

This is just disruption.

Do the current Google search results indicate that they will be any different?

It’s also not disruption if your product relies on the output of the industry it kills. What will AI train on when it destroys the economics of sharing information with others?

> But ultimately that strategy is good for the consumer right?

No because it's killing competition and becoming an even more obvious monopoly. Then at any occasion they have to choose between consumers and profits, they'll do what shareholders want and increase profits.

You can find many ad free recipes in the cookbooks at your local library. They're likely of higher quality as well.

> But ultimately that strategy is good for the consumer right?

No. Temporarily it’s good for the consumer. Ultimately it is bad for the consumer, because as prices drop, so too does quality.

It's not uncommon for free things to be higher quality than cheap things, especially when we're not talking about physical goods. Think hobbyist vs hack. Selective pro bono vs quantity over quality. The former describes old internet while the latter describes much ad-supported internet. I'm not saying cheap is better than expensive, and I'm not saying everything works this way, but I do think many things do, especially for pure information that doesn't have a major capital cost associated.

No, because now Google controls entirely what you see. They could decide to show you the recipe after all.

Also, at some point even the ad-laden websites will die, and then Google's sources will be extinguished.

Yep, this is exactly why some companies simply don't work with Walmart.

it's because both google and walmart have too much market power

This is tough for the manufacturer, but great for the consumer.

I think it's a good tradeoff.

Real-world Prisoner's Dilemma.

It always comes back to game theory haha

That doesn't feel like a repetition at all? You said that the first time there was "traffic loss but without the extra google money", but that this time there's no extra google money either way.

The part where data providers lose traffic because their own data is displayed directly on the google premises is what repeats.

"Nice data you got there, it'd be a shame if something were to happen to it"

The fact is that the internet is already the tech giants' own realm: their power is way beyond public imagination and affects all of us in real life on a daily basis, but there are still people thinking they are not the "evil one" here.

It's a catch-22. Without google crawling your site, you don't get any new traffic. But with google crawling your site, you also might not get any traffic.

AI summarization has already caused issues for sites like rtings where people are no longer visiting the site but still making use of the data presented there. Leading to rtings not getting enough traffic to continue to post their data.

It is an existential crisis for websites and when they go away it'll be an existential crisis for AI.

Step 1, Google serves info directly and consumers rejoice

Step 2, Google extinguishes the web and nobody has a reason to publish content, consumers lament but are trapped, Google has created a platform to serve content instead of links

Step 3 (or maybe 2a), Google is now monetizing their content machine

Step 4, Google offers people a way to contribute to the content machine, make some $$ per N views, whatever. People create content within the ecosystem

Step 5, Google is now the internet, more content is created overall, quality is lower overall perhaps, algorithmic echo chambers flourish even more than today, old heads on HN lament, everyone else just goes on living

> Without google crawling your site, you don't get any new traffic. But with google crawling your site, you also might not get any traffic.

I may be strange and unusual, but I just have never cared about my Google ranking. I know this makes me out of the ordinary among site owners but I have been humming along fine.

This certainly will disrupt traffic but for some of my sites I honestly think this is a good thing. I want you to want to be there, not just stumble upon my site because you happen to hit the right search keyword. Plus if it gets bad, this does create a new opportunity for others with cross linking and search.

Only issue is what happens when the company that owns the search and has a dominant share of the browser market flags your site with the good old "warning: potential risk ahead" when people try to reach it directly? And buries the "I know the risk let me through" deep in the browser settings. Advocate for different browsers? Google is pushing web attestation in one form or the other. I wish the future would look bleak, because right now it's looking blue, red, yellow and green and it's worse.

> Only issue is what happens when the company that owns the search and has a dominant share of the browser market flags your site with the good old "warning: potential risk ahead" when people try to reach it directly?

My target market is more technical than that so likely, nothing would change for me. Again, I recognize the impact of Google's dominance for some, but if the "attestation" isn't helpful and only hinders using services that people have come to rely on, there will be push back.

I also have been advocating for years for everyone in my circle to avoid using Chrome. A homogenized browser market is a risk, and Chrome is the new IE. I hope you are also a part of the effort to advocate for browser diversity.

You could lose your domain name: https://news.ycombinator.com/item?id=47151233

> I have been humming along fine.

Do you depend on site visitors for making a living? That's what this is about.

Yes! However most of my users were established through my network, not search.

I know that sites relying on ad income will and are being hurt tremendously by this effort on Google's part. However, if you are in the startup space and make money on services you offer, search should be one of several strategies you are deploying for user growth.

This is like saying people should just win the lottery or something. Your conditions are extremely niche

> AI summarization has already caused issues for sites like rtings

Isn't Stack Exchange the emblematic case?

Stack Exchange committed suicide by closing all the questions. It was already in a steep decline before LLMs, after it got bought by private equity and did things like firing the moderators.

> Leading to rtings not getting enough traffic to continue to post their data.

And here I thought denying ad revenue to websites was the morally superior way to navigate the web...

That's some catch, that catch-22.

It's the best there is.

I see everything twice!

> Without google crawling your site, you don't get any new traffic

What about the stories of marketing managers who learned months after the fact that their credit card had expired and their google ad spend had ceased with no effect on traffic? Google isn't always an effective promotional vehicle.

What stories are they?

Sounds like a pretty ineffective manager: wasn’t buying the correct ad placement in the first place, used a personal card to sign up for an ostensibly corporate service, didn’t keep track of expiration dates for the card, and was also ignoring email notifications from Google about the expired card. Let me know if I’m missing any other reasons why this manager should be fired instantly.

Most large corporations have company credit cards. The user is likely referring to his card being the company card.

Nah. Pretty sure he was using a personal card - corporate finance dept would likely better track where each card is being used and their expiration dates to avoid this happening. Also this better tracks with the rest of his sloppy behavior.

Well in addition to what you wrote, the marketing manager ALSO wasn't tracking any ad-related marketing performance indicator (CTR, CR, etc.) in any measurable way for very long periods of time... or they would have caught it almost immediately ("wow ad spend, CTR and CR have all suddenly gone down to 0/0% and have been staying there for days on all our campaigns! What's up with that?").

Internet is more and more becoming a commercialization platform. If you are selling something on your website, you still want Google (or ChatGPT for that matter) to expose customers to your product. The gate is that the actual delivery of the product is behind a purchase/signup. Google and others want to control the entire customer journey, to the point that your website is simply a way to pass metadata to them. They are actually achieving this!

this kills the entire internet vibe of the 90s, early 2k

> is more and more becoming a commercialization platform

FTFY: "couple of decades since has become". The vibes of the passion-driven 1990s started to be overwhelmed by the din of money right when the Internet became a major commerce venue, some time in the early 2000s.

You're allowed to exist on the web. The alternative is you are pushed out, your site is not indexed and google / chrome labels it as a security risk when people are trying to reach it directly. The mandate is clear: give up the data or give up the spot.

Sites pay good money to appear on top search results. Looks like the future is going to be sponsored AI sources. It's going to be even more difficult to figure out if google is presenting you with actual information instead of just an ad

If your site is all about disseminating information (like Wikipedia), then Google would provide a free mirror of sorts.

If your site is about your product, Google won't be able to serve the sign-up page from AI; the traffic would come your way. Same for a site that sells something: the traffic you're interested in would arrive at your checkout page.

Paid-content sites and ad-supported sites are screwed though, on top of their being screwed by archive.is and ad blockers.

The really confusing part about the ad-supported sites is that most of them are supported by Google's ad products. So Google is eating their own lunch here.

Search Engine Result Page (SERP) ads shown on Google itself are far more valuable than display ads that get shown on random websites. Google has been slashing payouts to those sites for over a decade. More recently, they've been slashing search impressions to those sites as well. With search engine ads + YouTube ads + Play Store ads, they can probably cut out the third-party site ads business altogether and not miss a beat.

I write things on the internet because I want to share ideas. If someone reads my post and tells a friend, that's great. If an AI crawls my posts and passes along the ideas that's great too.

(It doesn't work for ad-funded writing, but while I have substantial sympathy there this has historically been an unpopular argument on HN)

Sure, but this means that you’re no longer eligible to make a living from your ideas, which can be fine by you, but it eliminates an entire class of people who used to make a living from intellectual work.

This also could have been fine; it could bring back authenticity. However, for this to happen no one should be making money from it. Instead, only megacorps make money and they can just ignore your ideas and generate theirs. They control the distribution and the supply now.

Not making a living from ads specifically, sure, but many have things like Substack which actually directly incentivizes them to make good content rather than serving ads.

Setting aside ad-driven revenue - the ideas, when spat out by an llm, are disconnected from the author. If people like your ideas, they aren’t becoming fans/followers/long-term-readers. That means good luck leveraging some interesting writing into a book, a speaking tour, a podcast, or even any kind of consistent readership. The llm slurps up your content and monetizes it while you get nothing.

I'm not interested in a book, speaking tour, or podcast. I've never had consistent readership because I write about too many unrelated things. I blog because I have ideas I want to share; I don't feel at all ripped off.

And do you think that’s how everyone should feel? If not, how is it relevant to people not liking what Google is doing?

Fair enough, sounds like you won’t be impacted. But the vast majority of people i read online are able to write the content i enjoy because there are paths to earn a living off it. I expect the future of llm search will leave only hobbyists and slop producers standing.

That's Google making way for its disruptor. We'll see who that is. Imagine a search engine that just presents search results. Groundbreaking.

More likely you're going to get a search engine which returns results as short 5 second AI generated video clips with an infinite scroll.

(Torment Nexus rules apply here)

What you gain? Nothing, but they and other AI companies have decided not to respect your robots.txt

There are other ways to block robots from crawling our sites. I have a robots.txt but place no faith in it, it’s just there because it’s cheap and does stop some of the crawlers.
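For reference, here is a minimal sketch of what a robots.txt opt-out looks like and how a crawler that chooses to honor it would read it, using Python's standard `urllib.robotparser`. The rules, the `example.com` URL, and the `allowed` helper are made up for illustration; `GPTBot` and `Google-Extended` are used here as examples of AI-related user-agent tokens. It only models what a compliant crawler would do, which is exactly why it shouldn't be the only line of defense.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: opt out of two AI-related crawler tokens,
# leave everything else (including regular search crawling) allowed.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a crawler that honors robots.txt may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/recipes/almond-cake"  # hypothetical URL
    for agent in ("Googlebot", "GPTBot", "Google-Extended"):
        print(agent, "may fetch" if allowed(agent, page) else "is asked to stay out")
```

Nothing in this file is enforced by the server; a crawler that ignores robots.txt never even runs logic like this.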

The expected purpose of websites is to spread information, so whether users get it by making a request to your website or to Google is irrelevant. In fact, if they get it from Google it's better because it reduces website load.

If instead the purpose of your website is to manipulate users for financial gain (for instance by showing media attempting to manipulate their purchasing decisions, after receiving a bribe from a vendor), and the information is just a way to lure users, then maybe this malicious business model will finally be no longer possible.

Free speculation: I could see a future where Google populates a footer on results with the website logos of the sources. Presumably, the new web will require users to memorize websites/brands and go directly to those sites if they see a lot of their results are being provided by one source.

Websites may go back to being simply labors of love.

> Websites may go back to being simply labors of love.

The situation may be even worse. Back in the labor of love era, at least webmasters could get feedback from readers. In the LLM era, readers may not even know that the site exists. Without feedback/community, the overall quality of those sites will decrease over time.

>I could see a future where Google populates a footer on results with the website logos of the sources.

ChatGPT/Claude does this today. I barely click or care for the source when they already gave me the info I wanted.

My speculation is all information worth anything is going to be behind some kind of wall.

> ChatGPT/Claude does this today. I barely click or care for the source when they already have me the info I wanted.

Maybe I'm just #builtdifferent, but I click these a lot. Especially if I'm trying to research or make a decision on something, I want the actual source and not the potentially-fudged summary.

I click those all the time if it is something that matters and I wanna verify that the AI got it correctly.

Seems like a great way to end up knowing no real information and with no ability to analyse literature or think for yourself?

Not to mention the hallucinations

Google's AI summaries already do this. I occasionally click through to see the underlying source the AI summary leaned on to generate the response, but probably only ~20% of the time.

It seems like they should have a model similar to YouTube. If I watch a video on YouTube made by someone, they get a little cash, and it adds up.

Similarly, if Gemini uses a website for an answer, it should pay something to those sites for the information it gathered. Sites would need to sign up to earn via Google, and I'd imagine there would be a certain threshold to cross to make it worth cutting checks... but that would make all these AI search tools feel much less scummy while providing site owners an incentive to keep sharing information on the internet.

Where a model like this would get messy is with sites like reddit. It's a very popular source for AI search, but the value comes from the users, not the platform itself.

Actually it cannot work this way: content creators make far more money from ads in the video itself than from what YT gives them. If it were for YT money alone, we would basically still be in 2010 YT: folks doing it just for fun.

The problem with all this AI/llm stuff is that end users don't even know your tiny site with a lot of useful information exists at all.

> The problem with all this AI/llm stuff is that end users don't even know your tiny site with a lot of useful information exists at all.

This depends on implementation. I primarily use Kagi for any LLM stuff. It cites pretty much everything and links out to the source. I regularly use this for search. The normal search results may not have what I need, but a line in the AI results sounds better and I click through to the source to get more context.

I find clicking through to the source is important, as I've often seen the AI get it wrong. The page has what I need on it, but the AI grabbed the wrong thing and got it backward. I'm probably in the minority, I'm guessing most people don't use LLMs like this.

I rely far more on bookmarks and memorised URLs now.

well it's already happening and people are fighting over traffic crumbs already, they call it GEO

Maybe you want your ideas to spread? If your site's purpose is getting ad impressions then yea no point. But if your purpose is to spread ideas then it is still useful.

> allowing Googlebot to crawl my sites

As far as I know, you don't have a choice. They have no obligation to respect your wishes, and LLMs are legally allowed to scrape & republish your content.

> They have no obligation to respect your wishes

I have no obligation to not send all scraper-looking traffic to a black hole full of zip bombs.

There's always poison fountain - deliberately wrong source code.

You do have an obligation because what you are describing is illegal, at least in the US under the CFAA.

Okay, nix the zip bombs. What's my obligation to treat bot-shaped traffic as something I should reply to?

Spreading malware to your website's visitors is wild and illegal in most jurisdictions. I certainly wouldn't confess about it online.

Malware? It's just a large file. A very, very large file.

But fine. How about I just...don't respond to those requests at all. I have no obligation to send them data period.
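A rough sketch of the "just don't serve them" option, written as a tiny WSGI middleware using only the Python standard library. Everything named here is hypothetical: the `BLOCKED_AGENT_TOKENS` list, the `refuse_bots` wrapper, and the demo app. It refuses with an empty 403 rather than truly dropping the connection, and it matches on the User-Agent header, which any scraper can trivially spoof, so treat it as an illustration of the idea and not as a working defense.

```python
from wsgiref.util import setup_testing_defaults

# Hypothetical list of User-Agent substrings to refuse (lowercase).
BLOCKED_AGENT_TOKENS = ("gptbot", "examplebot")


def refuse_bots(app, blocked_tokens=BLOCKED_AGENT_TOKENS):
    """Wrap a WSGI app so matching User-Agents get an empty 403."""
    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(token in agent for token in blocked_tokens):
            start_response("403 Forbidden", [("Content-Length", "0")])
            return [b""]
        return app(environ, start_response)
    return middleware


def hello_app(environ, start_response):
    """Stand-in for the real site."""
    body = b"hello, human\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def status_for(agent: str) -> str:
    """Run one fake request through the wrapped app and return its status line."""
    environ = {}
    setup_testing_defaults(environ)
    environ["HTTP_USER_AGENT"] = agent
    seen = []

    def start_response(status, headers, exc_info=None):
        seen.append(status)

    list(refuse_bots(hello_app)(environ, start_response))
    return seen[0]


if __name__ == "__main__":
    print(status_for("Mozilla/5.0 (compatible; GPTBot/1.0)"))
    print(status_for("Mozilla/5.0 (X11; Linux x86_64) Firefox/128.0"))
```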

Is AI a visitor or malware? It certainly steals paid resources (bandwidth).

Disclaimer: his website is for hosting malware for "testing" purposes. Testing how well AI can't deal with it.

except google does respect robots.txt so you do have a choice?

still respects robots.txt

Vastly less but still more traffic than if you didn’t participate. I’m sure they will calibrate it just so.

Websites tend to be updated and considered to be the source as well.

(You misspelled someone as some)

Google has always crawled your site and been an arse! Now you get to decide whether they are hallucinating!

You can drop pointers on Masto and other socials to your sites - that has not changed.

Do we need something else? ie you drop a link to somewhere else.

Can you actually prevent Google from crawling your site?

> then what exactly do I gain by allowing Googlebot to crawl my sites?

Mention

It's worse than that. They train their models preferentially on what they consider to be high-quality data. But if you look at the usual "references" on search queries, they're often just a post-hoc BS justification that links to spam blogs or Tiktok videos.

> what exactly do I gain by allowing Googlebot to crawl my sites?

Site traffic

Allow? Deep down, do you think you have a choice?

Mechanisms might exist to make you think you have one, the same way copyright should prevent millions of books being gobbled up by TheZuck, but ultimately do you really have a choice?

Rules and laws don't exist for you.

Yes, Google advertises its crawler IP ranges and it is quite easy to keep track of this and block them. But only if you control the infrastructure that your site runs on of course.
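As a sketch of what "keeping track of the ranges and blocking them" could look like, the snippet below checks a client IP against a list of CIDR prefixes using Python's `ipaddress` module. The JSON shape (a `prefixes` list with `ipv4Prefix` / `ipv6Prefix` keys) is an assumption about how such a published list might be structured, and the ranges shown are reserved documentation ranges, not real crawler addresses. In practice you would fetch the current list periodically and feed the result into your firewall or reverse proxy.

```python
import ipaddress
import json

# Placeholder data: documentation-only ranges, NOT real crawler ranges.
# The JSON layout is an assumed format for a published prefix list.
SAMPLE_RANGES_JSON = """
{
  "prefixes": [
    {"ipv4Prefix": "192.0.2.0/24"},
    {"ipv6Prefix": "2001:db8::/32"}
  ]
}
"""


def load_networks(ranges_json: str):
    """Parse the prefix list into ip_network objects."""
    data = json.loads(ranges_json)
    networks = []
    for entry in data.get("prefixes", []):
        cidr = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if cidr:
            networks.append(ipaddress.ip_network(cidr))
    return networks


def is_listed_crawler(client_ip: str, networks) -> bool:
    """Return True if client_ip falls inside any listed prefix."""
    addr = ipaddress.ip_address(client_ip)
    # Compare only against networks of the same IP version.
    return any(addr.version == net.version and addr in net for net in networks)


if __name__ == "__main__":
    nets = load_networks(SAMPLE_RANGES_JSON)
    for ip in ("192.0.2.55", "198.51.100.7", "2001:db8::1"):
        print(ip, "blocked" if is_listed_crawler(ip, nets) else "served")
```

As the comment says, this only helps if you control the layer where requests arrive; on shared hosting you may not get to make this decision at all.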

stego your site, google sees the red herring version, intended users see the payload.

this has been done before, quite often, but toward ends morally askew.

It has reduced traffic to my website by around 65%. I live from that website. My income is a function of the traffic it gets.

I spent 9 years of my life putting hard-earned information on the internet, and now big tech uses it to enrich themselves while putting me out of work. Even my backup plan - software development - is being devalued to hell. It's so damn depressing. We'll get the internet that we deserve.

To be fair, the traffic you had was mostly routed by them as well. Google giveth, Google taketh away.

Callous take. You make it seem like only Google was giving here. If Google was routing users to OP's site, surely OP had something beneficial to give.

No. OP made the content Google copied with AI to their landing page, which reduced the traffic.

But I forgot that they convinced you to accept AI as a magic "make copyright disappear"-technology.

Take it one step further, the ultimate endgame is everyone consuming things through their own LLM assistant

In the future I don't even use Google but my bot does

I think the idea is that Google is the bot

Yeah but google will inject ads, my future LLM bot will filter them out

Why would I want to be restricted to google’s “view” of the info

Over the past year to 1.5 years, in the sites I run, I have seen a drop in traffic from Google, which leveled off, and is now slightly rising.

I think if you look through this thread you’ll see a lot of skepticism of the AI results, and I think that is a fairly broadly held opinion. The obvious way to check the AI answer is to click through to some sources.

I think for Google to stop sending me traffic, it would have to be essentially perfect at AI answers. It will never get there, especially as so many searches are opinion-based like “what is the best mobile phone right now.”

Doesn’t Google’s ads flywheel depend on them sending traffic to websites?

The web as we know it is over.

Websites will die on the vine if LLMs intermediate all the content.

The "website" of the future will be an API optimized for LLM crawlers, serving plain-text content that no end-user will ever view directly. The SEO game will change to LLMAO.

Alternatively, we can collectively "fight back" by not using Google and teaching others around us to do so as well. There are plenty of decent [1] and great (better) alternatives, where you're not the product [2]!

[1]: https://alternativeto.net/software/google-search/?license=fr...

[2]: https://alternativeto.net/software/google-search/?license=co...

I can recommend https://noai.duckduckgo.com. It works pretty well.

Fighting back is stupid, these things are inevitable, and honestly probably for the best over the long-term.

Nothing is inevitable, and long-term effects of google's move are unknowable.

There is actually another way that was just hinted at a few days ago demonstrated by the EU courts reaffirming a law from 2019 against Meta, just force google et al to compensate publishers:

https://www.epceurope.eu/post/epc-welcomes-landmark-cjeu-rul...

That translates to "Force Google to give money to these specific organizations and newspapers which EU leaders wish to benefit". It won't help any individuals who have made great websites with important and popular information.

Well that's just an example. You could mandate an API to allow bots to pay their fair share; could ban Google from using content it stole without compensation; could shatter Google into a thousand pieces. There are music rights organizations with small time artists to huge celebrities, they are strong organizations that collect revenue from big platforms and redistribute it to all artists as well. Lots and lots of ways to address this problem.

My appeal is just to realize that our implicit assumption that we can't do anything ever at all besides appealing to completely ineffectual individual action is in and of itself a strongly ideological and politically radical position to take.

> Websites will die on the vine if LLMs intermediate all the content.

The current zeitgeist of them will, but I think not all.

My first website (GeoCities) was either before Google existed or very close to it. Connected to people via WebRings and directory listings. More recently, RSS feeds.

Yeah there will likely continue to be a small underground of old-style websites I guess. But you'll have to be in the loop on how to find them, and very few people will pay to advertise on them.

> very few people will pay to advertize on them

That sounds like an unalloyed plus. The perverse incentives caused by advertising have been the biggest driver of the web's decline, IMO.

That sounds like absolute hell

Here is what I think the future web may look like:

   1) Sites will have MCP / APIs for LLMs, so that when I ask my AI agent du jour, it can call any of the sites where I have subscriptions for information.
   2) Sites that are passion projects will be harvested by our LLM overlords.
   3) Sites that people don't type into their web browser and need ad revenue will die.
   4) SEO will finally die.

> SEO will finally die.

On the contrary, it will flourish. It’s just that it’ll shift to whatever can trick LLMs into recommending your product.

https://www.anthropic.com/research/small-samples-poison

https://www.bbc.com/future/article/20260218-i-hacked-chatgpt...

Or more likely move towards Substack or newsletters where the pitch is - Don’t let the LLM choose the output for you, go directly to our Substack/newsletter instead.

This will happen especially with things like conspiracy theories because the choice might be to pollute the output or share the general consensus. Like searches for Apollo landing conspiracy theories: the LLM can either choose to present “alternate facts” so that people can “do their own research” and conclude it is fake, or autocorrect to “Apollo landing happened”.

Newsletters have a webview fallback with a public URL that makes them just as susceptible to scraping. If that ever gets fixed, Google will just scrape the full-text content in Gmail instead.

Newsletters have been around forever and never taken off like the open web and free blogging have. Slapping a Stripe integration on the backend hasn't led to Substack becoming a sustainable business not propped up by VC cash.

The truth is out there!

This was the promise of Bing that never materialized.

They will not do that because their cash cow ad revenue depends on it

He also spins a lot of trash talk about an industry he's never personally worked in as any kind of engineer at all. He's a "Journalist Covering Tech" without a degree in journalism, so he's not even a "Tech Journalist"; might as well be the blogger character from Silicon Valley.

> Bachelor of Arts degree in political science from the University of Chicago.

His hot takes are best ignored; it's just convenient clickbait for their entire negativity angle.

Brendan Carr is that you?