Hacker News

Cloudflare wants to shake down the Big AI™ shops.

I don't even care anymore: AI stealing the life out of everything, or Cloudflare trying to become some global internet gatekeeper. Let them kill each other.

fantasizr 3 hours ago

when the law won't protect you, it creates an opportunity for a mafia-like protection racket

hedora 3 hours ago

You realize humans are going to be the first wave of collateral damage, right? I already basically cannot browse the internet for technical information, since most high-quality forums are behind captchas that block my iPhone.

If I ask an agent to do it, it does better at finding the small percentage of sources not hosted by Cloudflare. However, it generally cannot hit open-access / public domain sources (like the current legal code, or academic papers) because those are blocked and it respects stuff like robots.txt.

riffraff an hour ago

I play Dungeon Crawl Stone Soup (think NetHack, but with web tiles), and most of the servers are struggling because of AI crawlers downloading the morgues.

Real users are already suffering.

If (big if) the AI labs can be made to pay for the abuse, actual users win.

axus 3 hours ago

Would you be willing to let Cloudflare "know its customer" (you) and pay 3 cents to access the forum, instead of filling in the captcha?

gilfaethwy 3 hours ago

Can't speak for GP, but I wouldn't - privacy is already eroding at a startling rate, and more KYC for things that really don't need it is just a further affront to human rights. (See also the FCC's recent request for comments on requiring government-issued ID to use a cell phone.)

carlosjobim 2 hours ago

Are your human rights also violated by Spotify keeping track of what songs you listen to, or Netflix and YouTube keeping tabs on what shows you are watching?

Non-ad monetization on the internet will also take the form of massive syndication, where a subscriber gets access to thousands of high-quality websites, and web publishers get access to millions of subscribers. But for this to work, they need to take a hint from streaming services and build truly massive syndicates that include everything for everyone.

hedora an hour ago

Yes. In the past, in the US, library checkout records were private / not recorded, specifically to protect the right to privacy, which is specifically protected by the UN human rights charter.

The systems you described not only record that information and make it available for warrants, they also sell it, and allow warrantless searches of it in some circumstances.

colinsane 32 minutes ago

i installed the Playwright MCP to let my agent access walled sites (specifically eBay and WSJ). i noticed that 90% of the time it was bounced from a site, it just reached out to a different site that wasn't walled, and i think it's the right move: most information exists at multiple places on the web, so it's cheaper and _faster_ to just skip over walled sources.

for the forum example: many forums have a policy of only allowing logged-in users to access attachments. i can't remember the last time i registered at a new forum just to view an attachment: the effect has always been to drive me elsewhere. no complaints -- these solutions work if your goal is to reduce load. i'm skeptical that they can drive monetization outside of a very few niches.

ryan_n 3 hours ago

I thought the goal was to only charge agents a fee, which would either 1. stop agents from scraping your site non-stop and eliminate the need for a captcha, making the human experience better or 2. make the owner of the site some money in exchange for a bajillion bots scraping their content.

Maybe that's too optimistic though based on the responses in this thread.

hedora an hour ago

If they only charge agents a fee, then people will just set up an MCP endpoint or whatever to desktop Chrome/Firefox.

As it is, their captchas are already blocking tons of human traffic.

The idea that the price will be low unless you access it a lot falls over due to caching. Big tech companies will cache whatever they scrape, paying for one copy. Regular people and smaller companies will not read the same thing enough to amortize the cost of the first fetch, so they'll pay thousands to millions of times more than the monopolies per use of a given piece of information.

If individuals set up a federated cache with open access, they'll get sued for copyright infringement (even though that would solve the supposed problem: that Cloudflare cannot afford to operate a cache).
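To put rough numbers on the amortization point, here is a back-of-the-envelope sketch. Every figure in it is an assumption for illustration: the 3-cent fee is just the number floated upthread, the read counts are invented, and `cost_per_use` is a hypothetical helper, not anything Cloudflare has published.

```python
# Illustrative arithmetic only: the fee and the read counts are assumptions,
# not real pricing from Cloudflare or any AI lab.

def cost_per_use(fetch_price: float, uses_per_fetch: int) -> float:
    """Amortized cost of one paid fetch spread over every read served from the cached copy."""
    if uses_per_fetch < 1:
        raise ValueError("uses_per_fetch must be at least 1")
    return fetch_price / uses_per_fetch


FETCH_PRICE = 0.03  # hypothetical per-fetch fee in dollars

# An individual pays for a fetch and reads the page once.
individual = cost_per_use(FETCH_PRICE, uses_per_fetch=1)

# A large operator pays for one fetch and answers a million queries from its cache.
big_cache = cost_per_use(FETCH_PRICE, uses_per_fetch=1_000_000)

ratio = individual / big_cache

print(f"individual: ${individual:.2f} per use")
print(f"big cache:  ${big_cache:.8f} per use")
print(f"individual pays about {ratio:,.0f}x more per use")
```

The ratio is simply the number of reads the cache serves per paid fetch, so the gap grows with the size of the operator's audience, not with the fee itself.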

The end result is that only closed agents will be allowed to (legally) read most content without paying extortion-level fees.

Also, like with YouTube and video, serving text will become a winner-takes-all proposition.