So to keep those energy-hungry LLM companies from scraping your website, you force every browser to compute a lot of hashes in a necessarily energy-hungry loop, creating all kinds of accessibility problems along the way?
I don’t get how people believe there’s a PoW function that both:
1. Allows access in reasonable time/battery use to me on my phone
2. Poses any meaningful challenge to the most compute-resourced organizations on the planet
I wonder how many cumulative hours of human life have been wasted waiting on Anubis.
There are a lot of people writing really bad scrapers and running them on systems with far from high compute power. This is to prevent DoS caused by those. The big companies are often far more clever: they know they are traversing the whole internet and can come back later.
> I wonder how many cumulative hours of human life have been wasted waiting on writing comments on creamsicle reddit.
I disagree with a lot of the decisions around the design of Anubis... but resisting the industry's current drive to ruin as many of other people's good-faith resource donations as it can is an admirable objective.
The point isn't to increase the amount of work required to the point of exhaustion, it's to require that scripts be able to offer the exact same feature set that browsers offer. The point isn't to make it impossible, it's to make it more expensive than free.
Anubis isn't trying to prevent all scraping, it's trying to reduce the abuse just enough that real requests get their fair share. You don't need to outcompute the botnet, just slow it down a little.
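To make the "more expensive than free" idea concrete, here is a minimal sketch of a hash-based proof-of-work challenge. It is not Anubis's actual implementation: the function names, the challenge string format, and the "leading zero hex digits" difficulty rule are all assumptions chosen for illustration. The point it shows is the asymmetry: the client has to try many nonces, while the server checks the answer with a single hash.

```python
import hashlib
from itertools import count


def _digest(challenge: str, nonce: int) -> str:
    # Hypothetical format: hash the challenge string followed by the nonce.
    return hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()


def solve(challenge: str, difficulty: int) -> int:
    """Client side: return the first nonce whose digest starts with
    `difficulty` zero hex digits (assumed difficulty rule)."""
    prefix = "0" * difficulty
    for nonce in count():
        if _digest(challenge, nonce).startswith(prefix):
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash is enough to check the client's answer."""
    return _digest(challenge, nonce).startswith("0" * difficulty)


def expected_attempts(difficulty: int) -> int:
    """Average number of hashes a solver needs: each hex digit is zero
    with probability 1/16, so the expectation is 16 ** difficulty."""
    return 16 ** difficulty


if __name__ == "__main__":
    nonce = solve("example-challenge", 3)
    print(nonce, verify("example-challenge", nonce, 3), expected_attempts(3))
```

A single visitor pays that cost once per challenge, while a scraper that discards its cookies pays it on every request, which is where the "slow them down a little" effect would come from under these assumptions.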
I hate seeing the Anubis interstitial too, I've complained about it publicly already too. But it doesn't come close to the frustration of waiting 10s for an SPA to load all of the routes it'll never use before the first redraw. Clearly our industry has also decided latency is a good thing.
The vast majority of that compute is locked in AI accelerators that do the inference. That hardware is bad at doing anything else; in fact, crawlers would need more residential proxies rather than more compute in that regard.
> I wonder how many cumulative hours of human life have been wasted waiting on Anubis.
"How dare that mugging victim fight back".
The choice is not between Anubis and no Anubis, the choice is between Anubis and my website going offline because I can't afford the $400/month that AI scrapers would cost me (yes, I checked, and yes, that's the real figure) if Anubis wasn't in front.
That makes sense, and I believe you, I'm just surprised it really deters the scrapers.
If it's dumb and it works, is it really dumb?
No it's not dumb, but I don't get how it manages to be so light still. Like I visit an Anubis-guarded site and barely have to wait. Scrapers really see that little CPU usage or wall time and back off?
They have 2 options:
Guess which one HN front-page bloggers choose? I often comment and/or flag them, but they never learn. Anubis doesn't rely on spying on the user.
Not just LLM companies, but bots in general. They were a big problem even before LLMs.