I think it is too late. There is nonzero profit from people visiting your content, and close to zero cost to make it. It is the same problem with music; in fact, I search YouTube Music only with before:2022.
I recently wrote about the dead internet https://punkx.org/jackdoe/zero.txt out of frustration.
I used to fight against it. I thought we should do "proof of humanity", or create rings of trust for humans, but now I think the ship has sailed.
Today a colleague was sharing their screen on Google Docs and a big "USE GEMINI AI TO WRITE THE DOCUMENT" button was front and center. I am fairly certain that by the end of the year most words you read will be tokens.
I am working towards moving my pi-hole from blacklist to whitelist, and after that just using local indexes with some data hoarding (squid, Wikipedia, SO, RFCs, libc, kernel.git, etc.).
Maybe in the future we will just exchange local copies of our local "internet" via SD cards, like Cuba's sneakernet[1], El Paquete Semanal[2].
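For anyone wondering what the blacklist-to-whitelist switch means in practice: the idea is default-deny, where a name resolves only if it or one of its parent domains is explicitly listed. The sketch below is only an illustration of that rule, not how Pi-hole implements it; `is_allowed` and the entries in `ALLOWLIST` are made up for the example.

```python
def is_allowed(domain: str, allowlist: set[str]) -> bool:
    """Return True if the domain or any of its parent domains is allowlisted."""
    # Normalize case and drop a trailing root dot ("example.org.").
    labels = domain.lower().rstrip(".").split(".")
    # Check "a.b.example.org", then "b.example.org", then "example.org", ...
    return any(".".join(labels[i:]) in allowlist for i in range(len(labels)))


# Hypothetical entries; a real list would be whatever you actually trust.
ALLOWLIST = {"wikipedia.org", "kernel.org", "rfc-editor.org"}

print(is_allowed("en.wikipedia.org", ALLOWLIST))  # subdomain of a listed domain
print(is_allowed("ads.example.com", ALLOWLIST))   # not listed, so denied
```

Matching on parent domains keeps the list short, at the cost of trusting every subdomain of a listed name.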
> thought we should do "proof of humanity"
I thought about this in another context and then I realized: what system is going to declare whether you're human or not? AI, of course.
Uhh, that's a lot of links: https://download.kiwix.org/zim/wikipedia/
Where are the explanations of what all of them mean? What is (nothing) vs `maxi` vs `mini` vs `nopic`? What is `100` vs `all` vs `top1m` vs `top` vs `wp1-0.8`?
https://download.kiwix.org/zim/README
Mini is the introduction and infobox of all articles, nopic is the full articles with no pictures, maxi is full articles with (small) images. Other tags are categories (football, geography, etc.)
100 is the top 100 articles, top1m is the top 1 million, wp1-0.8 is (inexplicably) the top 45k articles.
My recommendation: sort by size and download the largest one you can accommodate in the language you prefer. wikipedia_en_all_maxi_2025-08.zim is all Wikipedia articles, with images, as of 2025-08, and it's a paltry 111G.
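A rough sketch of how the naming and the "largest one that fits" advice could be scripted. It assumes the filenames follow the `project_language_selection[_flavour]_date.zim` pattern seen in `wikipedia_en_all_maxi_2025-08.zim`; that pattern is inferred from this thread, not taken from a Kiwix spec. `parse_zim_name`, `largest_that_fits`, and every size except the roughly 111G figure are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional

FLAVOURS = {"maxi", "mini", "nopic"}  # flavour tags discussed in this thread


@dataclass
class ZimName:
    project: str
    language: str
    selection: str          # e.g. "all", "top1m", "100", or a category
    flavour: Optional[str]  # None when the filename carries no flavour tag
    date: str


def parse_zim_name(filename: str) -> ZimName:
    """Split a ZIM filename into its parts, under the assumed naming pattern."""
    parts = filename.removesuffix(".zim").split("_")
    if len(parts) < 4:
        raise ValueError(f"unexpected ZIM filename: {filename!r}")
    project, language, date = parts[0], parts[1], parts[-1]
    middle = parts[2:-1]
    flavour = None
    # A trailing maxi/mini/nopic token is the flavour; the rest is the selection.
    if len(middle) > 1 and middle[-1] in FLAVOURS:
        flavour = middle.pop()
    return ZimName(project, language, "_".join(middle), flavour, date)


def largest_that_fits(sizes: dict[str, int], budget_bytes: int) -> Optional[str]:
    """Return the name of the biggest file not exceeding the budget, if any."""
    fitting = {name: size for name, size in sizes.items() if size <= budget_bytes}
    return max(fitting, key=fitting.get) if fitting else None


# Sizes are placeholders, except the ~111G figure quoted above.
sizes = {
    "wikipedia_en_all_maxi_2025-08.zim": 111 * 10**9,
    "wikipedia_en_all_nopic_2025-08.zim": 50 * 10**9,  # hypothetical size
    "wikipedia_en_all_mini_2025-08.zim": 10 * 10**9,   # hypothetical size
}

best = largest_that_fits(sizes, budget_bytes=64 * 10**9)
print(best, parse_zim_name(best) if best else None)
```

In practice the sizes would come from the directory listing rather than a hand-written dict.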
Kiwix publishes a library here, but it's equally unhelpful: https://library.kiwix.org/