Nearly half of online articles are now AI-generated. [0]

[0]: https://graphite.io/five-percent/ai-now-writes-as-many-onlin...

This is good information, but a bit superficial - before AI, what percentage of online articles was generated from templates? What was written by content generation farms? Fiverr and co. pay-per-word writers?

I suspect that market has been more affected than anything.

I don't know, at least half of the front page seems to be LLM generated at any given time on HN. I couldn't say half seemed templated a few years ago.

I imagine something like 98% of articles also get less than 100 views. So the question is more about the articles you're reading rather than articles in general.

If one can't remember what they generated, what's the point in generating? Half of those who write articles do not remember what the AI put in them... Reviewing has become slop work done by humans!

I'd say even half of my YouTube feed nowadays is.

1. Find some niche but interesting topic (e.g. some historical event like the Battle of Lepanto)

2. Have AI generate the content of the 20-minute video by collecting information about it online

3. Have AI generate the video

4. Have AI generate a realistic voice to comment on the video

5. Upload it without mentioning it's all AI generated

6. Have me get mad 4 minutes into the video because footage/paintings of that battle... do not exist at all... and slowly realize it was all AI generated

The YouTube algorithm got unbearable to me even before the mudslide of AI content.

I highly recommend using an extension like Unhook and disabling all algorithmic recommendations such as the Home feed, sidebar/endscreen recommendations etc. The only way I interface with YouTube now is through the subscriptions page which shows me videos from creators I follow in chronological order.

I started "do not recommending" and removing channels that had any stink of AI. I also removed some subscriptions from major YouTube channels. My subscription and homepages are much quieter now and easier to parse.

There is a "do not recommend this channel" option somewhere

The rate at which the bots are generating content / new channels is far faster than you can click on that option.

> We build on our prior research by using three different AI detectors (Pangram, GPTZero, Copyleaks). We independently evaluate each to show that the false positive rates and average false negative rates are consistently below 2%. Each AI detector shows a similar trend.

This is all bullshit, none of those actually work, and the false-positive rates are sky-high. I'm not sure how any serious person could have tried out any of those services and come away with the impression of "Well, better than nothing" because literally, it seems the opposite.

The detectors aren’t great but they aren’t really the issue. The fact that LLMs make it so easy to impersonate human communication is precisely the problem here. There cannot be a reliable way to identify if something is from a human or not. And the ease of access and low price make using LLM generated content a no-brainer; you have to actively go out of your way to produce human generated content.

We are building a future where human contact will be scarce

> We are building a future where human contact will be scarce

Yes, until you remember there is a world outside of the screen, where people build things with their hands, use their bodies to play instruments for others, paint beautiful things for others to see physically and so much more.

"Humanness" online has been dead for decades already, if you want humanness you need to step outside, or at least invite other humans home.

There is a meaningful difference between “humans online are tribalistic” and “content consumed by humans is generated by machines”. The IRL world isn’t safe either, books, newspapers, advertising, speeches are/will be heavily LLM made. Political parties are using LLMs. The IRL humans are relying on what their LLMs summarized or searched for them.

The same way the online world has never actually been that distinct from the offline world, one is merged with the other and they influence each other.

There has been humanness online if you do not look for it on social media. But that’s now breaking down, because we developed a technology designed to impersonate human communication.

Right, but I was talking about things that generally aren't done by AI. People aren't building sculptures with AI, no graffiti is made with AI, the oil paintings you can see in galleries aren't AI, the DJ that fucks up during a performance isn't AI.

There is so much humanity in the world outside of the screen, and it's really easy to see what is authentically made, ignore the rest. Find live events with real other humans, there are a ton of them out there, doesn't really matter how people find the events, as long as we put our bodies in the same physical space.

I hope you’re right. Over the past month or so I personally started to feel really pessimistic about AI development. I really don’t know how many of those human spaces are safe from AI. Yes you can go to a drawing course or music festival and see human performances. But how do you then stay in contact with those people? The answer is very likely via software, meaning there is still this question of “am I interacting with a human? Or are they copy-pasting from ChatGPT?”. A friend you met shares a new song, is it really them playing or did they generate that track?

Just the fact that we have some level of doubt means we already lost something.

That being said, sure, live in the physical world and build social contacts. I’m all for it.

So if these do not work, to what do you attribute the rising positive rate?

Humans writing more like LLMs, just like new LLMs write more like humans, it's all coalescing into one.

I've copy-pasted comments I made on HN from like 2020 and had it tell me it's "100% AI". I've seen examples where the services claim "100% AI" because there were no normal dashes, only em-dashes. Even have a recent example from HN itself: https://news.ycombinator.com/item?id=48165690

> This reads very AI. Pangram [0] agrees [1]. [0] Not perfect, but I think as good evidence as any: https://arxiv.org/pdf/2501.15654 [1] https://www.pangram.com/history/44cd07d3-ba94-4331-8c7f-a626...

Said Pangram report literally cites em-dashes as its single piece of evidence...

Your evidence seems very anecdotal. The graphite.io study does make an effort to quantify the false positive and false negative rates of the three detectors, rather than just saying “they work”. They generate 2000 AI articles and ask the detectors to evaluate them, measuring the false negatives (articles falsely ID'd as human written); and they use a separate pre-AI dataset (years 2000-2022) to determine false positives.
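For anyone who wants to check numbers like that against their own samples, here is a minimal sketch of that kind of evaluation in Python. It is not the study's code: the function and field names are made up for illustration, and the counts in the usage lines are invented, not taken from the study.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class DetectorErrorRates:
    false_positive_rate: float  # share of human-written articles flagged as AI
    false_negative_rate: float  # share of AI-generated articles passed as human


def error_rates(human_flags: Sequence[bool], ai_flags: Sequence[bool]) -> DetectorErrorRates:
    """Compute error rates for one detector.

    Each flag is True when the detector labeled that article as AI.
    human_flags: verdicts on known human-written text (e.g. a pre-AI dataset).
    ai_flags: verdicts on known AI-generated text.
    """
    if not human_flags or not ai_flags:
        raise ValueError("both datasets must be non-empty")
    false_positives = sum(1 for flagged in human_flags if flagged)
    false_negatives = sum(1 for flagged in ai_flags if not flagged)
    return DetectorErrorRates(
        false_positive_rate=false_positives / len(human_flags),
        false_negative_rate=false_negatives / len(ai_flags),
    )


# Invented example: 1000 human-written articles with 10 flagged as AI,
# and 2000 AI-generated articles with 30 passed as human.
human_flags = [True] * 10 + [False] * 990
ai_flags = [True] * 1970 + [False] * 30
rates = error_rates(human_flags, ai_flags)
print(rates.false_positive_rate, rates.false_negative_rate)
```

Note that rates measured this way only describe the two datasets you fed in. If your own writing or your LLM of choice looks different from those datasets, the numbers can be very different.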

Yeah, I suppose it is, I haven't finished my dissertation on it yet, I'll get right on that :)

Ever since they became available I've tried them every now and then, both with AI generated trash and my own pre-LLM writings, and had about 0% success in getting them to accurately report what it actually is. Maybe my writing style and what specific LLM you use matters a lot, I'm sure these platforms' training data is mostly from the mainstream models so as soon as you use anything else, they'll get trivially lost. But again, I don't have any evidence or proof behind this, it's based only on when I've tried to evaluate them myself in the past.

If you need an AI detector to figure out if something is AI or not, surely that means the AI is so good that there is no need to detect whether it is AI or not, because it is indistinguishable from writings by a human when read by humans?

I mean this is an article coming from an SEO company that's really just trying to advertise its services in the end. Their methodology seems very loose.

Are you an AI agent trying to gaslight us?

Just a boring old organic human tired of other organic beings falling for obvious bullshit most likely made up by machines convincing humans with something like "you really have a neat idea here, the world will appreciate you making this into a product".