This has been my suspicion since LLMs began eating the Internet. Whether it's code or writing, now that LLMs are consuming their own output, the Habsburg Jaw[1] is going to quickly become evident. It is very difficult--sometimes impossible--to know whether a given chunk of input is wholly or partially generated by an LLM. Nevertheless, filtering input may become a critical task. That expense will be passed to the consumer, and LLM prices will necessarily rise as their quality diminishes. It could become a death spiral.

If so, I, for one, will be relieved. I'm tired of LLMs trying to take over the enjoyable parts of writing and coding, and leaving the menial tasks to us humans.

[1] https://www.smithsonianmag.com/smart-news/distinctive-habsbu...

Nothing I've seen from the AI labs appears to indicate that they are worried about model collapse in the slightest.

That makes sense to me: if their models start getting worse because of slop in the training data, they can detect that and take steps to fix it.

Their entire research pipeline is about finding what makes models score better! Why would they keep going with a technique that scored worse?

> Nothing I've seen from the AI labs appears to indicate that they are worried about model collapse in the slightest.

AI labs are insufferable hype machines; they are unlikely to sow doubt about their own business models.

> they can detect that and take steps to fix it.

Each model will need an endless diet of new content to remain relevant, and over time, avoiding ingestion of LLM output (and the accompanying inbreeding depression) will likely be a tricky proposition. Not impossible, but expensive and error-prone.
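To make the "detect that and take steps to fix it" step upthread concrete, here's a minimal sketch of what a corpus-filtering pass might look like. Everything in it is hypothetical: the repeated-4-gram heuristic is a toy stand-in for a real AI-content classifier, and the 0.2 threshold is made up for illustration.

    def synthetic_score(text: str) -> float:
        """Toy stand-in for an AI-content classifier (hypothetical).
        Scores the fraction of 4-word phrases that are repeats;
        higher means more suspect."""
        words = text.lower().split()
        ngrams = [tuple(words[i:i + 4]) for i in range(len(words) - 3)]
        if not ngrams:
            return 0.0
        return 1.0 - len(set(ngrams)) / len(ngrams)

    def filter_corpus(docs: list[str], threshold: float = 0.2) -> list[str]:
        """Keep documents scoring below the (made-up) threshold.
        Any such filter errs both ways: false positives throw away
        human text, false negatives let slop through."""
        return [d for d in docs if synthetic_score(d) < threshold]

The two failure modes in the second docstring are the whole problem: tighten the threshold and you discard more legitimate human writing; loosen it and more LLM output lands in the next training run. That trade-off is what makes the filtering expensive and error-prone.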