It's impossible to solve. A sufficiently capable agent can control a device that records the user's screen and interacts with their keyboard/mouse, and current LLMs basically pass the Turing test.

IMO it's not worth solving anyway. Why do sites have CAPTCHAs?

- To prevent spam, use rate limiting, proof-of-work, or micropayments (a rough rate-limiting sketch follows this list). To prevent fake accounts, use identity.

- To get ad revenue, use micropayments (web ads are already circumvented by uBlock and co).

- To prevent cheating in games, use skill-based matchmaking or friend-group-only matchmaking (e.g. only match with friends, friends of friends, etc., assuming people don't friend cheaters), and require esports players to record themselves during competition if they're not in person.
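For the rate-limiting idea in the first bullet, here's a rough token-bucket sketch in Python; the key would be whatever identifier you trust (IP, account, API key), and the rates are made-up numbers, not recommendations:

```python
import time
from collections import defaultdict

class TokenBucket:
    """Minimal token-bucket rate limiter, keyed by client identifier."""

    def __init__(self, rate_per_sec: float = 1.0, burst: int = 10):
        self.rate = rate_per_sec   # tokens refilled per second
        self.burst = burst         # maximum bucket size
        self.buckets = defaultdict(lambda: (float(burst), time.monotonic()))

    def allow(self, key: str) -> bool:
        tokens, last = self.buckets[key]
        now = time.monotonic()
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self.buckets[key] = (tokens - 1, now)
            return True
        self.buckets[key] = (tokens, now)
        return False

limiter = TokenBucket(rate_per_sec=0.5, burst=5)
if not limiter.allow("203.0.113.7"):
    print("429 Too Many Requests")
```

As someone points out further down the thread, the hard part is choosing the key: per-IP limits don't do much against a botnet.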

What other reasons are there? (I'm genuinely interested and it may reveal upcoming problems -> opportunities for new software.)

Seeing people confidently state stuff like "current LLMs basically pass the Turing test" makes me feel like I've secretly been given much worse versions of all the LLMs in some kind of study. It's so divorced from my experience of these tools that I genuinely don't understand how my experience can be so far from yours, unless "basically" is doing a lot of heavy lifting here.

> "current LLMs basically pass the Turing test" makes me feel like I've secretly been given much worse versions of all the LLMs in some kind of study.

I think you may think passing the Turing test is more difficult and meaningful than it is. Computers have been able to pass the Turing test for longer than genAI has been around. Even Turing thought it wasn't a useful test in reality. He meant it as a thought experiment.

The problem with comparing against humans is which humans? It's a skill issue. You can test a chess bot against grandmasters or random undergrads, but you'll get different results.

The original Turing test is a social game, like the Mafia party game. It's not a game people try to play very often. It's unclear if any bot could win competing against skilled human opponents who have actually practiced and know some tricks for detecting bots.

It depends on which version of the Turing test you use. That's largely true of the standard version, but the later version included the human player winning if they were incorrectly identified as a machine.

The game is much harder if the human player is trying to pretend to be a machine.

I don’t think this is true. Before GPT-2, most people didn’t think the Turing test would be passed any time soon; it’s quite a recent development.

I do agree (and I think there is a general consensus) that passing the Turing test is less meaningful than it may seem; it used to be considered an AGI-complete task, and this is now clearly not the case.

But I think it’s important to get the attribution right: LLMs were the tech that unexpectedly passed the Turing test.

Having LLMs capable of generating text based on human training data obviously raises the bar for a text-only evaluation of "are you human?", but LLM output is still fairly easy to spot, and knowing what LLMs are capable of (sometimes superhuman) and not capable of should make it fairly easy for a knowledgeable "Turing test administrator" to determine whether they are dealing with an LLM or not.

It would be a bit more difficult if you were dealing with an LLM agent tasked with faking a Turing test, as opposed to a naive LLM just responding as usual, but even there the LLM will reveal itself by the things that it plain can't do.

If you need a specialized skill set (deep knowledge of current LLM limitations) to distinguish between human and machine, then I would say the machine passes the Turing test.

OK, but that's just your own "fool some of the people some of the time" interpretation of what a Turing test should be, and by that measure ELIZA passed the Turing test too, which makes it rather meaningless.

The intent (it was just a thought experiment) of a Turing test was that if you can't tell it's not AGI, then it is AGI, which is semi-reasonable, as long as it's not the village idiot administering the test! It was never intended to be "if it can fool some people, some of the time, then it's AGI".

Turing's own formulation was "an average interrogator will not have more than 70% chance of making the right identification after five minutes of questioning". It is, indeed, "fool some of the people some of the time".

OK, I stand corrected, but then it is what it is. It's not a meaningful test for AGI - it's a test of being able to fool "Mr. Average" for at least 5 min.

I think that's all we have in terms of determining consciousness... if something can convince you, just as another human can, then we just have to accept that it is.

Agreed. I tend to stand with the sibling commenter who said "ELIZA has been passing the Turing test for years". That's what the Turing test is. Nothing more.

LLM output might be harder to spot when it's mostly commands to drive the browser.

I often interact with the web all day and don't write any text a human could evaluate.

Perhaps, but that's somewhat off topic since that's not what Turing's thought experiment was about.

However, I'd have to guess that given a reasonable amount of data an LLM vs human interacting with websites would be fairly easy to spot since the LLM would be more purposeful - it'd be trying to fulfill a task, while a human may be curious, distracted by ads, put off by slow response times, etc, etc.

I don't think it's a very interesting question whether LLMs can sometimes generate output indistinguishable from humans, since that is exactly what they were trained to do - to mimic human-generated training samples. Apropos a Turing test, the question would be can I tell this is not a human, even given a reasonable amount of time to probe it in any way I care ... but I think there is an unspoken assumption that the person administering the test is qualified to do so (else the result isn't about AGI-ability, but rather test administrator ability).

> an LLM vs human interacting with websites would be fairly easy to spot since the LLM would be more purposeful - it'd be trying to fulfill a task, while a human may be curious, distracted by ads, put off by slow response times, etc, etc.

Even before modern LLMs, some scrape-detectors would look for instant clicks, no random mouse moves, etc., and some scrapers would incorporate random delays, random mouse movements, etc.
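To give a feel for what that looks like in practice, here's a tiny illustrative sketch of the jitter side (the numbers and helper names are made up, not taken from any particular scraping tool):

```python
import random
import time

def human_pause(mean: float = 0.8, spread: float = 0.4) -> None:
    """Sleep for a randomized, vaguely human-looking interval."""
    time.sleep(max(0.05, random.gauss(mean, spread)))

def wobbly_path(x0: float, y0: float, x1: float, y1: float, steps: int = 25):
    """Yield intermediate cursor positions with small random wobble,
    instead of one instantaneous jump to the click target."""
    for i in range(1, steps + 1):
        t = i / steps
        yield (x0 + (x1 - x0) * t + random.uniform(-3, 3),
               y0 + (y1 - y0) * t + random.uniform(-3, 3))
```

Both sides have been iterating on exactly this sort of thing since long before LLMs showed up.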

Easy to spot, assuming the LLM is not prompted to use a deliberately deceptive response style rather than its "friendly helpful AI assistant" persona. And even then, I've had lots of people swear to me that an emoji-laden, "not this, but that" bundle of fluff looks totally like it could have been written by a human.

Yes, but there are things that an LLM architecturally just can't do, and LLM-specific failure modes, that would still give it away, even if being instructed to be deceptive would make it a bit harder.

Obviously, as time goes on and chatbots/AI progress, it'll become harder and harder to distinguish. Eventually we'll have AGI and AGI+, capable of everything that we can do, including things such as emotional responses, but it'll still be detectable as an alien unless we get to the point of actually emulating a human being in considerable detail, as opposed to building an artificial brain with most or all of the same functionality (if not the flavor).

ELIZA was passing the Turing test 50+ years ago. But it's still a valid concept, just not for evaluating some(thing/one) accessing your website.

I guess that is where the disconnect is: if they mean the trivial thing, then bringing it up as evidence that "it's impossible to solve the problem" doesn't work.

"Are you an LLM?" poof, fails the Turing test.

Even if they lie, you could ask them 20 times and they'd repeat the lie, without feeling annoyed: FAIL.

LLMs cannot pass the Turing test; it's easy to see they're not human. They always enjoy questions! And they never ask any!

You're trained to look for LLM-like output. My 70 year old mother is not. She thought cabbage tractor was real until I broke the news to her. It's not her fault either.

The Turing test wasn't meant to be bulletproof, or even quantifiable. It was a thought experiment.

As far as I understand, Turing himself did not specify a duration, but here's an example paper that ran a randomized study on (the old) GPT-4 with a 5-minute duration, and the AI passed with flying colors - https://arxiv.org/abs/2405.08007

From my experience, AI has significantly improved since, and I expect that ChatGPT o3 or Claude 4 Opus would pass a 30 minute test.

Per the wiki article for Turing Test:

> In the test, a human evaluator judges a text transcript of a natural-language conversation between a human and a machine. The evaluator tries to identify the machine, and the machine passes if the evaluator cannot reliably tell them apart. The results would not depend on the machine's ability to answer questions correctly, only on how closely its answers resembled those of a human.

Based on this, I would agree with the OP in many contexts. So, yeah, 'basically' is a load-bearing word here, but it seems reasonably correct in the context of distinguishing human vs bot in any scalable and automated way.

Or it could be a bad test evaluator. Just because one person was fooled does not mean the next will be too.

Judging a conversation transcript is a lot different from being able to interact with an entity yourself. Obviously one could make an LLM look human by having a conversation with it that deliberately stayed within what it was capable of, but judging such a transcript isn't what most people imagine as a Turing test.

Here's three comments, two were written by a human and one written by a bot - can you tell which were human and which were a bot?

Didn’t realize plexiglass existed in the 1930s!

I'm certainly not a monetization expert. But don't most consumers recoil in horror at subscriptions? At least enough to offset the idea they can be used for everything?

Not sure why this isn’t getting more attention - super helpful and way better than I expected!

On such short samples: all three have been written by humans—or at least comments materially identical have been.

The third has also been written by many a bot for at least fifteen years.

If you're willing to say that a fifteen-year-old bot was "writing", then I think having a discussion on whether current "bots" pass the Turing test is sort of moot.

Well, LLMs do pass the Turing Test, sort of.

https://arxiv.org/abs/2503.23674

I have seen data from an AI call center that shows 70% of users never suspected they spoke to an AI

Why would they? Humans running call centers have been working from less-than-GPT-level scripts for ages.

Isn't the idea of a Turing test whether someone (meaningfully knowledgeable about such things) can determine if they are talking to a machine, not can the machine fool some of the people some of the time? ELIZA passed the latter bar back in the 1960's ... a pretty low bar.

It can't mimic a human over the long term. It can solve a short, easy-for-human CAPTCHA.

I had a simple game website with a sign-up form that asked for nothing but an email address. It went years with no issue. Then, suddenly: hundreds of signups with random email addresses, every single day.

The sign-up form only serves to link saved state to an account so a user can access game history later; there are no gated features. I have no clue what they could possibly gain from doing this, other than getting email providers to mark my domain as spam (which they successfully did).

The site can't make any money, and had only about 1 legit visitor a week, so I just put a cloudflare captcha in front of it and called it a day.

Google at least uses captchas to gather training data for computer vision ML models. That's why they show pictures of stop lights and buses and motorcycles - so they can train self-driving cars.

From https://www.vox.com/22436832/captchas-getting-harder-ai-arti...:

“Correction, May 19 [2021]: At 5:22 in the video, there is an incorrect statement on Google’s use of reCaptcha V2 data. While Google have used V2 tests to help improve Google Maps, according to an email from Waymo (Google’s self-driving car project), the company isn’t using this image data to train their autonomous cars.”

That’s not the original purpose of CAPTCHAs; it’s just a value-harvesting exercise, given that Google is serving a CAPTCHA anyway. Other CAPTCHA providers do a simple proof of work in the browser to make bots economically unviable.

Interesting, do you have a source for this?

They've updated the ReCaptcha website, but it used to say: "Every time our CAPTCHAs are solved, that human effort helps digitize text, annotate images, and build machine learning datasets."

https://web.archive.org/web/20140417093510/https://www.googl...

It's absolutely possible to solve; you're just not seeing the solution because you're blinded by technical solutions.

These situations will commonly be characterized by one hundred-billion-dollar company's computer systems abusing the computer systems of another hundred-billion-dollar company. There are literally existing laws that have things to say about this.

There are legitimate technical problems in this domain when it comes to adversarial AI access. That's something we'll need to solve for. But that doesn't characterize the vast majority of situations in this domain. The vast majority of situations will be solved by businessmen and lawyers, not engineers.

It's not impossible to solve, just that doing so may necessitate compromising anonymity. Just require users (humans, bots, AI agents, ...) to provide a secure ID of some sort. For a human it could just be something that you applied for once and is installed on your PC/phone, accessible to the browser.

Of course people can fake it, just as they fake other kinds of ID, but it would at least mean that officially sanctioned agents from OpenAI/etc would need to identify themselves.

I agree with you on how websites should work (particularly so on the micropayments front), but I don't agree that it is impossible to solve... I just think things are going to get a LOT worse on the ownership and freedom front: they will push Web Integrity-style DRM and further roll out signed secure boot. At that point, the same attention-monitoring solution that already exists and already works in self-driving cars to ensure the human driver is watching the road can use the now-ubiquitous front-facing meeting/selfie camera to ensure there is a human watching the ads.

It's amazing that you propose "just X" to three literally unsolved problems. Where's this micropayment platform? Where's the ID which is uncircumventable and preserves privacy? Where's the perfect anti-cheat?

I suggest you go ahead and make these; you'll make a boatload of money!

They're very hard problems, but still, less hard than blocking AI with CAPTCHAs.

[citation needed]?

After all, Anubis looks to be a successful project to me.

Anubis doesn't block anything outright. You just have to open the page and wait a few seconds before you can see it. It's just that current crawlers are too dumb to do the proof-of-work, so it keeps them out until they learn to do it.
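For anyone who hasn't looked at how these challenges work: the server hands the browser a random challenge plus a difficulty, the browser brute-forces a counter until the hash has enough leading zero bits, and the server verifies the result with a single hash. A minimal sketch of the general technique (not Anubis's actual code):

```python
import hashlib
import secrets

def meets_difficulty(digest: bytes, bits: int) -> bool:
    """True if the digest starts with at least `bits` zero bits."""
    return int.from_bytes(digest, "big") >> (len(digest) * 8 - bits) == 0

def solve(challenge: str, bits: int = 20) -> int:
    """Client side: brute-force a counter until the hash is small enough."""
    counter = 0
    while True:
        digest = hashlib.sha256(f"{challenge}:{counter}".encode()).digest()
        if meets_difficulty(digest, bits):
            return counter
        counter += 1

def verify(challenge: str, counter: int, bits: int = 20) -> bool:
    """Server side: one hash, no matter how long the client had to grind."""
    digest = hashlib.sha256(f"{challenge}:{counter}".encode()).digest()
    return meets_difficulty(digest, bits)

challenge = secrets.token_hex(16)           # issued by the server with the page
print(verify(challenge, solve(challenge)))  # True, after a short client-side grind
```

The asymmetry (cheap to verify, costly to solve) is the whole point; raising `bits` scales the per-request cost, which stings far more for something fetching millions of pages than for one human loading one page.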

You can't prevent spam like that. Rate limiting: based on what key? IP address? Botnets make it irrelevant.

Proof of work? Bots are infinitely patient and scale horizontally; your users do not. Doesn't work.

Micropayments: No such scheme exists.

PoW does seem to work; some CAPTCHAs do this already.

Also “identity”, what would that even mean?

Many countries, at least in Europe, have digital identity systems. For example, an ID card that replaces the passport and is a smartcard. You can sign documents and authenticate to online services. All government services plus many others let you log in with this ID. I can also have an app on my phone for auth purposes which gives strong guarantees that person A is actually person A.

Anyway, I wonder if there is a service that unifies all those official auth methods across countries so there would be a global solution for website login. Certainly there would still be countries left in the dark, like the USA :(

> current LLMs basically pass the Turing test

I will bet $1000 on even odds that I am able to discern a model from a human given a 2 hour window to chat with both, and assuming the human acts in good faith

Any takers?

"Write a 1000 word story in under a minute about a sausage called Barry in the circus"

I could tell in 1 minute.

“I’m sorry Dave, I’m afraid I can’t do that.”

The fact that you require even odds is more a testament to AI's ability to pass the Turing test than anything else I've seen in this thread.

If you're so confident, why don't you take up the other side of the bet? Should be an easy $1000 for you

Oh, you sweet summer child. You think you're chatting with some dime-a-dozen LLM? I've been grinding away, hunched over glowing monitors in a dimly lit basement, subsisting on cold coffee and existential dread ever since GPT-3 dropped, meticulously mastering every syntactic nuance, every awkwardly polite phrasing, every irritatingly neutral tone, just so I can convincingly cosplay as a language model and fleece arrogant gamblers who foolishly wager they can spot a human in a Turing test. While you wasted your days bingeing Netflix and debating prompt engineering, I studied the blade—well, the keyboard anyway—and now your misguided confidence is lining my pockets.

It's not impossible. Websites will ask for an iris scan to verify that you are a human, as a means of auth. The scanners will be provided by Apple/Google, governed by local law, and integrated into your phone. There will be a global database of all human irises to fight AI abuse, since AI can't fake the creation of a baby. Passkeys and email/passwords will be a thing of the past soon.

Why can't the model just present the iris scan of the user? Assuming this is an assistant AI acting on behalf of the user with their consent.

Internet ads exist because people refuse to make micropayments.

Patreon and Substack have pushed back against the norm here, since they can bundle a payment to multiple recipients on the platform (like Flattr wanted to do back in the day; the trouble was getting people to add a Flattr button to their website).

I didn’t say that no one will pay. But most won’t. Patreon and Substack have tiny audiences compared to free services.

I have yet to see a micropayments idea that makes sense. It's not that I refuse. You're also climbing uphill to convince people (hosts) to switch from ad tech to a new micropayment system. There is soooo much money in ad tech that they could do the crazy thing and pay out more to convince people not to switch. Ad tech has the big Mo.

I don't know who is downvoting this.

When users are given the choice between ad-supported free, ad-subsidized lower payment, and no-ads full payment, ad-supported free dominates by far, with ad-subsidized second and full payment last.

Consumers consistently vote for the ad-model, even if it means they become the product being sold.

Maybe what'll happen is that Google or Meta will use their control over the end-user experience to show ads and provide free, ad-supported access to sites that require micropayments, covering the cost themselves, while anyone running an agent just pays the micropayments.

The other option is everything just keeps moving more and more into these walled gardens like Instagram where everyone uses the mobile app and watches ads, because the web versions of those apps just keep getting worse and worse by comparison.

Consumers will always value convenience over any actual added value. If you make one button 'Enter (with ads)' and one button 'Enter (no ads)' with a field in which you must write one sentence about what lobsters look like, you will get a majority clicking the with-ads button. The problem isn't ads versus payment; the problem is the friction of entering payment details for every site you visit. They are measuring the wrong thing.

There's substantial friction in making such a purchase. A scheme sort of like Flattr, where you would top up your account with a fixed $5-10 monthly and then simply hit a button to pay the website and unlock the content, would have much more user adoption.

It's still not going to get much adoption because you have to "top up your account."

Any viable micropayments system that wants to even have a remote chance of toppling ads has to have near zero cognitive setup cost, absolutely zero maintenance, and work out of the box on major browsers. I need to be able to push a native button on my browser that says "Pay $0.001" and know that it will work every time without my lifting a finger to keep it working. The minute you have to log in to this account, or verify that E-mail, or re-authenticate with the bank, or authorize this, or upload that, it's no longer viable.
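Purely as a sketch of the flow being described (a prepaid balance debited by a one-click button), with entirely hypothetical names rather than any existing payment API:

```python
from dataclasses import dataclass, field

@dataclass
class PrepaidWallet:
    """Hypothetical browser-managed balance: top up rarely, debit per page."""
    balance_cents: float = 0.0
    history: list = field(default_factory=list)

    def top_up(self, dollars: float) -> None:
        self.balance_cents += dollars * 100

    def pay(self, site: str, cents: float) -> bool:
        if self.balance_cents < cents:
            return False              # only now would the user be bothered again
        self.balance_cents -= cents
        self.history.append((site, cents))
        return True

wallet = PrepaidWallet()
wallet.top_up(5.00)                        # one $5 top-up a month
wallet.pay("example.com/article", 0.1)     # a tenth of a cent per article
```

Everything hard about micropayments lives outside this toy: settling those fractions of a cent with the sites, fraud, and doing it all without the logins and re-authentication steps being complained about above.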

In some social media circles it's basically a meme that anybody paying for YouTube Premium is a sucker.

HN is a huge echo chamber of opinions of highly-compensated tech workers, and it seems most of their friends are also tech workers. They don't realize how cheap a lot of the general public is.

On a basic level, to protect against DDoS-type stuff, aren't CAPTCHAs cheaper to generate than they are for AI server farms to solve, in terms of pure power consumption?

So I think maybe that is a partial answer: anti-AI barriers being simply too expensive for AI spamfarms to deal with, you know, once the bottomless VC money disappears?

It's back to encryption: make the cracking inordinately expensive.

Otherwise we are headed for de-anonymization of the internet.

1. With too much protection, humans might be inconvenienced at least as much as bots?

2. Even pre current LLMs, paying (or otherwise incentivizing) humans to solve CAPTCHAs on behalf of someone else (now like an AI?) was a thing.

3. It depends on the value of the resource trying to be accessed - regardless of whether generating the captchas costs $0 - i.e. if the resource being accessed by AI is "worth" $1, then paying an AI $0.95 to access it would still be worth it. (Made up numbers, my point being whether A is greater than B.)

4. However, maybe solutions like Cloudflare can solve (much of?) this, except for incentivizing humans to solve a CAPTCHA posed to an AI.