My experience testing actual AI-written content on willing participants is that people are entirely useless at detecting it with any reliability whatsoever.
I don't know what your experience is, but in mine there are some people who are better than chance at picking this up.
And I believe my experience is what you would expect. People are also a kind of neural network. If an LLM system can be trained to be a decent detector, I don't see why at least some people couldn't be.
I haven't seen any evidence that an LLM can be trained to be a decent detector once people make any kind of attempt to get past it. That is as expected, since access to a detector effectively makes the problem equivalent to the halting problem (you can tweak the output using the detector as a judge until you have a process that bypasses it). Some of them are somewhat able to recognise "raw" output.
Yes, and the problem we are having here is 'raw' output. LLM-generated slop is zero-effort bullshit, not an elaborate scheme to prove a philosophical thesis. There is no economy for media workers doing the latter.
It's similar with coding: yes, halting problem! But we have always reviewed code nonetheless.