> LLMs not being able to detect correctness is just demonstrably false if you play around with LLM agents a bit.
How is pointing out that this method is incapable of determining correctness only tangential?
Correctness and proven correctness are different things. I suspect you're a big Rocq Prover fan.