The method of verification has no bearing on the validity of the conclusion. I don't open a child's skull because the procedure itself harms how the child functions afterward. However, I can look inside the "brain" of an AI with no such side effects.

This is an example I saw 2 days ago without even searching. Here ChatGPT is telling someone that it independently ran a benchmark on its MacBook: https://pbs.twimg.com/media/Goq-D9macAApuHy?format=jpg

I'm reasonably sure ChatGPT doesn't have a MacBook, and didn't really run the benchmarks. But it DID produce exactly what you'd expect a human to say in that situation, which is what it's trained to do. No understanding, just rote repetition.
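You can see why this happens with a minimal sketch using the standard OpenAI Python client (the model name is just an example, and the prompt is illustrative): the only thing the API ever returns is generated text. There's no shell, no sandbox, nothing executed behind the call, so any "results" in the reply are predicted tokens, not measurements.

```python
# Minimal sketch, assuming `pip install openai` and an OPENAI_API_KEY
# set in the environment. Model name and prompt are illustrative.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",  # example model; any chat model behaves the same way
    messages=[
        {
            "role": "user",
            "content": "Run this benchmark on your machine and report the results.",
        },
    ],
)

# All that comes back is text. Nothing ran anywhere; if the reply contains
# "benchmark results", they were generated, not measured.
print(response.choices[0].message.content)
```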

I won't post more because there are a billion of them. LLMs are great, but they're not intelligent, they don't understand, and the output still needs to be validated before use. We have a long way to go, and that's ok.