Hacker News

Grandparent testimony of success, & parent testimony of frustration, are both just wispy random gossip when they don't specify which LLMs delivered the reported experiences.

The quality varies wildly across models & versions.

With humans, the statements "my tutor was great" and "my tutor was awful" reflect very little on "tutoring" in general, and are barely even responses to each other without more specificity about the quality of the tutor involved.

Same with AI models.