Hacker News

pwython 6 hours ago

How many pelican-riding-a-bicycle SVGs were there before this test existed? What if the training data is being polluted with all these wonky results...

bwilliams18 4 hours ago

I'd argue that a model's ability to ignore/manage/sift through the noise added to the training set by other LLMs increases in importance and value as time goes on.

nerdsniper 5 hours ago

You're correct. It's not as useful as it (ever?) was as a measure of performance...but it's fun and brings me joy.