Hacker News

fsh 2 months ago

Companies are optimizing for all the big benchmarks. This is why there is so little correlation between benchmark performance and real-world performance now.

czhu12 2 months ago

Isn’t there? I mean, Claude Code has been my biggest use case and it basically one-shots everything now.

fsh 2 months ago

Yes, LLMs have become extremely good at coding (not software engineering, though). But try using them for anything original that cannot be adapted from GitHub and Stack Overflow. I haven't seen much improvement at all on such tasks.

dboreham 2 months ago

Strongly disagree with this. And I'm going to provide as much evidence as you did.

WarmWash 2 months ago

No shot, their classic engineering ability has exploded too.

The amount of information available online about optics is probably <0.001% of what is available for software, and they can just breeze through modeling solutions. A year ago it was immediate face-planting.

The gains are likely coming from exactly where they say they are coming from - scaling compute.