Not really. SOTA vs. non-SOTA is "can I get my coding work actually done today" vs. "this can do customer support chat."
It is like car vs. kick scooter.
> "can I get my coding work actually done today" vs. "this can do customer support chat"
I think you need to define "can get coding work done" for this to make sense. I was using GPT-3 back then for basic scripts; does that count? Or only Claude Code?
I also think this is a false dichotomy: if you look at Project Vend or Vending-Bench, customer support etc. is by no means trivial. (Old but great story: https://www.businessinsider.com/car-dealership-chevrolet-cha...)
It really isn't. We get coding work actually done today on Opus 4.5. That's not SOTA any more, and anything proximate to that level, even quite loosely, is genuinely useful.
OK, so we are at "Opus 4.5 is not SOTA." Right, by that definition... yes, you are right.
I mean, it's almost half a year. I think that counts?