I agree. After all, benchmarks don't mean much, but I guess they are fine as long as they keep measuring the same thing every time. Also, context matters. In my case, I see a huge difference between the gains at work and those at home on a personal project, where I don't have to worry about corporate policies, security, correctness, standards, etc. There I can let the LLM fly and not worry about losing my job in record time.