Another example was Gemini 3.1 Flash Lite, which on the high setting was basically just burning tokens: it cost something like 30x more while giving worse answers:
https://aibenchy.com/compare/google-gemini-3-1-flash-lite-hi...