They seem to be optimizing for benchmarks instead of real-world use.
Yeah, if only Gemini performed half as well in practice as it does on benches, we'd actually be using it.