I think input strategy probably accounts for the difference. Usually I'm just asking a short question with no additional context, and usually it's not the sort of thing that has one well-defined answer. I'm really asking it to summarize the wisdom of the crowd, so to speak.
For example, I ask: what are the most common targets of removal in Magic: The Gathering? Mistral's answer is so-so, including a slew of cards you would prioritize removing, but also several you typically wouldn't, like Mox Amber, a 0-cost mana rock. Gemini Flash gave far fewer examples, one for each major card type, but all of them are definitely priority targets that often defined an entire metagame, like Tarmogoyf.
Ah yeah. I’m only grading it on its prose, formatting, ability to interpret data, and instruction following. I do not use it as a store of knowledge.