It's possible that it's not about "native knowledge" but about how the descriptions for each of the tools (which get mapped into the prompt) are set up, or even their order; LLM behavior can be very sensitive to prompt differences that don't look important.
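
For illustration only, a minimal sketch of what I mean by tool descriptions being "mapped into the prompt" (the names and format here are made up, not any particular toolchain's actual behavior):

```python
# Hypothetical example: tool descriptions concatenated into a system prompt.
# The wording and ordering below are exactly the kind of detail a model can be
# sensitive to, independent of any "native knowledge" about the tools.
tools = [
    {"name": "read_file", "description": "Read a file from the workspace."},
    {"name": "grep", "description": "Search the workspace for a pattern."},
]

def render_tool_prompt(tools):
    lines = ["You have access to the following tools:"]
    for t in tools:  # reordering this list changes the prompt the model sees
        lines.append(f"- {t['name']}: {t['description']}")
    return "\n".join(lines)

print(render_tool_prompt(tools))
```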

I'd be cautious about inferring generalizations about behavior, and then explanations for those generalizations, from observations of a particular LLM used via a particular toolchain.

That said, the fact that it behaves this way in that environment is still an interesting observation.