That shouldn't be used to judge other models - it's never been true for Grok.