> I can't help but think that there's got to be a better mechanism

What matters is not how good it is in isolation, but how well it scales to giant datasets and supercomputers. So far, attention scales the best. It's the most "brute force"-able mechanism.