Sometimes it feels like the optimizations are only just getting started.

I’m beginning to suspect the closed SOTA labs were doing all these optimizations already, keeping quiet about it, and just charging us out the yin-yang for inference.