Inference keeps getting cheaper, both because hardware is getting cheaper and because more efficient techniques such as latent attention are spreading.