The value of these models is that you can run them on your own hardware.

A company can buy an NVIDIA B300 and serve its developers in-house with unlimited tokens.