Unless the token estimates I get from using Claude are way out, I burn through 5M+ tokens/day, and I'm not spending a lot of time on it. 500k tokens in a 24h period for $5k of hardware seems quite poor?
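Be sure you compare input tokens to pre-fill rates and output tokens to generation rates. The two rates usually differ a lot, so a single tokens/day total can't be checked against one speed figure.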
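To make that comparison concrete, here is a rough sketch of the arithmetic. The function name and every number in it are made-up placeholders, not measurements of any real hardware or of anyone's actual usage; plug in your own input/output split and your own measured pre-fill and generation rates.

```python
def seconds_needed(input_tokens, output_tokens, prefill_tps, decode_tps):
    """Estimate wall-clock seconds to process a workload.

    Input tokens are charged at the pre-fill rate and output tokens
    at the generation (decode) rate, both in tokens per second.
    """
    return input_tokens / prefill_tps + output_tokens / decode_tps


# Placeholder workload: 5M tokens/day split into 4.5M input and 0.5M output.
# Placeholder rates: 500 tok/s pre-fill, 10 tok/s generation.
total = seconds_needed(4_500_000, 500_000, prefill_tps=500.0, decode_tps=10.0)
print(f"{total / 3600:.1f} hours")  # prints "16.4 hours"
```

With these placeholder numbers the input side takes 2.5 hours and the output side almost 14, which is why lumping both into one tokens/day figure can be misleading.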