Another comment mentioned the cost associated with the model. Setting that aside, wouldn't we also need to include all of the systems around the inference? I can imagine significant infrastructure and engineering needs around all of these various services, along with the work needed to keep these systems up and running.

Or are these costs just insignificant compared to inference?

All incremental costs should be included. If adding each 100,000 new customers requires 1 extra engineer you would include that. We don’t know those exact numbers though and the ratio is probably much higher than my example numbers. Inference costs likely dominate.