Gross margins also don't tell the whole story. We don't know how much Azure and Amazon charge for the infrastructure, and we have reasons to believe they are selling it at a massive discount (Microsoft definitely does, as follows from their agreement with OpenAI): they get the model, OpenAI gets discounted infra.
A discounted Azure H100 will still cost more than $2 per hour. Same goes for AWS. Trainium chips are new and not as effective (not saying they are bad), but they still cost in the same range.
For inference, gross margin comes down to: (what companies charge the user per 1M tokens) - (the direct cost of producing those 1M tokens, which is GPU cost). Divide that by the price if you want it as a percentage.
What I am implying is that OpenAI pays much less than $2 per GPU-hour because of the discount. That's an assumption, but it could be $1 or $0.50, no?
It could still be burning money for Microsoft/Amazon.
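
To make the arithmetic in this exchange concrete, here is a rough Python sketch of the (price per 1M tokens) - (GPU cost per 1M tokens) calculation at the three hourly rates mentioned ($2, $1, $0.50). The user-facing price and the tokens-per-GPU-hour throughput are made-up placeholders, not real figures from OpenAI, Azure or AWS, so only the shape of the result means anything.

```python
def gross_margin_per_million(price_per_m, gpu_hourly_cost, tokens_per_gpu_hour):
    """Return (gross profit in $, gross margin as a fraction) per 1M tokens.

    Treats GPU rental as the only direct cost, as in the formula above.
    """
    # Dollars of GPU time needed to produce 1M tokens.
    cost_per_m = gpu_hourly_cost * 1_000_000 / tokens_per_gpu_hour
    profit = price_per_m - cost_per_m
    return profit, profit / price_per_m


# Hypothetical inputs, chosen only for illustration.
PRICE_PER_M = 1.50               # assumed price charged per 1M tokens ($)
TOKENS_PER_GPU_HOUR = 2_000_000  # assumed throughput of one GPU

for gpu_cost in (2.0, 1.0, 0.5):  # $/GPU-hour scenarios from the discussion
    profit, margin = gross_margin_per_million(
        PRICE_PER_M, gpu_cost, TOKENS_PER_GPU_HOUR
    )
    print(f"${gpu_cost:.2f}/GPU-hour: profit ${profit:.2f} per 1M tokens, "
          f"margin {margin:.0%}")
```

This only covers the buyer's side. Whether the cloud provider is losing money at the discounted rate depends on its own cost per GPU-hour, which is not modeled here.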