Hacker News

Let's not gloss over the electrical supply. These chips won't work for free.

LLM inference uses on the order of 1 Wh per query. That's under 10 meters of driving in an EV, or under 5 seconds of running an air conditioner.

One query is not going to be a useful benchmark when people are deploying AI swarms in loops to solve simple problems.

Or a human riding a stationary bike for 36 seconds.
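
For anyone who wants to check the arithmetic, here is a rough sketch of the conversions in this thread. Only the 1 Wh per query figure comes from the comments above; the EV consumption (150 Wh/km), air conditioner draw (1 kW), and cyclist output (100 W) are my own assumed round numbers, and the 1,000-query swarm is a hypothetical to show how the per-query figure scales.

```python
# Back-of-the-envelope energy equivalents for LLM queries.
# Only QUERY_WH comes from the thread; every other constant is an assumption.
QUERY_WH = 1.0          # "on the order of 1 Wh per query" (from the thread)
EV_WH_PER_KM = 150.0    # assumed EV consumption
AC_WATTS = 1000.0       # assumed air conditioner power draw
CYCLIST_WATTS = 100.0   # assumed sustained output on a stationary bike


def total_wh(n_queries, wh_per_query=QUERY_WH):
    """Total energy in Wh for n_queries at a fixed per-query cost."""
    return n_queries * wh_per_query


def ev_meters(wh, wh_per_km=EV_WH_PER_KM):
    """Distance in meters an EV covers on `wh` watt-hours."""
    return wh / wh_per_km * 1000.0


def seconds_at_power(wh, watts):
    """Seconds a load of `watts` runs on `wh` watt-hours (1 Wh = 3600 J)."""
    return wh * 3600.0 / watts


if __name__ == "__main__":
    # One query, then a hypothetical swarm of 1,000 queries.
    for n in (1, 1000):
        wh = total_wh(n)
        print(f"{n} queries = {wh:.0f} Wh")
        print(f"  EV driving: {ev_meters(wh):.1f} m")
        print(f"  air conditioning: {seconds_at_power(wh, AC_WATTS):.1f} s")
        print(f"  stationary bike: {seconds_at_power(wh, CYCLIST_WATTS):.0f} s")
```

With these assumed figures the single-query numbers land inside the bounds quoted above, and everything scales linearly with the query count, so the looped-swarm case is just a multiplier on the same math.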