"Run the numbers" means "run the numbers for 2 hours per day of agentic coding on a frontier model", not "run the numbers for a single query". The former is the worst-case scenario.
Google Search's "AI", which is what you're hinting at, is such a good example. Let's say there are 10 billion Google searches per day. That's 10 billion completions on what is going to be a very tiny, ultra-finetuned model with lots of caching (including outputs).
Check out how many queries an hour of agentic coding results in, and how many input/completion tokens. Estimate the energy usage of Opus vs. something like Gemma 4 E2B. Then calculate how many developers using Opus for coding 1 hour a day would equate to those 10 billion search-originated LLM calls.
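If it helps, here is a rough sketch of that calculation in Python. The only figures taken from this thread are the 10 billion searches per day and the 1 hour of coding per day; every other input (tokens per call, joules per token, cache hit rate, calls per coding hour) is a made-up placeholder you would need to replace with your own estimates. The `Workload` class and `equivalent_developers` function are names I invented for illustration, and treating cached calls as free is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """One side of the comparison. Every field is an input you must supply."""
    calls_per_day: float         # LLM calls per day for this workload
    tokens_per_call: float       # input + completion tokens per call
    joules_per_token: float      # assumed energy per token for the model used
    cache_hit_rate: float = 0.0  # fraction of calls served from cache (assumed free)

    def joules_per_day(self) -> float:
        # Only calls that miss the cache are counted as costing energy.
        uncached_calls = self.calls_per_day * (1.0 - self.cache_hit_rate)
        return uncached_calls * self.tokens_per_call * self.joules_per_token


def equivalent_developers(search_total: Workload, one_developer: Workload) -> float:
    """Number of developers whose daily energy use matches all search calls combined."""
    return search_total.joules_per_day() / one_developer.joules_per_day()


# PLACEHOLDER inputs, not measurements. Replace all of them before drawing conclusions.
search = Workload(
    calls_per_day=10e9,      # from the thread: 10 billion searches per day
    tokens_per_call=500,     # placeholder
    joules_per_token=0.01,   # placeholder for a tiny, heavily tuned model
    cache_hit_rate=0.5,      # placeholder
)
developer = Workload(
    calls_per_day=200,       # placeholder: calls in 1 hour of agentic coding
    tokens_per_call=50_000,  # placeholder: agentic calls carry large contexts
    joules_per_token=1.0,    # placeholder for a frontier model
)

print(f"{equivalent_developers(search, developer):,.0f} developers")
```

The point of structuring it this way is that the answer is just a ratio of three multiplied factors on each side, so it is easy to see which estimate moves the result the most once you plug in numbers you trust.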
You could not have provided a better example to show that, without running the numbers, you'll end up with assumptions that contradict reality.