Thanks! Super helpful.

I do use it the same way as you're describing on personal projects at home, in a very crude manner (pasting code snippets in llama server web UI prompt. Next will attempt OpenCode)

At work I use it in similar manner with more mature tools, but the vast majority of token spend comes from a totally different workflow: "pretend the AI is a fleet of junior/intern engineer you're delegating work to", where the agent will on its own do the implementation, commit the changes etc.

It does indeed spend a lot of tokens wandering the codebase, talking to MCPs, loading skills etc.