One thing I rarely see mentioned is that often creating code by hand is simply faster (at least for me) than using AI. Creating a plan for AI, waiting for execution, verifying, prompting again etc. can take more time than just doing it on my own with a plan in my head (and maybe some notes). Creating something from scratch or doing advanced refactoring is almost always faster with AI, but most of my daily tasks are bugs or features that are 10% coding and 90% knowing how to do it.

> 10% coding and 90% knowing how to do it

I think this is the main point where many people’s work differs. For most of my work, I know roughly what needs changing and how things are structured, but I jump between codebases often enough that I can’t always remember the exact classes/functions where changes are needed. I can, however, vaguely gesture at the specific changes that need to be made, have the AI find the places that need changing, and then review the result.

I rarely get the luxury of working in a single codebase for a long enough period of time to get so familiar with it that I can jump to particular functions without much thought. That means AI is usually a better starting point than me fumbling around trying to find what I think exists but I don’t know where it is.

I've heard people say that these coding agents are just tools and don't replace the thinking. That's fine but the problem for me is that the act of coding is when I do my thinking!

I'm thinking about how to solve the problem and how to express it in the programming language such that it is easy to maintain. Getting someone/something else to do that doesn't help me.

But different strokes for different folks, I suppose.

Yes, it's often faster if you sit around waiting. What I will do instead is prompt the AI to create various plans, do other stuff while they do, review and approve the plans, do other stuff while multiple plans are being implemented, and then review and revise the output.

And I have the AI deal with "knowing how to do it" as well. Often it's slower to have it do enough research to know how to do it, but my time is more expensive than Claude's time, and so as long as I'm not sitting around waiting it's a net win.

I do this too, but then you need some method to handle it, because now you have to read and test and verify multiple work streams. It can become overwhelming. In the past week I had the following problems from parallel agents:

Gemini running a benchmark: everything ran smoothly for an hour. But on verification it had hallucinated the model used for judging, invalidating the whole run.

Another task used Opus and I manually specified the model to use. It still used the wrong model.

This type of hallucination has happened to me at least 4-5 times in the past fortnight using opus 4.6 and gemini-3.1-pro. GLM-5 does not seem to hallucinate so much.

So if you are not actively monitoring your agent and making the corrections, you need something else that is.
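
A minimal sketch of what that "something else" could look like for the wrong-model failures described above: compare the model that was requested with the model the run actually reports, and flag any mismatch before trusting the results. The `RunRecord` shape, its field names, and the placeholder model names are all assumptions for illustration; where the reported model comes from (logs, API response metadata) depends on your setup.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class RunRecord:
    """Hypothetical metadata collected from one agent run."""
    task: str
    requested_model: str  # what the human specified
    reported_model: str   # what the run's own logs/metadata say was used


def model_matches(record: RunRecord) -> bool:
    """True if the run used the requested model (ignoring case and padding)."""
    requested = record.requested_model.strip().lower()
    reported = record.reported_model.strip().lower()
    return requested == reported


def mismatched_runs(records: Iterable[RunRecord]) -> List[RunRecord]:
    """Return the runs whose reported model differs from the requested one."""
    return [r for r in records if not model_matches(r)]


if __name__ == "__main__":
    # Placeholder model names, not real identifiers.
    runs = [
        RunRecord("benchmark-judging", "judge-model-a", "judge-model-b"),
        RunRecord("refactor", "coder-model-x", "coder-model-x"),
    ]
    for bad in mismatched_runs(runs):
        print(f"INVALID {bad.task}: asked for {bad.requested_model}, "
              f"got {bad.reported_model}")
```

The point of the design is that the check reads what the run recorded, not what the agent says it did, so a hallucinated claim cannot pass it.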

You need a harness, yes, and you need quality gates the agent can't mess with, and that just kicks the work back with a stern message to fix the problems. Otherwise you're wasting your time reviewing incomplete work.
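
As a rough sketch of such a gate, assuming the checks are ordinary commands run from outside the agent's workspace: each gate runs a command, and any failure produces a rejection message that can be fed straight back to the agent. The names `command_gate`, `run_gates`, and `Verdict` are made up for this example; only `subprocess.run` is a real library call.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# A gate takes a working directory and returns (passed, detail).
Gate = Callable[[str], Tuple[bool, str]]


@dataclass
class Verdict:
    accepted: bool
    feedback: str  # message to send back to the agent or on to the reviewer


def command_gate(cmd: List[str]) -> Gate:
    """Build a gate that passes only if `cmd` exits with status 0."""
    def gate(workdir: str) -> Tuple[bool, str]:
        proc = subprocess.run(cmd, cwd=workdir, capture_output=True, text=True)
        detail = (proc.stdout + proc.stderr).strip()
        return proc.returncode == 0, detail or f"exit code {proc.returncode}"
    return gate


def run_gates(workdir: str, gates: Dict[str, Gate]) -> Verdict:
    """Run every gate; reject with a combined message if any of them fails."""
    failures = []
    for name, gate in gates.items():
        passed, detail = gate(workdir)
        if not passed:
            failures.append(f"- {name}: {detail}")
    if failures:
        return Verdict(False, "Rejected. Fix these and resubmit:\n" + "\n".join(failures))
    return Verdict(True, "All gates passed; ready for human review.")


if __name__ == "__main__":
    # Stand-in commands; real gates would be your test runner, linter, etc.
    gates = {
        "always-ok": command_gate([sys.executable, "-c", "print('ok')"]),
        "always-fails": command_gate(
            [sys.executable, "-c", "import sys; print('boom'); sys.exit(1)"]
        ),
    }
    print(run_gates(".", gates).feedback)
```

For the agent to be unable to mess with the gates, the gate definitions and the commands they call would need to live somewhere the agent cannot write to; this sketch does not enforce that by itself.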

Here is an example where the prompt was only a few hundred tokens and the output reasoning chain was correct, but the actual function call was wrong https://x.com/xundecidability/status/2005647216741105962?s=2...

Glancing at what it's doing is part of your multitasking rounds.

Also, instead of just prompting, it helps to have the AI write a quick plan of exactly what it will do before I hit go, including class names, branch names, file locations, specific tests, etc. The code outline is smaller and quicker to correct than the code itself.

That takes more wall-clock time per agent, but gets better results, so fewer redo steps.
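
One way to make that kind of plan checkable, sketched under the assumption that the agent can emit its plan as structured data: a small record with the fields mentioned above, plus a check that refuses to proceed while any of them is empty. The `Plan` type, its field names, and the example values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Plan:
    """Hypothetical structured plan an agent fills in before writing code."""
    summary: str
    branch: str
    files: List[str] = field(default_factory=list)
    classes: List[str] = field(default_factory=list)
    tests: List[str] = field(default_factory=list)


def missing_details(plan: Plan) -> List[str]:
    """Return the names of plan fields that are still empty."""
    missing = []
    if not plan.summary.strip():
        missing.append("summary")
    if not plan.branch.strip():
        missing.append("branch")
    for name in ("files", "classes", "tests"):
        if not getattr(plan, name):
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Example values are invented for illustration.
    plan = Plan(
        summary="Retry failed uploads with backoff",
        branch="fix/upload-retry",
        files=["uploader/client.py"],
        classes=["UploadClient"],
    )
    gaps = missing_details(plan)
    if gaps:
        print("Plan incomplete, send back to the agent:", ", ".join(gaps))
    else:
        print("Plan complete, ready for review")
```

Requiring every field is a choice made for this sketch; a plan that touches no classes, for example, would need a looser rule.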

> Here is an example where the prompt was only a few hundred tokens and the output reasoning chain was correct, but the actual function call was wrong https://x.com/xundecidability/status/2005647216741105962?s=2...

I as a human have typos too - and sometimes they're the hardest thing to catch in code review because you know what you meant.

Hopefully there is some sort of lint process to catch my human hallucinations and typos.

This sounds like a recipe for burnout, much like Adderall was making everyone code faster until their brain couldn’t keep up with its own backlog.

> And I have the AI deal with "knowing how to do it" as well. Often it's slower to have it do enough research to know how to do it

This is exactly the sort of future I'm afraid of: one where the people who are ostensibly hired to know how stuff works outsource that understanding to their LLMs. If you don't know how the system works while building it, what are you going to do when it breaks? Continue to throw your LLM at it? At what point do you just outsource your entire brain?

For me it _can_ be faster to code than to instruct, but it takes me significantly less effort to write the prompt than the actual code. A few hours of concentrated coding leave me completely drained of energy, while after a few hours with the agents I still have a lot of mental energy. That's the huge difference for me and I don't want to go back.

That's interesting. While I do get mentally tired after a session of focused coding, I feel like I have accomplished something. Using AI for coding feels similar to spending hours doomscrolling reels. Less engaging, but I'm drained as hell at the end.

I'd argue you still have to stay engaged, if not more so. It's a different type of engagement. Look at you: You're the CTO now.

It's hard to be engaged when you are constantly jumping from one thing/prompt to another, versus when you are actually doing the work.

My way of phrasing this: I need to activate my personal transformers on my inner embedding space to really figure out what it is that I truly want to write.

I delegate to agents what I hate doing, e.g. when creating a SaaS web app, the last thing I want to waste my time on is the landing page with about/pricing/login and the Stripe integration frontend/backend. I'll just tell Claude Code (with Qwen3-Coder-Next-Q8 running locally on RTX Pro 6000) to make all this basic stuff for me so that I can focus on the actual core of the app. It then churns for half an hour and spews out a first version; I spend another half an hour fixing bugs by pointing out errors to Claude Code, and in an hour total it's all done. I can also tell it to avoid all the node.js garbage and do it all in plain HTML/JS/CSS.

The rebuttal to this would be that you can do many such tasks in parallel.

I’m not sure it’s really true in practice yet, but that would certainly be the claim.

But can you mentally "keep hold" (for lack of a better term) of those tasks that are getting executed in parallel? Honestly asking.

Because, after they're done/have finished executing, I guess you still have to "check" their output, integrate their results into the bigger project they're (supposedly) part of etc, and for me the context-switching required to do all that is mentally taxing. But maybe this only happens because my brain is not young enough, that's why I'm asking.

The type of dev who is allowing AI to do all of their work does not care about the quality of said work.

I think the difference is that you're applying a standard of correctness or personal understanding of the code you're pushing that is being relaxed in the "agentic workflows".

I have the AI integrate their results themselves. That's if anything one of the things they do best. I also have them do reviews and test their own work first before I check it, and that usually makes the remaining verification fairly quick and painless.

That’s why we won’t plan or compile anymore; it’ll just execute: https://jperla.com/blog/claude-electron-not-claudevm