1) Once you get it to output something you like, do you check all the lines it changed? Is there a threshold after which you just... hope?

2) No matter the learning curve, you're using a statistical tool that produces probabilistic output. If that's fine for your workflow/company, go for it. It's just not something a lot of developers are okay with.

Of course it's a spectrum, with the AI deniers at one end and the vibe coders at the other. I personally won't rely 100% on a tool and let my own critical thinking atrophy, which does seem to be happening, judging by recent studies posted here.

I've been doing AI-assisted coding for several months now and have found a balance that works for me. I'm working in TypeScript and React, neither of which I know particularly well (although I know ES6 very well). In most cases, AI is excellent at tasks which involve writing quasi-custom boilerplate (e.g. tests which require a lot of mocking), and at answering questions about how I should do _X_ in TS/React. For the latter, those are undoubtedly questions whose answers I could eventually find on Stack Overflow and deduce how to apply to my specific context -- but it's orders of magnitude faster to have the AI do that for me.
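
To make "quasi-custom boilerplate" concrete, here's a rough sketch of the kind of mocking-heavy test I mean, assuming a Jest-style setup; the module and function names are invented for illustration. None of it is hard, it's just tedious plumbing that the AI churns out reliably.

```typescript
// Hypothetical module layout: getUserProfile fetches a profile through apiClient.
// Assumes a Jest-style runner; none of these names come from a real project.
import { getUserProfile } from "./userService";
import { apiClient } from "./apiClient";

jest.mock("./apiClient", () => ({
  apiClient: { get: jest.fn() },
}));

const mockedGet = apiClient.get as jest.Mock;

describe("getUserProfile", () => {
  beforeEach(() => mockedGet.mockReset());

  it("returns the profile when the API responds", async () => {
    mockedGet.mockResolvedValue({ id: "42", name: "Ada" });
    await expect(getUserProfile("42")).resolves.toEqual({ id: "42", name: "Ada" });
    expect(mockedGet).toHaveBeenCalledWith("/users/42");
  });

  it("maps API failures to null instead of throwing", async () => {
    mockedGet.mockRejectedValue(new Error("boom"));
    await expect(getUserProfile("42")).resolves.toBeNull();
  });
});
```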

Where the AI fails is in doing anything which requires having a model of the world. I'm writing a simulator which involves agents moving through an environment. A small change in agent behaviour may take many steps of the simulator to produce consequential effects, and thinking through how that happens -- or the reverse: reasoning about the possible upstream causes of some emergent macroscopic behaviour -- requires a mental model of the simulation process, and AI absolutely does _not_ have that. It doesn't know that it doesn't have that, and will therefore hallucinate wildly as it grasps at an answer. Sometimes those hallucinations will even hit the mark. But on the whole, if a mental model is required to arrive at the answer, AI wastes more time than it saves.
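
As a toy illustration of the kind of delayed cause-and-effect I'm talking about (everything here is made up, not my actual simulator): a tiny parameter tweak that is invisible for hundreds of steps and then wipes out the population.

```typescript
// Toy simulation (all numbers invented): agents walk and pay an energy cost per step.
type Agent = { x: number; energy: number };

function step(agents: Agent[], moveCost: number): Agent[] {
  return agents
    .map(a => ({ x: a.x + 1, energy: a.energy - moveCost })) // every agent moves and pays the cost
    .filter(a => a.energy > 0);                              // exhausted agents drop out
}

// Bumping moveCost from 0.09 to 0.11 changes nothing you can see in the first few
// hundred steps, yet by step 500 the whole population has collapsed. Predicting that
// requires a model of the process, not pattern-matching on the diff.
let agents: Agent[] = Array.from({ length: 100 }, () => ({ x: 0, energy: 50 }));
for (let t = 0; t < 500; t++) agents = step(agents, 0.11);
console.log(agents.length); // 0 with moveCost 0.11, 100 with 0.09
```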

> AI is excellent at tasks which involve writing quasi-custom boilerplate (e.g. tests which require a lot of mocking)

I wonder if anyone has compared how well the AI auto-generation approach works against metaprogramming approaches (like Lisp macros) meant to address the same kinds of issues with repetitive code.
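
For comparison, a rough sketch of the metaprogramming route, in TypeScript rather than Lisp and assuming a Jest-style runner (all names invented): write the repetitive tests once as a parameterized helper instead of generating N near-identical copies.

```typescript
// Sketch of the "abstract it once" alternative to generated boilerplate.
// Repo and the in-memory factory are invented for the example.
interface Repo<T> {
  create(item: T): Promise<T & { id: string }>;
  findById(id: string): Promise<(T & { id: string }) | null>;
}

function makeInMemoryRepo<T>(): Repo<T> {
  const store = new Map<string, T & { id: string }>();
  let nextId = 0;
  return {
    async create(item) {
      const saved = { ...item, id: String(nextId++) };
      store.set(saved.id, saved);
      return saved;
    },
    async findById(id) {
      return store.get(id) ?? null;
    },
  };
}

// One parameterized helper instead of N near-identical generated test files.
function describeRepoContract<T>(name: string, makeRepo: () => Repo<T>, sample: T) {
  describe(`${name} repository contract`, () => {
    it("round-trips a created item", async () => {
      const repo = makeRepo();
      const created = await repo.create(sample);
      await expect(repo.findById(created.id)).resolves.toEqual(created);
    });

    it("returns null for an unknown id", async () => {
      await expect(makeRepo().findById("missing")).resolves.toBeNull();
    });
  });
}

// Each new resource is one line; a fix to the contract lands everywhere at once.
describeRepoContract("user", () => makeInMemoryRepo<{ name: string }>(), { name: "Ada" });
describeRepoContract("order", () => makeInMemoryRepo<{ total: number }>(), { total: 12.5 });
```

The trade-off is the one hinted at below: the abstraction is reviewed once and maintained in one place, whereas generated boilerplate is cheap to produce but every copy has to be read and lived with.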

Generating volumes of boilerplate takes effort; nobody likes to do it.

The problem is, that phase is not the full life cycle of the boilerplate.

You have to live with it afterward.

> 1) Once you get it to output something you like, do you check all the lines it changed? Is there a threshold after which you just... hope?

Not OP, but yes. It sometimes takes a lot of time, but I read everything. It's still faster than starting from nothing. Also, I ask the AI for very precise changes, so it doesn't generate huge diffs anyway.

Also, for new code, TDD works wonders with AI: let it write the unit tests (you still have to be mindful of what you want to implement), then ask it to implement the code that makes those tests pass. Since you mention probabilistic output: the tool is incredibly good at iterating on things (running and checking tests), and unit tests are, in themselves, a pretty perfect prompt.
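
As a made-up example of "unit tests as the prompt", assuming a Jest-style runner: the hand-written spec below pins down exactly what a hypothetical parseDuration helper should do, and the implementation underneath is the sort of thing I'd let the tool iterate on until the suite is green.

```typescript
// Invented example. The spec I write (or dictate precisely) by hand...
describe("parseDuration", () => {
  it("parses minutes and seconds", () => {
    expect(parseDuration("2m30s")).toBe(150);
  });

  it("parses bare seconds", () => {
    expect(parseDuration("45s")).toBe(45);
  });

  it("rejects garbage", () => {
    expect(() => parseDuration("soon")).toThrow();
  });
});

// ...and the implementation I let the tool iterate on until the suite is green.
function parseDuration(input: string): number {
  const match = /^(?:(\d+)m)?(?:(\d+)s)?$/.exec(input);
  if (!match || (!match[1] && !match[2])) {
    throw new Error(`invalid duration: ${input}`);
  }
  return Number(match[1] || 0) * 60 + Number(match[2] || 0);
}
```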

> It sometimes takes a lot of time, but I read everything. It's still faster than starting from nothing.

Opposite experience for me. It reliably fails at more involved tasks, to the point that I don't even try anymore. Smaller tasks of around a hundred lines take me longer to review than it would take to just do them myself, even though they're mundane and boring.

The only time I've found it useful is when I'm unfamiliar with a language or framework, where I'd otherwise have to spend a lot of time looking up how to do things, understanding class structures, etc. Then I just ask the AI and still have to slowly step through everything anyway, but at least it surfaces the classes and methods relevant to my goal, and I get to learn along the way.

How do you have it write tests before the code? It seems writing a prompt for the LLM to generate the tests would take the same time as writing the tests themselves.

Unless you're thinking of repetitive code, I can't imagine the process (I'm not arguing, I'm just curious what your flow looks like).

> Is there a threshold after which you just... hope?

Generally, all the code I write is reviewed by humans, so commits need to be small and easily reviewable. I can't submit something I don't understand myself; I'd risk pissing off my colleagues, or having it never get reviewed at all.

Now, if it were a personal project or something low-stakes, I would probably be more lenient. And if you use a statically typed language, the type system plus unit tests can catch a lot of issues, so it may be okay to have local blocks you don't inspect in detail.
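
A rough illustration of what I mean by the type system catching issues (names invented, TypeScript assumed): with a discriminated union and an exhaustiveness check, a generated block that forgets a case or reads a field from the wrong state simply won't compile, so the remaining review effort goes into the logic and the tests.

```typescript
// Invented example: the compiler rejects a whole class of generated mistakes up front.
type FetchState<T> =
  | { status: "loading" }
  | { status: "error"; message: string }
  | { status: "ready"; data: T };

function describeState<T>(state: FetchState<T>): string {
  switch (state.status) {
    case "loading":
      return "Loading...";
    case "error":
      return `Failed: ${state.message}`; // `message` only exists on the error branch
    case "ready":
      return `Loaded ${JSON.stringify(state.data)}`;
    default: {
      // Exhaustiveness check: adding a new status without handling it breaks the build.
      const unreachable: never = state;
      return unreachable;
    }
  }
}
```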

Yeah, for me: I use AI with Rust and a suite of 1000 tests in my codebase. I mostly use the Copilot VS Code plugin, which as far as I can tell weights heavily toward the local code around it, so often it's just writing code based on my other code. I've also found AI to be a good macro debugger, since macro debugging tools are severely lacking in most ecosystems.

But when I see people using these AI tools to write JavaScript or Python code wholesale from scratch, that's a huge question mark for me. Because how?? How are you sure this thing works? How are you sure it won't break when you update it? Indeed, the answer seems to be: "We don't know why it works, we can't tell you under which conditions it will break, we can't give you any performance guarantees because we didn't test or design for those, and we can't give you any security guarantees because we don't know what security is or why it's important."

People forgot we're out here trying to do software engineering, not software generation. Eternal September is upon us.

1) Yes, I review every line it changed.

2) I find the tool analogy helpful, but it has limits. Yes, it's a stochastic tool, but in that sense it's more like another mind than a tool. And this mind is neither junior nor senior, but rather a savant.