Man... I spent the last 6 months writing code using voice chat with multiple concurrent Claude code agents using an orchestration system because I felt like that was the new required skill set.

In the past few weeks I've started opening neovim again and just writing code. It's still 50/50 with a Claude code instance, but fuck I don't feel a big productivity difference.

I just write my own code and then ask AI to find any issues and correct them if I feel it is good advice. What AI is amazing at is writing most of my test cases. Saves me a lot of time.

I've seen tests doing:

    a = 1
    assert a == 1
    # many lines here where a is never used
    assert a == 1

Yes AI test cases are awesome until you read what it's doing.

To be fair, many human tests I've read do similar.

Especially when folks are trying to push %-based test metrics in a typed language (and thus the tests end up asserting types that can't really be wrong in the first place).

I use AI to write tests. Many of the e2e ones fell into the pointless niche, but I was able to scope my API tests well enough to get a very high hit rate.

The value of said API tests isn't unlimited. If I had to hand-roll them, I'm not sure I would have written as many, but they cover a multitude of 400, 401, 402, 403, and 404 cases, and the tests themselves have absolutely caught issues such as a validator not mounting correctly, or the wrong error status code being returned due to check ordering.
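To make the "wrong status code due to check ordering" point concrete, here's a minimal sketch. The handler, routes, and token values are all invented for illustration; the idea is just that table-driven tests over status codes will catch a reordered check (e.g. a 404 leaking to an unauthenticated caller).

```python
def handle(request):
    """Return an HTTP status code for a dict-shaped request.

    Hypothetical handler: the check ordering below is the behavior
    under test. Swapping the 401 and 404 checks would make several
    cases fail, which is exactly the class of bug described above.
    """
    if "token" not in request:          # unauthenticated -> 401
        return 401
    if request["token"] != "valid":     # authenticated but forbidden -> 403
        return 403
    if request.get("path") not in {"/widgets/1"}:  # unknown resource -> 404
        return 404
    if not isinstance(request.get("count"), int):  # malformed body -> 400
        return 400
    return 200

# Table-driven cases: (request, expected status)
CASES = [
    ({}, 401),                                                       # no auth
    ({"token": "bad", "path": "/widgets/1"}, 403),                   # wrong auth
    ({"token": "valid", "path": "/nope"}, 404),                      # missing resource
    ({"token": "valid", "path": "/widgets/1", "count": "x"}, 400),   # bad payload
    ({"token": "valid", "path": "/widgets/1", "count": 3}, 200),     # happy path
]

for req, expected in CASES:
    assert handle(req) == expected, (req, expected)
```

Tests like these are cheap to generate in bulk, which is why delegating them works: the case table is tedious to write by hand but trivial to check by eye.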

It's good at writing/updating tedious test cases and fixtures when you're directing it more closely. But yes, it's not as great at coming up with what to test in the first place.

I write assert(a==1) right before the line where a is assumed to be 1 (to skip a division by a) even if I know it's 1. Especially if I know it's 1!

Yep. Especially for tests with mock data covering all sorts of extreme edge cases.

Don't use AI for that, it doesn't know what your real data looks like.

lol. okay. neither do you.

The majority of data in typical message-passing plumbing code is a combination of opaque IDs, nominal strings, a few enums, and floats. It's mostly OK for these cases, I have found. Especially in typed languages.
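A sketch of what that kind of payload looks like; the type, field names, and enum values are all invented. The point is that when a fixture is just opaque IDs, nominal strings, an enum, and a float, the AI doesn't need to know your real data, because nothing about the values is semantically load-bearing.

```python
from dataclasses import dataclass
from enum import Enum

class Priority(Enum):
    LOW = "low"
    HIGH = "high"

@dataclass(frozen=True)
class Message:
    msg_id: str        # opaque ID: only equality matters, not the content
    topic: str         # nominal string: any label will do
    priority: Priority # small closed set, easy to enumerate in tests
    weight: float

# A generated fixture for this shape is about as good as a hand-written one.
fixture = Message(msg_id="m-123", topic="billing",
                  priority=Priority.HIGH, weight=0.75)
assert fixture.priority is Priority.HIGH
```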

There has always been a difference between modality and substance.

This is the same thing as picking a new smart programming language or package, or insisting that Dvorak layout is the only real way forward.

Personally I try to put as much distance as I can between myself and the modality discussion, and get intimate with the substance.

> voice chat ... required skill set

But we're still required to go to the office, and talking to a computer in an open-plan space is highly unwelcome.

Right. If AI actually made you more productive, there would be more good software around, and we wouldn't have the METR study showing it makes you 20% slower.

AI delivers the feeling of productivity and the ability to make endless PoCs. For some tasks it's actually good, of course, but writing high quality software by itself isn't one.

Ah, yes. LLM-assisted development. That thing that is not at all changing, that thing that different people aren’t doing differently, and that thing that some people aren’t definitely way better at than others. I swear that some supposedly “smart” people on this website throw their ability to think critically out the window when they want to weigh in on the AI culture war. B-but the study!

I can say with certainty that: 1. LLM-assisted development has gotten significantly, materially better in the past 12 months.

2. I would be incredibly skeptical of any study that's been designed, executed, analysed, written about, published, and talked about here within that period of time.

This is the equivalent of a news headline starting with "science says…".

Nobody is interested in your piece of anecdata, and asserting that something has gotten better without doing any studies on it is the exact opposite of critical thinking.

You are displaying the exact same thing that you were complaining about.

Really? The past two weeks I've been writing code with AI and feel a massive productivity difference. I ended up with 22k loc, which is probably around as many as I'd have written manually for the featureset at hand, except it would have taken me months.

My work involves fixing/adding stuff in legacy systems. Most of the solutions AI comes up with are horrible. I've reverted back to putting problems on my whiteboard and just letting it percolate. I still let AI write most of the code once I know what I want. But I've stopped delegating any decision making to it.

Ah, yeah, I can see that. It's not as good with legacy systems, I've found.

Well at least for what I do, success depends on having lots of unit tests to lean on, regardless of whether it is new or existing code. AI plus a hallucination-free feedback loop has been a huge productivity boost for me, personally. Plus it’s an incentive to make lots of good tests (which AI is also good at)