I'd describe LLM usage as the winch and LLM avoidance as insisting on pushing the load uphill without one.

Simon, I have mad respect for your work, but I think your view on this might be skewed because your day-to-day work involves a codebase where a single developer can still hold the whole context in their head. I would argue that the inadequacies of LLMs become more evident the more you have to make changes to systems that evolve at the speed of 15+ concurrent developers.

One of the things I'm using LLMs for a lot right now is quickly generating answers about larger codebases I'm completely unfamiliar with.

Anything up to 250,000 tokens I pipe into GPT-5 (previously o3); beyond that I send it to Gemini 2.5 Pro.

For codebases larger than that, I'll fire up Codex CLI or Claude Code and let them grep their way to an answer.

This stuff has gotten good enough now that I no longer get stuck when new tools lack decent documentation - I'll pipe in just the source code (filtered for .go or .rs or .c files or whatever) and generate comprehensive documentation for myself from scratch.
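
The gathering step is trivial to script. Here's a minimal Python sketch; the extension list, the ~4-characters-per-token estimate, and the 250k routing cutoff are my own rough heuristics, not anything the models require:

    # Sketch of the "pipe a codebase into a prompt" step. Standard
    # library only; the token estimate is crude, not a real tokenizer.
    import os
    import sys

    EXTENSIONS = {".go", ".rs", ".c"}  # adjust per codebase

    def gather_source(root: str) -> str:
        """Concatenate matching source files, each prefixed with its path."""
        chunks = []
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in sorted(filenames):
                if os.path.splitext(name)[1] in EXTENSIONS:
                    path = os.path.join(dirpath, name)
                    with open(path, encoding="utf-8", errors="replace") as f:
                        chunks.append(f"=== {path} ===\n{f.read()}")
        return "\n\n".join(chunks)

    if __name__ == "__main__":
        prompt = gather_source(sys.argv[1])
        est_tokens = len(prompt) // 4  # rough: ~4 characters per token
        # Under ~250k estimated tokens: one big-model call. Bigger:
        # a larger-context model, or an agentic tool that greps instead.
        print(f"~{est_tokens} tokens", file=sys.stderr)
        sys.stdout.write(prompt)

From there it's just piping stdout into whatever model CLI you use, with a "write me developer docs for this codebase" style prompt.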

Don't you see how this opens up a blind spot in your view of the code?

You don't have the luxury of someone deeply familiar with the code sanity-checking your perceived understanding of it; i.e. you don't see where the LLM is horribly off-track, because you don't have sufficient understanding of that code to spot the error. In enterprise contexts this is very common, though, so it's quite likely that a lot of the haters here have seen PRs submitted by vibecoders to their own work that were inadequate enough that they started blaming the tool.

For example, I have seen someone reimplement a client library's session handling because they were unaware that the existing client came batteries-included with sessions, and the LLM didn't hesitate to write the code again for them. The code worked, everything checked out, but because the developer didn't know what they didn't know, they submitted a janky mess.

The LLMs go off track all the time. I spot that when I try putting what I've learned from them into action.

This just sounds 1:1 equivalent to "there are things LLMs are good for and things LLMs are bad for."

I'll bite.

What are those things that they are good for? And consistently so?

As someone who leans more towards LLM-scepticism, I find Sonnet 4 quite useful for generating tests, provided I describe in enough detail how I want the tests structured and which cases should be covered. There's a lot of boilerplate code in tests, and IMO because of that many developers make the mistake of DRYing out their test code so much that you can barely tell what is being tested anymore. With LLM test generation, I feel that trade-off is no longer necessary.
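
To make that concrete, this is the style of deliberately repetitive test I'm happy to let the model churn out; the Cart class here is invented purely for illustration:

    # Hypothetical Cart API, invented only to illustrate the style.
    # Each test repeats its own setup instead of hiding it in fixtures,
    # so every case reads top to bottom without jumping elsewhere.
    from dataclasses import dataclass, field

    @dataclass
    class Cart:
        items: dict = field(default_factory=dict)

        def add(self, sku: str, qty: int) -> None:
            self.items[sku] = self.items.get(sku, 0) + qty

        def total_quantity(self) -> int:
            return sum(self.items.values())

    def test_add_single_item():
        cart = Cart()
        cart.add("apple", 2)
        assert cart.total_quantity() == 2

    def test_add_same_item_twice_accumulates():
        cart = Cart()
        cart.add("apple", 2)
        cart.add("apple", 3)
        assert cart.total_quantity() == 5

Duplicated setup like this is exactly the boilerplate an LLM generates for free, so there's no longer pressure to DRY it into unreadability.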

Aren't tests supposed to be premises (ensure the initial state is correct), compute (run the code), and assertions (verify the resulting state and output)? If your test code is complex, most of it should be moved into harness and helper functions. Writing more complex test code isn't particularly useful.
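
Roughly this shape, as a sketch (the Account domain is invented):

    # Premises / compute / assertions, with the fiddly setup pushed
    # into a helper so each test body stays three obvious steps.
    from dataclasses import dataclass

    @dataclass
    class Account:
        balance: int = 0

        def deposit(self, amount: int) -> None:
            if amount <= 0:
                raise ValueError("amount must be positive")
            self.balance += amount

    def make_account(balance: int) -> Account:
        """Helper: build an account in a known initial state (premises)."""
        return Account(balance=balance)

    def test_deposit_increases_balance():
        account = make_account(100)    # premises
        account.deposit(25)            # compute
        assert account.balance == 125  # assertions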

I didn't say complex, I said long.

If you have complex objects and you're doing complex operations on them, then setup code can get rather long.
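
For illustration (all names invented): once the object under test needs several required parts, even a well-factored test spends most of its lines on premises.

    # The premises section alone can dominate the test once the object
    # under test needs several collaborators; long, but linear to read.
    from dataclasses import dataclass, field

    @dataclass
    class LineItem:
        sku: str
        qty: int
        unit_price: int  # cents

    @dataclass
    class Order:
        customer_id: str
        items: list = field(default_factory=list)

        def total(self) -> int:
            return sum(i.qty * i.unit_price for i in self.items)

    def test_total_sums_all_line_items():
        # Premises: building a realistic order takes most of the test.
        order = Order(customer_id="c-42")
        order.items.append(LineItem("apple", 2, 50))
        order.items.append(LineItem("banana", 6, 25))
        order.items.append(LineItem("cherry", 12, 10))
        # Compute and assertions.
        assert order.total() == 370  # 100 + 150 + 120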