I disagree. The quoted scenario is the absolute best for LLMs.

1) An easily defined go/no-go task with a defined end point, which requires

2) A bunch of programming code that nobody gives a single shit about

3) With esoteric function calls that require staring at obscure documentation

This is the LLM dream task.

When the next person has to stare at this code, they will throw it out and rerun an LLM on it because the code is irrelevant and the end task is the only thing that matters.

Here's the thing. Those first two things don't exist.

I'm revisiting this comment a lot with LLMs. I don't think many HN readers run into real-life mudball/spaghetti code. I think there is an SV bias here where posters think taking a shortcut a few times is what a mudball is.

There will NEVER be a time in this business where the business is OK with simply scrapping these hundreds of inconsistent one-off generations and settling for something that sorta kinda worked like before. The very places that do this won't use consistent generation methods either. The next person to stare at it will not just rerun the LLM, because by then the ball will be so big that not even the LLMs can fix it without breaking something else. Worse, the new person won't even know what they don't know, or what to ask it to regenerate.

Man I'm gonna buy stock in the big three as a stealth long term counter LLM play.

I've seen mudballs outside of SV and they are messes that defy logical imagination. LLMs are only gonna make that worse. It's like giving children access to a functional tool shop: you are not gonna get a working product no matter how good the tools are.

A few cases just recently:

Someone in the company manages a TON of questionnaires. They type the questions into the service and get the results. The results are in a CSV format or some shit. Then they need to manually copy them to Google Sheets and do some adjustments on them.

Took me about 30 minutes of wall clock time, maybe 5 minutes of my time to have an LLM write me a simple python script that uses the API in the questionnaire service to pull down the data and insert it into a new Google Sheet.

Saves the person a TON of time every day.
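The core of that kind of script is tiny. A hedged sketch of the shape it might take — the export endpoint, auth, and the Sheets push are all hypothetical placeholders for whatever the real questionnaire service and Google Sheets client actually look like; only the CSV-to-rows step is concrete:

```python
import csv
import io

def csv_to_rows(csv_text: str) -> list[list[str]]:
    """Parse the service's CSV export into a list of rows, which is the
    shape the Sheets API (or gspread's worksheet.update) expects."""
    return [row for row in csv.reader(io.StringIO(csv_text))]

# Hypothetical glue around it -- real URL, auth, and client will differ:
#   resp = requests.get("https://survey.example/api/export.csv", headers=auth)
#   worksheet.update(csv_to_rows(resp.text))  # gspread-style push
```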

---

Second case was a person who had to do similar manual input to crappy Sheets daily, because that's what the next piece in the process can read.

This person has a bit of an engineer mindset and vibe-coded a web tool themselves that has a UI that lets them easily fill the same information but view it in a more user friendly way. Then it'll export it in a CSV/JSON format for the next step in the process.

None of these would've been granted the day(s) of engineering time before, now both were something that could be thrown together quickly over a coffee break or done by themselves over a weekend.

I’d go further than the other reply: not only do those first two things definitely exist, they probably represent the plurality of programming tasks.

I generated a script today to diff 2 CSVs into a Venn diagram, ran it twice, then deleted the code.

The LLM itself could have done it; maybe you didn't need the code at all.

It's a language model, not a compiler. That's what people get wrong.

Ask one to count the 'r's in "strawberry" and it may or may not get it right.

Ask it to create a program to do it, and it'll get it right instantly, and the program will work.
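The program in question really is this small — which is the point: the model's tokenizer stumbles on counting letters, but the one-liner it can write never will:

```python
def count_letter(word: str, letter: str) -> int:
    # str.count does deterministically what the model does probabilistically.
    return word.count(letter)

print(count_letter("strawberry", "r"))  # prints 3
```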

When we get to a point where "AI" can write a program like that in the background, run it and use its result as a tool, we'll get the next big leap in efficiency.

I think the future of computing is ephemeral code like this, created rapidly on demand, then done when the immediate task is done.

> Here's the thing. Those first two things don't exist.

You are 100% wrong on this. They exist all the time when I'm doing a hardware task.

I need to test a new chip coming off the fab. I need to get the pins in the right place, the test code up and running, the jig positioned correctly, the test vectors for JTAG generated, etc.

This ... is ... a ... pain ... in ... the ... ass.

It changes every single time for every single chip. It changes for every jig and every JTAG and every new test machine. Nobody gives one iota of damn about the code as it will be completely different for the next revision of this chip. Once I validate that the chip actually powers on and does something sane, the folks who handle the real testing will generate real vectors--but not before.
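The throwaway go/no-go check can be as small as reading back the TAP's IDCODE and comparing it to what the fab said this revision should report. A hypothetical sketch — the expected value is illustrative, and masking the top nibble follows IEEE 1149.1, where IDCODE bits 31:28 are a version field that may legitimately differ:

```python
EXPECTED_IDCODE = 0x4BA00477  # illustrative value; every chip revision differs

def power_on_ok(idcode_read: int, expected: int = EXPECTED_IDCODE) -> bool:
    """Go/no-go: does the IDCODE shifted out over JTAG match the expected
    part, ignoring the 4-bit version field in bits 31:28?"""
    mask = 0x0FFFFFFF
    return (idcode_read & mask) == (expected & mask)
```

Once this says "go", the real test team generates real vectors; until then, nobody cares what this code looks like.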

So, generating that go/no-go code is in the way of everything. And nobody cares about what it looks like because it is going to be thrown out immediately afterward.