I've started to realize after poring over pull requests which are, frankly, slop that the devs who are the most bullish on AI are the ones who raise those PRs and don't recognize the slop.
AI for sure is giving all of them existential crises but I'm not sure most of them ever really belonged in the industry in the first place.
I give it 9-12 months before they start to realize that acknowledgement of this existential crisis is, at its core, acknowledgement of a skill issue.
People have built entire careers shipping garbage, now they can ship 10x more garbage and to them that’s progress.
I’ve tried to explain this to folks as “having taste”, but I’m always worried it comes off as subjective and snobby. It might be a fair assessment honestly, it’s hard for me to describe so I wouldn’t hold anyone to it as a standard. Give me an honest vibe check on that.
There’s a lot of codebases out there that are at odds with my own opinions about syntax/structure/purpose, but there’s evidence of “taste” that I absolutely respect. I can look at a couple modules and have a good idea what the other modules are going to be like, because the mental model of the author is clear from the code itself. Even teams with multiple authors with taste average out to one taste-profile, and in a similar way, I’ve seen LLM output shaped by someone with taste and had the same feeling: “yeah I see the direction you’re going in”.
Someone without taste using an LLM writes slop. I can’t tell what you’re doing. Any question about what you’re doing results in “sorry that was Claude”. Entirely pointless that you’re even involved.
It’s a property of the author IMO. They were kind of owed an existential crisis as cruel as that is to say.
Was recognized long before AI came on the scene.
The only problem with Microsoft is they just have no taste. They have absolutely no taste. And I don't mean that in a small way, I mean that in a big way, in the sense that they don't think of original ideas, and they don't bring much culture into their products.
Steve Jobs
HAH! Ok fair, maybe parts of that quote are rattling around somewhere in my mind.
giving a shit == having taste.
giving a shit != perfect.
i recently had a PR which had a comment explaining a change of an import: "// Changed imports to add Foo as it's needed for updated bar()".
apparently the person behind "it" has been a developer for 10 years. couldn't be bothered to remove completely useless "how" comments from a 25 line change (without all the useless comments).
also, i posted on another AI slop thread about taste: https://news.ycombinator.com/item?id=48515463
It was never about the code, all that really matters is does it work.
In that light I’m not happy about it, but the code always was just a means to an end.
I think it's likely that what you call slop is more often than not "good enough".
One thing a lot of developers aim for in their code, beyond "it does what it is meant to", is something along the lines of elegance (that's my word for it, there may be a better one).
With AI generated code there is no time for elegance. It will happily recreate the same function in several different places for no reason. And that really doesn't matter anymore.
Said another way: AI generated code doesn't chase perfection. It just chases good enough.
But it's not good enough. We can see this all over the industry, where even M$ is producing software so bad that even the calculator is an Electron app. Slow, poor quality, and below acceptable for any engineer.
Producing things that aren't good enough has never stopped companies from becoming multibillion dollar entities. I've been waiting for 40 years for good software to take over the market... and I'm still waiting.
>> It will happily recreate the same function in several different places for no reason
So do many developers. I've lost count of how many times a code review had to be rejected or cleaned up because of copy and pasted code, and I'm going to admit, sometimes it's just quicker to duplicate a little code and leave a comment for 'next time'.. we've all done it.
.. like this one time I had a PR where the developer created one loooong linear method, couldn't figure out how to share it between targets, and copied and pasted the same bad code somewhere else. Somehow it got through, and when asked why this was on production the answer was 'it worked'.
>> no time for elegance
This happens, your experience in is generally your quality out. But that doesn't necessarily mean there's going to be elegance. I've worked at major product driven companies where elegance took a back seat to getting a release out the door.
Many managers are more bullish on AI and less able to recognize slop, so they are unlikely to recognize a quality crisis. And they are the people who decide who belongs in the industry and who does not. As a result we will get an escalation of enshittification, and people will start to forget that slop is not the only option.
Yeah, I keep coming back to a point that the way people talk about AI is still entirely disconnected from what it can actually do. I think of the bell curve meme a lot when I see people talking about AI. the people most bullish that it's going to take over are people who have a vested interest, or people who fall on the bottom half of the bell curve. I mean ... come on, by design an AI is literally a statistical averaging of all the data it's seen. AI is extremely average at nearly everything it does. If you find yourself using AI and it's doing something amazing, that speaks more to your knowledge/ability about a subject than it speaks to AI's ability
I mean, I guess if all you do is work on implementing CRUD endpoints ... sure I guess you're cooked. but we had tech to automate this already, this isn't anything new. But oh man, if you're doing real engineering, the tools are barely usable.
I hate when people don't give examples, so I am going to throw one here. just the other day, I asked the newest and most expensive Claude model to write an LRU cache that keeps a running tally of the bytes held in the cache and uses that tally as the threshold for evicting something. It implemented the threshold checks wrong and just tracked how many elements were in the cache. this might sound small, but scale that mistake up to a real production system. this is literally unusable. and the expectation to sit there, have it generate 1000s of lines of code for you, and then spot check for that small but huge error is not worth it. you have to move so slow to spot check everything, to the point that it's literally faster to type it. This is a model that costs $100s to run per hour and is advertised as "PhD level intelligence" making high school AP computer science to freshman computer science errors. like come on.
If you're reading this, are an expert in your field, and are actually worried about your job, you've got to be able to have some mental fortitude and not fall for this ...
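For anyone wondering what the spec above amounts to, here is a rough sketch of a byte-budgeted LRU in TS. This is not the prompt or the model's output from that session, just one way to read the requirement. The names (ByteLruCache, maxBytes, sizeOf) are made up for illustration, and the caller is assumed to supply the function that measures a value's size. The part the comment is about is the eviction loop: it compares the running byte total against the budget, not the number of entries.

```typescript
// Illustrative sketch only: ByteLruCache, maxBytes and sizeOf are hypothetical names.
class ByteLruCache<V> {
  // A Map iterates in insertion order, so its first key is the least recently used.
  private entries = new Map<string, { value: V; bytes: number }>();
  private totalBytes = 0;
  private readonly maxBytes: number;
  private readonly sizeOf: (value: V) => number;

  constructor(maxBytes: number, sizeOf: (value: V) => number) {
    this.maxBytes = maxBytes;
    this.sizeOf = sizeOf;
  }

  get bytesUsed(): number {
    return this.totalBytes;
  }

  get size(): number {
    return this.entries.size;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    // Delete and re-insert to mark the key as most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V): void {
    const bytes = this.sizeOf(value);
    const existing = this.entries.get(key);
    if (existing !== undefined) {
      // Overwriting: give back the old entry's bytes first.
      this.totalBytes -= existing.bytes;
      this.entries.delete(key);
    }
    // A value larger than the whole budget can never fit, so it is not stored.
    if (bytes > this.maxBytes) return;
    this.entries.set(key, { value, bytes });
    this.totalBytes += bytes;
    // Evict on the running byte total, NOT on the entry count.
    while (this.totalBytes > this.maxBytes) {
      const oldestKey = this.entries.keys().next().value as string;
      const oldest = this.entries.get(oldestKey)!;
      this.entries.delete(oldestKey);
      this.totalBytes -= oldest.bytes;
    }
  }
}

// Usage: string length stands in for byte size here (only true for ASCII text).
const cache = new ByteLruCache<string>(10, (s) => s.length);
cache.set("a", "aaaa"); // 4 bytes in use
cache.set("b", "bbbb"); // 8 bytes in use
cache.get("a"); // "a" becomes most recently used
cache.set("c", "cccc"); // 12 would exceed 10, so the oldest entry ("b") is evicted
```

A count-based cache would have kept all three entries in that usage, which is the kind of bug that is easy to miss in a skim.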
after the implementation was done, have you asked the model again, in a fresh context window, to review the code against the specification?
so much to unpack here and almost poetic that you say this
first is that the model will write out that it “thought” and “double checked” its output
Second, this was in a fresh context window of the latest model (that isn’t fable b/c we can’t use for reasons beyond this thread), and it was on its second highest thinking mode. I shouldn’t have to double check something that it claimed to have burned more tokens on to double check
Outside of it costing me more money to fix what it claims to do, the main point of this article is that models are implementing things nearly end to end, and if we scale it up, it will only continue to do that. I intentionally chose the example of something that is < 70 lines to implement in TS (btw, the language with the second largest amount of data available to scrape and train on). I would assume a machine that can almost implement things end to end should be able to implement something that is 70 lines of code and has been documented for nearly 50 years.
My point is that time and time again on the most trivial examples, under the best of conditions, and with unlimited amounts of money, they can’t do what is claimed
Outside of that, these follow-up comments that say, “oh you need to ask it to check its own work and be so involved in the process of it writing the code that you need to spot check it” go against everything the article states
The best analogy I have for this is Newspeak in 1984, it’s just vibes dictating vibes and trying to make people claim that the vibes are right. and if you try to validate the vibes, your vibes are just wrong because you don’t get the vibes. The claims being made have no data backing them. And if there is data, it’s cherry picked. Please use your brain and stop outsourcing your ability to think to a machine that is incorrectly thinking on your behalf
Edit: Typos
Where you see quality crisis I see job security! Honest question, when it comes to enshittification of software quality.. have you ever had to use a Meta framework? How many times have they rewritten their mobile apps to use some architect's bespoke code pipe dream? The quality crisis has always been here, now there's just more of it.
> I give it 9-12 months
That's a wildly optimistic take.
Most of them will never realize it by themselves, and when people react badly to their work they will blame the people complaining, not themselves.