Many managers are more bullish on AI and less able to recognize slop, so they are unlikely to recognize a quality crisis. And they are the people who decide who belongs in the industry and who does not. As a result we will get an escalation of enshittification, and people will start to forget that slop is not the only option.

Yeah, I keep coming back to the point that the way people talk about AI is still entirely disconnected from what it can actually do. I think of the bell curve meme a lot when I see people talking about AI. The people most bullish about the idea that it's going to take over are people who have vested interests, or people who fall on the bottom half of the bell curve. I mean ... come on, by design an AI is literally a statistical averaging of all the data it's seen. AI is extremely average at nearly everything it does. If you find yourself using AI and it's doing something amazing, that says more about your own knowledge/ability in the subject than it says about the AI's ability.

I mean, I guess if all you do is implement CRUD endpoints ... sure, I guess you're cooked. But we already had tech to automate that, so this isn't anything new. But oh man, if you're doing real engineering, the tools are barely usable.

I hate when people don't give examples, so I am going to throw one in here. Just the other day, I asked the newest and most expensive Claude model to write an LRU cache that keeps a running tally of the bytes stored in the cache and uses that as the threshold for evicting entries. It implemented the threshold checks wrong and just tracked how many elements were in the cache. This might sound small, but scale that mistake up to a real production system. This is literally unusable. And the expectation to sit there, have it generate 1000s of lines of code for you, and then spot check for that small but huge error is not worth it. You have to move so slowly to spot check everything that it's literally faster to type it yourself. This is a model that costs $100s per hour to run and is advertised as "PhD-level intelligence", yet it makes errors at the level of high school AP computer science or freshman computer science. Like, come on.
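For reference, here is a minimal sketch of the kind of cache described above, where eviction is driven by total bytes rather than by entry count. This is not the code from the session in question. The class name `ByteLruCache`, its method names, and the choice of `Uint8Array` values (so that size is just `byteLength`) are all assumptions made for illustration.

```typescript
// Hypothetical sketch: an LRU cache bounded by total bytes, not entry count.
// Assumption: values are Uint8Array, so an entry's size is value.byteLength.
class ByteLruCache {
  // A Map iterates in insertion order, so its first key is the least recently used.
  private readonly entries = new Map<string, Uint8Array>();
  private readonly maxBytes: number;
  private usedBytes = 0;

  constructor(maxBytes: number) {
    this.maxBytes = maxBytes;
  }

  // Running tally of bytes currently held.
  get bytes(): number {
    return this.usedBytes;
  }

  // Number of entries currently held (not used for eviction decisions).
  get count(): number {
    return this.entries.size;
  }

  has(key: string): boolean {
    return this.entries.has(key);
  }

  get(key: string): Uint8Array | undefined {
    const value = this.entries.get(key);
    if (value === undefined) return undefined;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  // Returns false if the value alone exceeds the byte budget and was not stored.
  set(key: string, value: Uint8Array): boolean {
    // Remove any previous entry for this key so the tally stays accurate.
    this.delete(key);
    if (value.byteLength > this.maxBytes) return false;

    this.entries.set(key, value);
    this.usedBytes += value.byteLength;

    // Evict least recently used entries until the byte total fits the budget.
    while (this.usedBytes > this.maxBytes) {
      const oldest = this.entries.keys().next();
      if (oldest.done) break;
      this.delete(oldest.value);
    }
    return true;
  }

  delete(key: string): boolean {
    const value = this.entries.get(key);
    if (value === undefined) return false;
    this.entries.delete(key);
    this.usedBytes -= value.byteLength;
    return true;
  }
}
```

The check that matters is `usedBytes > maxBytes` in `set`. Comparing `entries.size` against a limit instead would be the count-based behavior described above: one large value could blow far past the intended memory budget without triggering a single eviction.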

If you're reading this, are an expert in your field, and are actually worried about your job - you've got to have some mental fortitude and not fall for this ...

After the implementation was done, did you ask the model again, in a fresh context window, to review the code against the specification?

So much to unpack here, and it's almost poetic that you say this.

First, the model will write out that it “thought” and “double checked” its output.

Second, this was in a fresh context window of the latest model (that isn’t fable, b/c we can’t use it for reasons beyond this thread), and it was on its second-highest thinking mode. I shouldn’t have to double check something that it claimed to have burned more tokens on to double check.

Outside of it costing me more money to fix what it claims to do, the main point of this article is that models are implementing things nearly end to end, and that if we scale them up, they will only continue to do that. I intentionally chose an example that takes < 70 lines to implement in TS (btw, the language with the second-largest amount of data available to scrape and train on). I would assume a machine that can almost implement things end to end should be able to implement something that is 70 lines of code and has been documented for nearly 50 years.

My point is that time and time again, on the most trivial examples, under the best of conditions, and with unlimited amounts of money, these models can’t do what is claimed of them.

Outside of that, follow-up comments that say, “oh you need to ask it to check its own work and be so involved in the process of it writing the code that you need to spot check it” go against everything the article states.

The best analogy I have for this is Newspeak in 1984. It’s just vibes dictating vibes and trying to make people claim that the vibes are right. And if you try to validate the vibes, your vibes are just wrong because you don’t get the vibes. The claims being made have no data backing them. And if there is data, it’s cherry picked. Please use your brain and stop outsourcing your ability to think to a machine that is incorrectly thinking on your behalf.

Edit: Typos

Where you see a quality crisis, I see job security! Honest question, when it comes to enshittification of software quality... have you ever had to use a Meta framework? How many times have they rewritten their mobile apps to use some architect's bespoke code pipe dream? The quality crisis has always been here, now there's just more of it.