People roll out a complex and powerful technology without fully understanding it, without knowing what evals are, and without updating their processes to account for the tech, and the rollout fails. News at 11.

Seriously though, "AI fucks up" is a known thing (as is "humans fuck up"!), and the people who are using the tech successfully account for that and build guardrails into their systems. Use version control, build automated tests (e2e/stress, not just unit), update your process so you're not incentivizing dumb shit like employees dumping unchecked AI PRs, etc.
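To make "guardrails" concrete, here's a minimal sketch of the kind of test I mean, using pytest-style tests with hypothesis. `parse_money` is a made-up stand-in for whatever function got touched (by an AI or a human), not anyone's real API; the point is that generated-input tests like these gate the merge regardless of who wrote the change:

```python
# Guardrail-style tests: stress a (hypothetical) function with generated
# inputs instead of only hand-picked unit cases, so an unchecked change
# that breaks edge cases fails CI before it lands.
from hypothesis import given, strategies as st


def parse_money(text: str) -> int:
    """Parse a string like '12.34' into cents; raises ValueError on bad input."""
    dollars, _, cents = text.partition(".")
    return int(dollars) * 100 + int((cents or "0").ljust(2, "0")[:2])


@given(st.integers(min_value=0, max_value=10**9))
def test_roundtrip_never_loses_cents(cents):
    # an invariant the whole team agrees on, independent of the implementation
    text = f"{cents // 100}.{cents % 100:02d}"
    assert parse_money(text) == cents


@given(st.text(max_size=20))
def test_garbage_input_raises_instead_of_corrupting(text):
    # either parse to an int or raise ValueError; anything else fails the gate
    try:
        result = parse_money(text)
    except ValueError:
        return
    assert isinstance(result, int)
```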

If the tech only worked for coding it would be one thing. But it’s advertised as a cure for anything and everything and so people are using it for that. And you can’t build automated tests for that.

I am a big AI booster, but I agree that letting agents run tasks without either rigorous oversight or strong automated constraints is a mistake.
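(By "strong automated constraints" I mean something like the sketch below: the agent never gets a raw shell, only a narrow wrapper. The allowlist and function name are made up for illustration, not any particular framework's API.)

```python
# Minimal sketch of an automated constraint for an agent: commands go
# through an allowlisted wrapper instead of a raw shell.
import shlex
import subprocess

# made-up allowlist: tool -> allowed subcommands (None = any arguments OK)
ALLOWED = {"git": {"status", "diff", "add", "commit"}, "pytest": None}


def run_agent_command(command_line: str) -> str:
    """Run a command on the agent's behalf, or refuse loudly."""
    argv = shlex.split(command_line)
    if not argv:
        raise PermissionError("empty command")
    tool, *rest = argv
    if tool not in ALLOWED:
        raise PermissionError(f"{tool!r} is not on the allowlist")
    subcommands = ALLOWED[tool]
    if subcommands is not None and (not rest or rest[0] not in subcommands):
        raise PermissionError(f"{tool} {rest[:1]!r} is not allowed")
    result = subprocess.run(argv, capture_output=True, text=True, timeout=60)
    return result.stdout + result.stderr
```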

Imagine comparing a human fuck up to an AI one. Lol.

I have seen 30 years of human fuck-ups, and they are infinitely worse than AI fuck-ups, so you are right, they cannot be compared: humans are so much worse.

Roughly 2.09% of SWEs actually know what they are doing, so the other 97.91% generally produce garbage. After 30 years of doing this shit (I have been working as a consultant for a long time now), exactly once have I been brought in to a project and gone "wow, now this is a beautiful codebase!"

You have to look at where the bullet holes aren't on surviving planes to know where to reinforce them...

aka you don't maybe think that - as an outside consultant - the nature of the job means you'd rarely be brought in to fix "beautiful codebases"...?

certainly! but you see so much if you stay in the industry long enough, and you hear other people's stories. the most common one - "just got a new gig at ____, wow the codebase is a mess."

I have probably worked with 300-400 SWEs directly, and of them there is only one I'd trust to write code if my life depended on it. and I think that is likely in line with how many SWEs are actually great at their jobs.

it's been grinding my gears so much lately that people keep trying to compare "blurry jpeg machines" to human intelligence and development.

llms don't learn. nor do they operate with any sort of intent towards precision.

we can develop around, plan for and predict most common human errors. also, humans typically get smarter and learn from their mistakes.

llms will go on making the same ridiculous mistakes, confidently making up bullshit frameworks, methods, and code, and no matter how much correction you try to offer, they will never get any better until the next multi-billion dollar model update. and even then, it's more of a crossed-fingers situation than inevitable improvement and growth.

I hate hate hate hate hate that AI seems to be amplifying the Dunning-Kruger effect in all our lives...