Maybe it's just the nature of being early adopters.

Other fields will get their turn once a baseline of best practices is established that the consultants can sell training for.

In the meantime, memes aside, I'm not too worried about being completely automated away.

These models are extremely unreliable when unsupervised.

It doesn't feel like that will change fundamentally with just incrementally better training.

> These models are extremely unreliable when unsupervised.

> It doesn't feel like that will change fundamentally with just incrementally better training.

I could list several things that I thought wouldn't get better with more training and then got better with more training. I don't have any hope left that LLMs will hit a wall soon.

Also, LLMs don't need to be better programmers than you are, they only need to be good enough.

No matter how much better they get, I don't see any actual sign of intelligence, do you?

There is a lot of handwaving around the definition of intelligence in this context, of course. My definition would be actual on-the-job learning, and reliability I don't need to second-guess every time.

I might be wrong, but those two requirements don't seem compatible with the current approach and hardware limitations.

Intelligence doesn't matter. To quote "Superintelligence: Paths, Dangers, Strategies":

> There is an important sense, however, in which chess-playing AI turned out to be a lesser triumph than many imagined it would be. It was once supposed, perhaps not unreasonably, that in order for a computer to play chess at grandmaster level, it would have to be endowed with a high degree of general intelligence.

The same thing might happen with LLMs and software engineering: LLMs will not be considered "intelligent" and software engineering will no longer be thought of as something requiring "actual intelligence".

Yes, current models can't replace software engineers. But they are getting better at it with every release. And they don't need to be as good as actual software engineers to replace them.

There is a reason chess was "solved" so fast. The game maps very nicely onto computers in general.

A grandmaster-level chess AI is no better at driving a car than my calculator from the 90s.

Yes, that's my point. AI doesn't need to be general to be useful. LLMs might replace software engineers without ever being "general intelligence".

Sorry for not making my point clear.

I'm arguing that the category of the problem matters a lot.

Chess is, compared to self-driving cars and (in my opinion) programming, very limited: in its rules, its fixed board size, and its lack of "fog of war".

I think I haven't made my point clear enough:

Chess was once thought to require general intelligence. Then computing power became cheap enough that using raw compute made computers better than humans. Computers didn't play chess in a very human-like way and there were a few years where you could still beat a computer by playing to its weaknesses. Now you'll never beat a computer at chess ever again.

Similarly, many software engineers think that writing software requires general intelligence. Then computing power became cheap enough that training LLMs became possible. Sure, LLMs don't think in a very human-like way: There are some tasks that are trivial for humans and where LLMs struggle but LLMs also outcompete your average software engineer in many other tasks. It's still possible to win against an LLM in an intelligence-off by playing to its weaknesses.

It doesn't matter that computers don't have general intelligence when they use raw compute to crush you in chess. And it won't matter that computers don't have general intelligence when they use raw compute to crush you at programming.

The burden of proof that software development requires general intelligence is on you. I think the stuff most software engineers do daily doesn't. And I think LLMs will get continuously better at it.

I certainly don't feel comfortable betting my professional future on software development for the coming decades.

"It is difficult to get a man to understand something when his salary depends upon his not understanding it" ~ Upton Sinclair

Your stance was the widely held stance, not just on Hacker News but also among the leading proponents of AI, when ChatGPT was first launched. A lot of people thought the hallucination aspect was something that simply couldn't be overcome. That LLMs were nothing but glorified stochastic parrots.

Well, things have changed quite dramatically lately. AI could plateau. But the pace at which it is improving is pretty scary.

Regardless of real "intelligence" or not, the current reality is that AI can already do quite a lot of traditional software work. This wasn't even remotely true if you were to go 6 months back.

How will this work exactly?

I think I have a pretty good idea of what AI can do for software engineering, because I use it for that nearly every day and I experiment with different models and IDEs.

The way that has worked for me is to make prompts very specific, to the point where the prompt itself would not be comprehensible to someone who's not in the field.

If you sat a rando with no CS background in front of Cursor, Windsurf or Claude Code, what do you suppose would happen?

It seems really doubtful to me that overcoming that gap is "just more training", because it would require a qualitatively different sort of product.

And even if we came to a point where no technical knowledge of how software actually works was required, you would still need to be precise about the business logic in natural language. Now you're writing computer code in natural language that will read like legalese. At that point you've just invented a new programming language.

Now maybe you're thinking, I'll just prompt it with all my email, all my docs, everything I have for context and just ask it to please make my boss happy.

But the level of integrative intelligence, combined with specialized world knowledge required for that task is really very far away from what current models can do.

The most powerful way that I've found to conceptualize what LLMs do is that they execute routines from huge learnt banks of programs that re-combine stored textual information along common patterns.

They're cut and paste engines where the recombination rules are potentially quite complex programs learnt from data.

This view fits well with the strengths and weaknesses of LLMs - they are good at combining two well understood solutions into something new, even if vaguely described.

But they are quite bad at abstracting textual information into a more fundamental model of program and world state and reasoning at that level.

I strongly suspect this is intrinsic to their training, because doing this is simply not required to complete the vast majority of text that could realistically have ended up in training databases.

Executing a sophisticated cut&paste scheme is in some ways just too effective; the technical challenge is how do you pose a training problem to force a model to learn beyond that.

I just completed a prototype of a non-trivial product, vibe-coded purely to test the ability and limits of LLMs.

My experience aligns largely with your excellent comment.

> But the level of integrative intelligence, combined with specialized world knowledge required for that task is really very far away from what current models can do.

Where LLMs excel is in putting out large templates of what is needed, but they are frayed at the edges. Imagine programming as a jigsaw puzzle where the pieces have to fit together: LLMs can align the broader pieces, but fail to fit them precisely.

> But they are quite bad at abstracting textual information into a more fundamental model of program and world state and reasoning at that level.

The more fundamental model of a program is a "theory" or "mental model", which unfortunately is not codified in the training data. LLMs can put together broad outlines based on their training data, but lack the precision to model at a more abstract level. For example, how concurrency could impact memory access is not precisely understood by the LLM - since it lacks a theory of it.
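
To make that concrete, here is the kind of reasoning I mean; a minimal sketch of my own (not something any model produced), showing why an unsynchronized increment needs a mental model of interleaved memory access rather than pattern-matching on the source text:

    # Minimal sketch (my own illustration): an unsynchronized shared counter.
    # The increment is a read-modify-write, so updates from different threads
    # can interleave and get lost.
    import threading

    counter = 0

    def bump(n):
        global counter
        for _ in range(n):
            counter += 1  # not atomic: load, add, store

    threads = [threading.Thread(target=bump, args=(100_000,)) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # May print less than 400000; how often depends on the interpreter's scheduling.
    print(counter)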

> the technical challenge is how do you pose a training problem to force a model to learn beyond that.

This is the main challenge - how can an LLM learn more abstract patterns? For example, in the Towers of Hanoi problem, can the LLM learn the recursion and what recursion means? This requires the LLM to learn abstraction precisely. I suspect LLMs learn abstraction "fuzzily", but what is required is to learn abstraction "precisely". The precision, or determinism, is largely where there is still a huge gap.
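
To illustrate what I mean by learning the abstraction precisely: the entire "theory" of Towers of Hanoi fits in one recursive step (my own sketch, for illustration only):

    # The whole abstraction of Towers of Hanoi is one recursive step:
    # move n-1 disks onto the spare peg, move the largest disk, then
    # move the n-1 disks back on top of it.
    def hanoi(n, source, spare, target):
        if n == 0:
            return []
        return (hanoi(n - 1, source, target, spare)
                + [(source, target)]
                + hanoi(n - 1, spare, source, target))

    # 3 disks -> 7 moves (2**n - 1 in general).
    print(hanoi(3, "A", "B", "C"))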

LLM boosters would point to the Bitter Lesson and say it is a matter of time before this happens, but I am a skeptic. I think the process of symbolism or abstraction is not yet understood well enough to be formalized.

Ironic to post that quote about AI considering the hype is pretty much entirely from people who stand to make obscene wealth from it.

>That LLMs were nothing but glorified stochastic parrots.

Well yes, now we know they make kids kill themselves.

I think we've all fooled ourselves like this beetle

https://www.npr.org/sections/krulwich/2013/06/19/193493225/t...

For thousands of years, up until 2020, anything that conversed with us could safely be assumed to be another sentient/intelligent being.

Now we have something that does that, but is neither sentient nor intelligent, just a (complex) deterministic mechanism.

I've heard this described as a kind vs. a wicked learning environment.

LLMs can code, but they can’t engineer IMO. They lack those other parts of the brain that are not the speech center.


Does it have to? Stack enough "it's 5% better" on top of each other and the exponent will crush you.
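
Back-of-the-envelope, assuming those improvements really do compound multiplicatively (a big if):

    # Rough sketch: how many compounding 5% improvements until capability
    # doubles or grows tenfold (assuming multiplicative compounding).
    import math

    print(math.ceil(math.log(2) / math.log(1.05)))   # ~15 steps to double
    print(math.ceil(math.log(10) / math.log(1.05)))  # ~48 steps for 10x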

AI training costs have been increasing around 3x annually in each of the last 8 years to achieve the observed performance improvements. Last year, spending across all labs was $150bn. Keeping the 3x trend means that, to keep pace with current advances, costs should rise to $450bn in 2025, $900bn in 2026, $2.7tn in 2027, $8.1tn in 2028, $25tn in 2029, $75tn in 2030, and $225tn in 2031. For reference, the GDP of the world is around $125tn.

I think the labs will be crushed by the exponent on their costs faster than white-collar work will be crushed by the 5% improvement exponent.

Be careful you're not confusing the costs of training an LLM and the spending from each firm. Much of that spending is on expanding access to older LLMs, building new infrastructure, and other costs.

That's a fair criticism of my method; however, model training costs are a significant cost centre for the labs. Modelling from there instead of from total expenditure only adds 2-3 years before model training costs are larger than the entire global economy.

Your math comes out a bit lower than it should, because you doubled instead of trebled for 2026.
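
For reference, here is what strict trebling from the stated $150bn 2024 base gives (a quick sketch, taking those two inputs as given):

    # Quick sketch: strict 3x annual growth from the $150bn (2024) figure stated upthread.
    cost = 150e9  # USD
    for year in range(2025, 2031):
        cost *= 3
        print(year, f"${cost / 1e12:.2f}tn")
    # 2025 $0.45tn, 2026 $1.35tn, 2027 $4.05tn, 2028 $12.15tn,
    # 2029 $36.45tn, 2030 $109.35tn - approaching the ~$125tn world GDP figure cited upthread.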

The current trained models are already good enough for many things.

Is that so? OK, let the consumers decide: increase the price and let's see how many users are willing to pay it.

They are mediocre plagiarism machines at best.

Are LLMs stackable? If they keep misunderstanding each other, it'll look more like successive applications of JPEG compression.

By all accounts, yes.

"Model collapse" is a popular idea among the people who know nothing about AI, but it doesn't seem to be happening in real world. Dataset quality estimation shows no data quality drop over time, despite the estimates of "AI contamination" trickling up over time. Some data quality estimates show weak inverse effects (dataset quality is rising over time a little?), which is a mindfuck.

The performance of frontier AI systems also keeps improving, which is entirely expected. So does price-performance. One of the most "automation-relevant" performance metrics is "ability to complete long tasks", and that shows vaguely exponential growth.

Given the number of academic papers about it, model collapse is a popular idea among the people who know a lot about AI as well.

Model collapse is something demonstrated when models are recursively trained largely or entirely on their own output. Given that most training data is still generated or edited by humans, or is intentionally produced synthetic data, I'm not entirely certain why one would expect to see evidence of model collapse happening right now, but to dismiss it as something that can't happen in the real world seems a bit premature.

We've found the conditions under which model collapse happens more slowly or fails to happen altogether. Basically all of them are met in real-world datasets. I do not expect that to change.

The jpeg compression argument is still valid.

It's lossy compression at the core.

In 2025 you can add quality to jpegs. Your phone does it and you don't even notice. So the rhetorical metaphor employed holds up, in that AI is rapidly changing the fundamentals of how technology functions beyond our capacity to anticipate or keep up with it.

> add quality to jpegs

Define "quality", you can make an image subjectively more visually pleasing but you can't recover data that wasn't there in the first place

You can if you know what to fill from other sources.

Like the grille of a car: if we know the make and year, we can add detail with each zoom by filling in from external sources.

This is an especially bad example: a nice shiny grille is going to be strongly reflecting stuff that isn't already part of the image (and likely isn't covered well by adjacent pixels, due to the angle doubling of reflection).

Is this like how crypto changed finance and currency?

I don't think it is.

Sure, you can view an LLM as a lossy compression of its dataset. But people who make the comparison are either trying to imply a fundamental deficiency, a performance ceiling, or trying to link it to information theory. And frankly, I don't see a lot of those "hardcore information theory in application to modern ML" discussions around.

The "fundamental deficiency/performance ceiling" argument I don't buy at all.

We already know that LLMs use high-level abstractions to process data - very much unlike traditional compression algorithms. And we already know how to use techniques like RL to teach a model tricks that its dataset doesn't - which is where an awful lot of recent performance improvement is coming from.

Sure, you can upscale a badly compressed JPEG using AI into something better looking.

Often the results will be great.

Sometimes the hallucinated details will not match the expectations.

I think this applies fundamentally to all of the LLM applications.

And if you get that "sometimes" down to "rarely" and then "very rarely" you can replace a lot of expensive and inflexible humans with cheap and infinitely flexible computers.

That's pretty much what we're experiencing currently. Two years ago code generation by LLMs was usually horrible. Now it's generally pretty good.

I think you are selling yourself short if you believe you can be replaced by a next token predictor :)

I think humans who think they can't be replaced by a next token predictor think too highly of themselves.

LLMs show it plain and clear: there's no magic in human intelligence. Abstract thinking is nothing but fancy computation. It can be implemented in math and executed on a GPU.

LLMs have no ability to reason whatsoever.

They do have the ability to fool people and exacerbate or cause mental problems.

LLMs are actually pretty good at reasoning. They don't need to be perfect, humans aren't either.

What's actually happening is that all your life you've been taught by experience that if something can talk to you, it must be somewhat intelligent.

Now you can't get around the fact that this might not be the case.

You're like that beetle going extinct mating with beer bottles.

https://www.npr.org/sections/krulwich/2013/06/19/193493225/t...

"What's actually happening" is all your life you've been told that human intelligence is magical and special and unique. And now it turns out that it isn't. Cue the coping.

We've already found that LLMs implement the very same type of abstract thinking as humans do. Even with mechanistic interpretability being in the gutters, you can probe LLMs and find some of the concepts they think in.

But, of course, denying that is much less uncomfortable than the alternative. Another one falls victim to the AI effect.

> "What's actually happening" is all your life you've been told that human intelligence is magical and special and unique. And now it turns out that it isn't. Cue the coping.

People have been arguing this is not the case for at least hundreds of years.

Considering we don't understand consciousness at ALL or how humans think, you might want to backtrack your claims a bit.

Any abstraction you're noticing in an LLM is likely just a plagiarized one

Why isn't it then?

I as a human being can of course not be replaced by a next token predictor.

But I as a chess player can easily be replaced by a chess engine and I as a programmer might soon be replaceable by a next token predictor.

The only reason programmers think they can't be replaced by a next token predictor is that programmers don't work that way. But chess players don't work like a chess engine either.

This boring reductionist take on how LLMs work is so outdated that I'm getting secondhand embarrassment.

Sorry, I meant a very fancy next token predictor :)

Lots of technology is cool if you get to just say “if we get rid of the limitations” while offering no practical way to do so.

It’s still horrible btw.

Hallucination has significantly decreased in the last two years.

I'm not saying that LLMs will positively replace all programmers next year, I'm saying that there is a lot of uncertainty and that I don't want that uncertainty in my career.

Pretty crazy, and all you have to do is assume exponential performance growth for as long as it takes.