Exactly, and people have been saying this for a while now. If an "AI software engineer" needs a perfect spec with zero ambiguity, all edge cases defined, full test coverage with desired outcomes etc., then the person writing the spec is the actual software engineer, and the AI is just a compiler.
We’ve also learned that starting off with a rigidly defined spec is actually harmful for most user-facing software, since customers change their minds so often and have a hard time knowing what they want right from the start.
This is why most of the best software is written by people writing things for themselves and most of the worst is made by people making software they don't use themselves.
True fact: half of all self-made software projects are task trackers.
Sure, and the most performed song in the world is probably hot cross buns or Mary had a little lamb.
Exactly. This is what I tell everyone. The harder you work on the spec, the easier everything gets afterward. And this is exactly what businesses with lofty goals don’t get, or ignore. Put another way: a fool with a tool…
Also look out for optimization done the "clever" way.
This is not quite right: a specification is not equivalent to writing software, and the code generator is not just a compiler. In fact, generating implementations from specifications is a pretty active area of research (a simpler variant is generating a configuration that satisfies some specification, "configuration synthesis").
In general, implementations can be vastly more complicated than even a complicated spec (e.g. by having to deal with real-world network failures, etc.), whereas a spec needs only to describe the expected behavior.
In this context, this is actually super useful, since defining the problem (writing a spec) is usually easier than solving the problem (writing an implementation); it's not just translating (compiling), and the engineer is now thinking at a higher level of abstraction (what do I want it to do vs. how do I do it).
Surely a well-written spec would include non-functional requirements like resilience and performance?
However, I agree that's the hard part. I can write a spec for finding the optimal solution to some combinatorial problem where the naive code is trivial (a simple recursive function, for example), but such a function would take near-infinite time and memory.
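To make that concrete (an illustrative sketch, not any particular problem from the thread): a brute-force 0/1 knapsack solver in C reads almost like the spec itself, yet explores 2^n branches:

    #include <stdio.h>

    /* Naive 0/1 knapsack: best value achievable from items i..n-1 within capacity cap.
       Mirrors the spec directly ("take the max of including or skipping each item"),
       but the recursion branches twice per item, so it runs in O(2^n) time. */
    int best_value(const int w[], const int v[], int n, int i, int cap) {
        if (i == n) return 0;
        int skip = best_value(w, v, n, i + 1, cap);
        if (w[i] > cap) return skip;
        int take = v[i] + best_value(w, v, n, i + 1, cap - w[i]);
        return take > skip ? take : skip;
    }

    int main(void) {
        int w[] = {3, 4, 5}, v[] = {4, 5, 6};
        printf("%d\n", best_value(w, v, 3, 0, 7)); /* prints 9 */
        return 0;
    }

Correct per the spec for any input, but useless at real sizes without memoization or a smarter algorithm.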
In terms of the ML program really being a compiler: isn't that in the end true? The ML model is a computer program taking a spec as input and generating code as output. Sounds like a compiler to me.
I think the point of the AK post is to say the challenge is in the judging of solutions - not the bit in the middle.
So to take the writing-software problem: if we had already solved the program-validation problem, there wouldn't be any bugs right now, irrespective of how the code was generated.
The point was specifically that that obvious intuition is wrong, or at best incomplete and simplistic.
You haven't disproved this idea, merely re-stated the default obvious intuition that everyone is expected to have before being presented with this idea.
Their point is correct that defining a spec rigorously enough IS the actual engineering work.
A C or Go program is nothing but a spec which the compiler implements.
There are infinite ways to implement a given C expression in assembly, and doing that is engineering and requires a human to do it, but only once. The compiler doesn't invent how to do it every time the way a human would; the compiler author picked a way, and now the compiler does that every time.
And it gets more complex where there isn't just one way to do things but several, and the compiler actually chooses whichever of many methods best fits the context, but all of that logic was also written by some engineer, one time.
But now that IS what happens: the compiler does it.
A software engineer no longer writes in assembly; they write in C or Go or whatever.
I say I want a function that accepts a couple of arguments and returns the result of a math formula, and it just happens. I have no idea how the machine actually implements it; I just wrote a line of algebra in a particular formal style. It could have come right out of a pure math textbook, and the valid C function-definition syntax could just as well be pseudocode describing a pure math idea.
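For example (a hypothetical snippet; the formula itself is arbitrary), the C below is technically a program, but it's really just algebra stated in a formal syntax:

    #include <math.h>

    /* Distance between two points in the plane: sqrt((x2-x1)^2 + (y2-y1)^2).
       The body is nothing but the formula; how the machine evaluates it is the
       compiler's business. */
    double distance(double x1, double y1, double x2, double y2) {
        return sqrt((x2 - x1) * (x2 - x1) + (y2 - y1) * (y2 - y1));
    }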
If you tell an AI, or a human programmer for that matter, what you want in a rigorous enough format that all questions are answered, such that it doesn't matter what language the programmer uses or how the programmer implements it, then you, my friend, have written the program, and are the programmer. The AI, or the human who translated that into some other language, was indeed just the compiler.
It doesn't matter that there are multiple ways to implement the idea.
It's true that one programmer writes a very inefficient loop that walks the entire array once for every element in the array, while another comes up with some more sophisticated index, vector, or math-trick approach, but that's not the definition of anything.
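For what it's worth, here's what that looks like side by side (an illustrative sketch; the "does this array contain a duplicate?" task is just a stand-in):

    #include <stdbool.h>
    #include <stdlib.h>
    #include <string.h>

    /* Naive: for every element, walk the entire array again -- O(n^2). */
    bool has_duplicate_naive(const int *a, size_t n) {
        for (size_t i = 0; i < n; i++)
            for (size_t j = 0; j < n; j++)
                if (i != j && a[i] == a[j]) return true;
        return false;
    }

    static int cmp_int(const void *p, const void *q) {
        int x = *(const int *)p, y = *(const int *)q;
        return (x > y) - (x < y);
    }

    /* More sophisticated: sort a copy, then a single pass -- O(n log n).
       Same spec, same answers, very different work. */
    bool has_duplicate_sorted(const int *a, size_t n) {
        if (n < 2) return false;
        int *copy = malloc(n * sizeof *copy);
        if (copy == NULL) return false; /* illustration only */
        memcpy(copy, a, n * sizeof *copy);
        qsort(copy, n, sizeof *copy, cmp_int);
        bool dup = false;
        for (size_t i = 1; i < n; i++)
            if (copy[i] == copy[i - 1]) { dup = true; break; }
        free(copy);
        return dup;
    }

Both satisfy the same spec; the spec writer never had to care which one they get.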
There are both simple and sophisticated compilers. You can already, right now, feed the same C code into different compilers and get results that all work, but one is 100x faster than another, one uses 100x less RAM than another, etc.
If you give a high level imprecise directive to an ai, you are not programming. If you give a high level precise directive to an ai, you are programming.
The language doesn't matter. What matters is what you express.
What makes you think they'll need a perfect spec?
Why do you think they would need a more defined spec than a human?
A human has the ability to contact the PM and say, "This won't work, for $reason," or, "This is going to look really bad in $edgeCase, here are a couple options I've thought of."
There's nothing about AI that makes such operations intrinsically impossible, but they require much more than just the ability to generate working code.
A human needs a perfect spec too.
Anything you don't define is literally undefined behavior, the same as in a compiler. The human will do something, and maybe you like it and maybe you don't.
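The compiler analogy is literal in C (a small made-up example): leave something unspecified and the implementation will still do *something*, just not necessarily what you wanted:

    #include <stdio.h>

    int main(void) {
        int x;              /* never initialized: reading it is undefined behavior */
        printf("%d\n", x);  /* some value gets printed, but which one is entirely
                               up to the compiler and whatever happens to be on the stack */
        return 0;
    }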
A perfect spec is just another way to describe a formal language, i.e. any programming language.
If you don't care what you get, then say little and say it ambiguously and pull the slot machine lever.
If you care what you get then you don't necessarily have to say a lot, but you do have to remove ambiguity, and then what you have is a spec, and if it's rigorous enough, it's a program, regardless of what language and syntax is used to express it.
I think the difference is that with a human you can say something ambiguous like "handle error cases" and they are going to put thought into the errors that come up. The LLM will just translate those tokens into if statements that do some validation and check return values after calls. The depth of thought is very different.
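To illustrate what "just translate those tokens into if statements" means here (a hypothetical sketch, not output from any actual model):

    #include <stdio.h>
    #include <stdlib.h>

    /* "Handle error cases", translated literally: check every return value and
       bail out. Syntactically thorough, but no judgment about which failures
       are likely, recoverable, or worth a better message. */
    int read_first_line(const char *path, char *buf, size_t len) {
        if (path == NULL || buf == NULL || len == 0) return -1;
        FILE *f = fopen(path, "r");
        if (f == NULL) return -1;
        if (fgets(buf, (int)len, f) == NULL) { fclose(f); return -1; }
        if (fclose(f) != 0) return -1;
        return 0;
    }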
But that is just a difference of degree, not of kind.
There is a difference between a human and an AI, and it is more than a difference of degree, but filling in gaps with something that fits is not very significant. That can be done perfectly mechanistically.
Reminds me of when computers were literally humans computing things (often women). How time weaves its circular web.
> then the person writing the spec is the actual software engineer
Sounds like this work would involve asking questions of collaborators, guessing some missing answers, writing specs, and repeating. Not that far ahead of the current SOTA of AI...
Same reason the visual programming paradigm failed: the main problem is not the code.
While writing simple functions may be mechanistic, being a developer is not.
'Guessing some missing answers' is why Waterfall, or any big upfront design, has failed.
People aren't simply loading pig iron into rail cars like Taylor assumed.
The assumption of perfect central design with perfect knowledge and perfect execution simply doesn't work for systems which are far more like an organism than a machine.
Waterfall fails when domain knowledge is missing. Engineers won't take "obvious" problems into consideration when they don't even know what the right questions to ask are. When a system gets rebuilt for the 3rd time, the engineers do know what to build, and those basic mistakes don't get made.
Next gen LLMs, with their encyclopedic knowledge about the world, won't have that problem. They'll get the design correct on their first attempt because they're already familiar with the common pitfalls.
Of course we shouldn't expect LLMs to be a magic bullet that can program anything. But if your frame of reference is "visual programming" where the goal is to turn poorly thought out requirements into a reasonably sensible state machine then we should expect LLMs to get very good at that compared to regular people.
LLMs are NLP; what you are talking about is NLU, which has been considered an AI-hard problem for a long time.
I keep looking for discoveries that show any movement there. But LLMs are still basically pattern matching and pattern finding.
They can do impressive things, but they have no concept of what the 'right thing' even is; it is statistics, not philosophy.
I mean, that's already the case in many places: the senior engineer / team lead gathering requirements and making architecture decisions is removing enough ambiguity to hand it off to juniors churning out the code. This just gives us very cheap, very fast-typing, but uncreative and a little dull, junior developers.