I will quote (approximately) Simon Peyton Jones: "writing programs is not hard. Specifying what a program should do is hard."

Prove what? The author is well versed in the problem of specifications, as shown by the aliens-demand-gcc-correctness-or-else post. I also enjoyed the other post where they say "no one cares about correctness" and "people care about compliance".

It is safe to say that most people are not experts and will never become experts.

Being proficient in the business of coming up with a suitable specification that can be implemented, and of getting assurance that an implementation meets that spec, will most likely require exactly the kind of formal training for humans that the AI hype industry assures us is no longer necessary.

So it doesn't help much, at least not in a big way, that LLMs are good at manipulating the tokens when used by experts. One can hope that they shift the cost-benefit balance in good ways, but the bigger problems of the industry will simply persist.

What would need to change for a more fundamental shift is getting a lot more people to understand the value of specifications. This is an education problem and paradoxically I could see AI helping a lot... if only there was enough interest in using AI to help humans become experts.

This change is happening as we speak. My work repo is now sprouting markdowns intended primarily for agentic consumption, but it’s all dense jargon communicating requirements and important design decisions in a way that is useful to me as a software engineer. The toil of writing these specs from low-level code is much lower today, because LLMs help immensely with that too.

This happens in my workplace, but in a bad way. There's a push to make use of AI as much as we can to "boost productivity", and the one thing people don't want to do is write documentation. So what ends up happening is that we get a bunch of AI documentation that other AIs consume but that humans have a harder time following, because of the volume of fluff and AI-isms. Shitty documentation still exists and can be worse than before...

Other than humans getting apoplectic at the word "delve" and at em dashes, can you explain or give some examples of how AI-isms hurt readability?

Having encountered this spread across our org's greenfield codebases, which made heavy use of AI in the last 90 days: restating the same information in slightly different formats, with slightly different levels of detail, in several places, unnecessarily; for example, a "get up and running quickly" guide in the documentation that has far more detail than the section it's supposed to be summarizing. Jarringly inconsistent ways of presenting information within a given section (a list of endpoints and their purposes, followed by a table of other endpoints, followed by another list of endpoints). Unnecessary bulleted lists all over the place that would read more clearly as single sentences or a short paragraph. Disembodied documentation files nested in the repos that restate the contents of the README, but in a slightly different format and voice. Thousands of single-line code comments that just restate what is already clear from the line they're commenting on. And that's before getting into the quality of the code itself.

I've noticed AI-generated docs frequently contain bulleted or numbered lists of trivialities, like file names: AI loves describing "architecture" by listing files with a five-word summary of what each does, which is usually not much more informative than the file name itself. Superficially it looks like it might be useful, but it doesn't contribute any actually useful context and has very low information density.


A piece of information, or the answer to a question, can exist in the documentation but not in a format that's easily readable to humans. You ask the AI to add certain information, and it responds with "I already added it". But the AI doesn't "read" documents the way humans do.

For instance, say you need urgent action from other teams. To that end you order an AI to write a document and give it the information. The AI produces a document in its own standard format, with the characteristic AI fluff. But this won't work well: upon seeing the urgent call to action, the teams will rush to understand what they need to do, and they will be greeted by a corporate-PR-sounding document that does not address their urgent needs first and foremost.

Yes, you could tell the AI how to make the document little by little... but at that point you might as well write it manually.

There is a place for informal, prose specs, and I can easily agree that more people are nowadays describing their programs in English.

The context here is formal specs though - adequately and precisely capturing the intended meaning (semantics) in a way that lends itself to formal verification.

Interactive theorem proving is interactive because proof search is intractable; a human articulates the property they want to prove and then performs the "search". Apart from the difficulties of getting to that proof, it can also happen that all goes through and you realize that the property is not exactly what you wanted.
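
A minimal sketch of that last failure mode (Lean 4, with made-up names): the proof below goes through, but only because the stated property is weaker than the intended one; the identity function satisfies it just as well.

  -- Deliberately wrong "sort": it does nothing.
  def mySort (xs : List Nat) : List Nat := xs

  -- This proves effortlessly, yet "preserves length" is not the property
  -- we actually wanted (sortedness, being a permutation of the input, ...).
  theorem mySort_preserves_length (xs : List Nat) :
      (mySort xs).length = xs.length := rfl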

I’m curious whether your markdowns are intelligible to other engineers, or only to you and an LLM?

I think you're missing a more internecine debate within software correctness/formal methods circles, distinct from the more general one about "writing good/correct software". Deductive theorem proving was all the rage in the 1970s, to the point that prominent people (Tony Hoare, Dijkstra) believed it was the only way to write correct software. Over the years, the focus has shifted to other methods [1], from model checking to unsound methods such as concolic testing and even property-based testing and code reviews, which have proven more effective than the deductive-proof proponents had envisioned (Tony Hoare admitted his mistake in the nineties).

The problem with deductive proofs is that they're all-or-nothing: either you prove the theorem or you don't, and it's much harder to turn a knob to trade off cost for confidence and vice versa than with other methods. Consequently, theorem proving has not benefited from automation as much as its alternatives, and is now viewed as having too low a bang-for-the-buck. But if LLMs can help automate deductive theorem proving, maybe that particular approach, which has gradually become less popular within formal methods circles but still has its fans, could make a comeback.
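
To make the knob concrete with one of the alternatives, here's a minimal property-based-testing sketch (Haskell/QuickCheck; the property name is made up): confidence scales with how many random cases you're willing to pay for, which is exactly the dial deductive proofs lack.

  import Test.QuickCheck

  -- The property we want some confidence in (not a proof of it).
  prop_reverseInvolution :: [Int] -> Bool
  prop_reverseInvolution xs = reverse (reverse xs) == xs

  -- The "knob": more random cases buy more confidence at more cost.
  main :: IO ()
  main = quickCheckWith stdArgs { maxSuccess = 10000 } prop_reverseInvolution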

Of course, specification is equally important in all methods, but the particular method of theorem proving has been disadvantaged compared to other formal and semi-formal methods, and I think that's what the author wishes could change.

[1]: Deductive theorem proving may be more prominent on HN than other formal methods, but as with most things, HN is more representative of things that spark people's imagination than of things that work well. I'm not counting that as a strike against HN, but it's something to be aware of. HN is more Vogue/GQ than the Times, i.e. it's more news about what people wish they were doing than what they're actually doing.

> The problem with deductive proofs is that it's all-or-nothing, i.e. either you prove the theorem or you don't, and it's much harder to turn a knob to trade off cost for confidence and vice-versa than with other methods.

This is not really true though. Sound type checking is a kind of deductive proof, but it's practically usable because it establishes very simple properties and is generally "local", i.e. it follows the same structure as program syntax.

Well, that's true for simple type systems. Dependent types are not so practically usable. You're absolutely right that simple types prove some "simple" properties, but we know what characterises them: they're inductive (or composable) invariants, i.e. P(a ∘ b) ⇔ P(a) ∧ P(b), for some program composition operator ∘ (I'm papering over some important but nuanced details). It's just that most program correctness properties we're interested in are not inductive, so the "proving power" of simple types is not strong (another way of describing it is that it's not powerful enough to express propositions with interleaved universal and existential quantifiers).

So when we talk about formal methods, we're usually interested in those that can express and/or "verify" (to some degree of confidence), non-composable properties and propositions that involve interleaved quantifiers.
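
A small sketch of the composability point (Haskell, illustrative names only): well-typedness of a composition follows locally from the types of the parts, while a property like "returns a sorted list" isn't expressible in the simple type and doesn't survive composition.

  import Data.List (sort)

  -- Well-typedness composes: if each part typechecks, the composition
  -- typechecks, by looking only at the parts' types.
  double :: Int -> Int
  double x = x * 2

  render :: Int -> String
  render = show

  pipeline :: Int -> String
  pipeline = render . double

  -- "Returns a sorted list" is not tracked by the type [Int] -> [Int]:
  -- both functions below have that type, and so does their composition,
  -- yet the composition destroys the property the first part established.
  sortAsc :: [Int] -> [Int]
  sortAsc = sort

  rev :: [Int] -> [Int]
  rev = reverse

  composed :: [Int] -> [Int]
  composed = rev . sortAsc   -- typechecks fine, but the output isn't sorted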


>writing programs is not hard. Specifying what a program should do is hard.

But these two things are kinda the same thing. Writing a program is mostly just specifying what it should do in a very specific, unambiguous way. That’s why we have programming languages and don’t use English.

A PRD, a design doc and the code itself are kind of all the same thing, but with increasing levels of specification.

If you’ve ever played with formal specifications at some point you probably had the realization “wait a minute, I’m just writing another implementation of the program!” and in a way you are.
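
As a minimal sketch of that feeling (Haskell, made-up names): an executable "spec" for a sort function ends up being, to a large extent, just more list-manipulating code.

  import Data.List (sort)

  isSorted :: Ord a => [a] -> Bool
  isSorted xs = and (zipWith (<=) xs (drop 1 xs))

  sameElements :: Ord a => [a] -> [a] -> Bool
  sameElements xs ys = sort xs == sort ys   -- "is a permutation of"

  -- The spec: the output is sorted and has the same elements as the input.
  sortSpec :: [Int] -> [Int] -> Bool
  sortSpec input output = isSorted output && sameElements input output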

Spoken language descriptions of how to do something are incredibly underdetermined. Many people are likely familiar with the meme of following exact instructions on making a PB&J where a phrase like “put a knife in your hand and spread the peanut butter” can result in stabbing yourself and rubbing peanut butter on the wound. More than half of my job is helping other people know or otherwise determining what they really want a program to do in the first place.

This sounds true but isn't.

"The program should accept one parameter as a number, and output this number multiplied by two" is a spec.

  a = int(input())
  print(a * 2)
is a program.

But: why do that? why multiply by two? what does two represent? what do the input and output mean in the real world? what part is likely to change in the near future? etc.

Writing the program isn't harder than writing the spec. Analyzing the problem and coming up with solutions is the hard part.

(And yes, LLMs are getting better at this everyday, but I think they still have a very long way to go.)


You read like Alan Kay's Quora and for that I salute you comrade.