I am not sure that the perspective you have taken is the same as what I understood from the parent post; what I took from it is that things like registers, memory locations, ways to implement square root, and so on, are all _implementation choices_ that are not important properties of the specification. You specify that the hypotenuse is the square root of the sum of squares, but whether square root is implemented using Newtonian approximation or a fast inverse square root is irrelevant; whether your first argument is passed in a register or on the stack is irrelevant.

Often, things like resource usage are not specified: running time, memory consumption, etc, aren't relevant enough to appear in a behavioural specification.

If your spec says "f(a, b) returns a + b", but it's just a high-level document you can use to help guide your implementation, integer overflow is just one of many ways your implementation might be inconsistent with the specification. It's still likely that the existence of a formal specification you reference during implementation means that more edge cases have been considered ahead of time than if you just had an informal spec.

If, on the other hand, you prove it but it turns out not to be true (ie, you overflow integers), your proof is wrong. If a machine verified your proof and gave you a big thumbs up, your machine verification is wrong.

If, in Idris, I write "f : (a : Nat) -> (b : Nat) -> (c : Nat * c = a + b)", then I cannot compile an implementation for which I can't show a proof that the result is _always_ the addition of a and b, for all a and b, unbounded by anything but the resources at hand with which to run the program. An implementation subject to integer overflow won't compile.

Or, I could write "f : (a : Bits32) -> (b : Bits32) -> (c : Bits32 * c = a + b)" and implement something where , but then modulo arithmetic on overflow is _part of the specification_, because "+" in there is doing the heavy lifting of being defined as addition modulo 2*32 already; by specification, 4 billion plus 4 billion is ~3.7 billion.

Perhaps I have misunderstood or was unclear.

Everything the parent comment mentioned were implementation details that did not affect the correctness of the code.

I just wanted to point out that there are implementation details that DO affect the correctness of the code.

And, of course programs need to run on multiple architectures. So it's hard to do what people seemed to be talking about in this thread and verify code just from the source code.

If you have the luxury of proving the correctness of the CPU, compiler and OS, that should be a big win. Otherwise, it seems to just be another type of testing. Still useful, but calling it verified or proven seems a bit much.

From my perspective, it seems more to be writing another, more complicated program, with more opportunities for bugs, and seeing if the results agree