>Of course, you have to define what "has the same behavior as" means

And that's really my issue, for example when you define "has no trailing whitespace", you are basically writing a piece of the implementation. Cover all behaviors, and you have basically re-implemented the function, no?

In other words, if I have the full formal spec of f(), isn't that the same thing as having f()?

> In other words, if I have the full formal spec of f(), isn't that the same thing as having f()?

In some cases, however quite often, the spec is much simpler. For instance, it's easy to say that after running sort on some list, that the result is sorted. However, it is very hard to come up with the algorithm to do that from the specification. Sometimes that is even a point. Bubble sort, quick sort, tim sort, we can go on and on. There's a huge number of different sorts that computer science have discovered over the years. They all should have the same result and so you should be able to prove they do the same thing. However, in the real world there are often reasons you would prefer one to another despite all meeting the same spec.

Another obvious example are cryptographic hash functions: if you have a function f(s) = h, you can trivially specify a function inverse_f(h) = s st f(s) = h, and if you can infer a non-brute-force algorithm for that, you’ve just inferred a cryptographical weakness!

I did some of this stuff in college and what bugged me was that the spec actually ended up more complex than the code and it had bugs.

That was a long time ago and they said that formal methods were the future back then, too.

It's possible for that to happen but probably means either the function is too trivial or you're missing some abstraction in the spec

It's also possible it's not the future of programming.

It never was. It's useful but extremely slow.

This makes a lot of sense, thank you!

I think the key is that, while you may think you have a full formal spec of f(), you actually do not. You have a program written in some language, and that language has its own spec, and the language is compiled to asm which has its own spec, and the asm executes on an architecture that has its own spec, and so on.

So when you write a function like:

  func hypot(x, y):
    return sqrt(x*x + y*y)
You might think you have "fully specified" hypot, but this is far from true! You have said nothing about what registers will be used, for example. This is not a problem; quite the opposite. It's the whole point of using high-level languages: they let you focus on what you care about. A spec is just a program in a very-high-level language.

Exactly.

You can prove that f(a,b) always returns a+b.

And then you overflow the int on that machine.

Oops.

I am not sure that the perspective you have taken is the same as what I understood from the parent post; what I took from it is that things like registers, memory locations, ways to implement square root, and so on, are all _implementation choices_ that are not important properties of the specification. You specify that the hypotenuse is the square root of the sum of squares, but whether square root is implemented using Newtonian approximation or a fast inverse square root is irrelevant; whether your first argument is passed in a register or on the stack is irrelevant.

Often, things like resource usage are not specified: running time, memory consumption, etc, aren't relevant enough to appear in a behavioural specification.

If your spec says "f(a, b) returns a + b", but it's just a high-level document you can use to help guide your implementation, integer overflow is just one of many ways your implementation might be inconsistent with the specification. It's still likely that the existence of a formal specification you reference during implementation means that more edge cases have been considered ahead of time than if you just had an informal spec.

If, on the other hand, you prove it but it turns out not to be true (ie, you overflow integers), your proof is wrong. If a machine verified your proof and gave you a big thumbs up, your machine verification is wrong.

If, in Idris, I write "f : (a : Nat) -> (b : Nat) -> (c : Nat * c = a + b)", then I cannot compile an implementation for which I can't show a proof that the result is _always_ the addition of a and b, for all a and b, unbounded by anything but the resources at hand with which to run the program. An implementation subject to integer overflow won't compile.

Or, I could write "f : (a : Bits32) -> (b : Bits32) -> (c : Bits32 * c = a + b)" and implement something where , but then modulo arithmetic on overflow is _part of the specification_, because "+" in there is doing the heavy lifting of being defined as addition modulo 2*32 already; by specification, 4 billion plus 4 billion is ~3.7 billion.

Perhaps I have misunderstood or was unclear.

Everything the parent comment mentioned were implementation details that did not affect the correctness of the code.

I just wanted to point out that there are implementation details that DO affect the correctness of the code.

And, of course programs need to run on multiple architectures. So it's hard to do what people seemed to be talking about in this thread and verify code just from the source code.

If you have the luxury of proving the correctness of the CPU, compiler and OS, that should be a big win. Otherwise, it seems to just be another type of testing. Still useful, but calling it verified or proven seems a bit much.

From my perspective, it seems more to be writing another, more complicated program, with more opportunities for bugs, and seeing if the results agree

Well, ideally you proof should've failed because your axioms should be aware that arithmetic operators in the programming language of your choice are operating modulo some large power of two, and thus don't produce correct results for some particular inputs.

Admittedly, this one peeves a lot. Remember when Java's binary search and mergesort (and implementations in many other languages' standard libraries) turned out to have a bug of this kind [0]? Admittedly, the proof was informal, but if you are trying to prove some properties of a program X written in a certain programming language Y, you can't "just assume" that the semantics of Y are different from what they are, right? The integers do overflow in Java, that's explicitly stated in the Java's spec... and that means that a lot of even the most simplistic code has some very non-obvious correctness preconditions, which most of the times, I believe, are simply hoped to be true.

[0] https://research.google/blog/extra-extra-read-all-about-it-n...

Non trailing whitespace means the string doesn't end with a space. But foo is a function that converts an AST to a string, that's totally different. Or it's a function that loops until \0 and changes all spaces to +

The spec should be a summary of what the impl is supposed to do. You'd want more than just doesn't end with whitespace of course.

> Non trailing whitespace means the string doesn't end with a space

No. And this is a great example of the problems with specifications. You still have to write a spec. And this, too, is subject to bugs.

What's wrong with the statement above is there are 17 space characters in Unicode and another eight whitespace characters, like newline.

If you try to verify that something ends in whitespace, you have to make sure you have the right definition.

Not picking on parent poster! It's just a great example of the fact that you can verify, but if what you are verifying is wrong, it doesn't help.

> there are 17 space characters in Unicode and another eight whitespace characters, like newline.

And of course, those 25 characters don't include ZERO WIDTH {SPACE,NON-JOINER,JOINER,NON-BREAKING SPACE} and WORD JOINER, which gives you yet another 5 arguably "it's kinda space, right" codepoints which definitely should not be trailing in any reasonable text string.

Unicode never ceases to amaze me.

[deleted]