> Up to and including requiring proof that a function's domain is respected

Does such a proof exist in this context? Or are we writing fanfiction about the problem domain now?

> It's not "unrecoverable"

Yes it is. The correct behaviour in that context is to terminate the program, which makes it unrecoverable.

> Does such a proof exist in this context? Or are we writing fanfiction about the problem domain now?

"Often HPC codebases will use numerical schemes which are mathematically proven to 'blow up' (produce infs/nans) under certain conditions even with a perfect implementation"

Yes? Function domains are trivially enforceable through the type system when you have dependent types. Even the n-dimensional case of the CFL condition is a simple constraint you can express over a type.
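
For anyone unfamiliar with what that looks like, here is a minimal Lean 4 sketch of the idea. Everything in it is hypothetical and made up for illustration: the names `safeDiv`, `CflStep`, and `mkStep`, and the use of natural numbers with `speed * dt ≤ dx` as a stand-in for a 1D CFL condition. A real solver would work over floats or reals, where the check is not this simple.

```lean
-- Hypothetical example: a division whose domain (nonzero divisor)
-- is part of its type. Callers must supply a proof of `b ≠ 0`.
def safeDiv (a b : Nat) (_h : b ≠ 0) : Nat := a / b

-- With a literal divisor, the proof is found by evaluation.
example : Nat := safeDiv 10 2 (by decide)

-- Assumed integer stand-in for a 1D CFL condition: speed * dt ≤ dx.
-- A `CflStep` cannot be built without a proof of that inequality.
structure CflStep where
  speed : Nat
  dt    : Nat
  dx    : Nat
  cfl   : speed * dt ≤ dx

-- Parameters known at compile time: the proof is checked statically.
def okStep : CflStep := { speed := 1, dt := 2, dx := 4, cfl := by decide }

-- Parameters known only at runtime: test the condition once and keep
-- the resulting proof `h` alongside the values, or report failure.
def mkStep (speed dt dx : Nat) : Option CflStep :=
  if h : speed * dt ≤ dx then some ⟨speed, dt, dx, h⟩ else none
```

Any function that takes a `CflStep` can then rely on the inequality without re-checking it. Note that the type only forces the stated condition to be established before the call; it does not work out which parameter values satisfy it.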

Have you ever actually done any work with dependent types? I'm not sure why you would think something as basic as enforcing a function domain (which isn't the same thing as a problem domain, by the way) would be "fanfiction" otherwise. I highly recommend spending a few months actually working with them; there are plenty of good languages floating around these days.

> The correct behaviour in that context is to terminate the program

At worst it's to leave the thread of execution. That is distinct from crashing, which is what you asserted above, and that distinction is what my core point revolves around.

> Function domains are trivially enforceable through the type system when you have dependent types.

> It is not trivial to find what the correct settings will be before starting. Encountering situations like a job which runs fine for 2 days and then suddenly blows up is not particularly rare.

Somehow I doubt that the "not trivial" problem of finding correct settings before starting suddenly becomes "trivial" when you throw dependent types at it.

> (which isn't the same thing as a problem domain)

Yeah bud, I'm aware. I meant what I said. You're supposing that it's trivial to determine what the domain of the function in question is when the original post explicitly said otherwise. This is a falsehood about the problem domain.

> At worst it's to leave the thread of execution

Leave the thread and then do what? The stated solution to the problem, according to the original post, is to restart the program with new, manually-tweaked parameters, or to straight-up modify the code:

> The solution to that is typically changing to a numerical scheme more suited for your problem or tweaking the current scheme's parameters