The post asks the following question, which brought out a visceral and immediate answer from me:
‘When two implementations support the same notional feature with slightly differing semantics, should the committee use undefined behavior to resolve the conflict so no users have to change code, or should the committee push for well-defined behavior despite knowing it will break some users?’
I lurched towards the second option, well-defined behavior. And I would answer that way not merely despite the known breakage: I would say it is the correct choice even if it breaks large amounts of code in some percentage of active code bases. I have a hard time imagining who would choose to accept more undefined behavior for new semantic constructs.
While I may not accept the author's stance and conclusions on the question of 'Trust the Programmer', I do agree with his full-throated argument in favor of making more semantic constructs 'well-' and 'implementation-'defined.
I may even go a step further and take the ultimate position: if eliminating an instance of undefined behavior merely degrades performance by some percentage, it should be eliminated wherever possible. At a minimum, the goal should be to move every semantic construct to, at the most liberal, implementation-defined behavior. That would force compiler writers to specify and particularize exactly which syntactic and semantic constructions they exploit to generate performance gains, and give developers the ability to decide whether they can adhere to the implementation's semantic guarantees.
Although Aaron phrased it as an aside, I think this is a critical point about how "Trust the Programmer" should be construed going forward:
> I think it’s perfectly reasonable to expect “trust me, I know what I’m doing” users to have to step outside of the language to accomplish their goals and use facilities like inline assembly or implementation extensions.
This is an extremely good point: if you can "trust the programmer" to write C code that exploits undefined behavior or other fundamentally unsafe compiler-dependent features, then you should be able to trust them to write inline assembly or use a compiler extension to accomplish the same goal. If they can't, then they shouldn't be mucking around with undefined behavior in C: they might understand the compiler's behavior at a "ChatGPT level", as a set of ad hoc if-A-then-B's, but I wouldn't trust them to make serious decisions about state and security.
> ‘When two implementations support the same notional feature with slightly differing semantics, should the committee use undefined behavior to resolve the conflict so no users have to change code, or should the committee push for well-defined behavior despite knowing it will break some users?’
That's a false dichotomy. You could also use implementation-defined behaviour. Or specify that any of n enumerated behaviours is valid.
'Undefined behaviour' is too big of a sledgehammer.
This works well in some ecosystems, but the C ecosystem has a "don't pessimize my weird implementation" goal. A bunch of different platforms wrap integers at different widths (or do different things entirely). Say you want to define it. Great! You probably pick the most common case. But now there is some weird embedded system that doesn't provide this wrapping behavior in hardware, so the compiler has to emit extra code to implement it, and now all integer operations are slower on that target. Those devs are now mad!
To get the "just define it all" approach you need people on weird ecosystems to be okay with paying for it. You think it is worth paying for (and I do too, frankly). But a significant portion of both the C and C++ communities don't - and that makes this very very hard.
One important point is that it’s implementers who vote on the C standard. And they may not want to vote for a version that breaks compatibility for their users (but not for users of other, competing implementations). This is one reason why certain semantics remain undefined or implementation-defined.
Yeah. I think it's disingenuous to talk about breaking things for users, as though people are forced to use a newer language standard.
C99 "broke" implicit declarations, but few if any people were forced to use C99, and it never became the default in, say, GCC (the default jumped from gnu89 straight to gnu11 in GCC 5, released in 2015).
Agreed. The biggest concern with this point of view is that the developer then has to ensure the older compiler version stays functional as OSes and execution environments progress. That may be a reach, but I think one of the heavy imperatives of moving to a more defined standard for C's semantics would be forcing compiler implementations to be very clear about which standard they support and what guarantees they make about support timeframes for that standard.
The above is because I would hope that one result of pushing a nearly fully 'defined' (well- or implementation-) standard would be a strong interconnectedness and compositionality among all semantic constructions. Compiler implementers then could not fall back on a mish-mash of partial standards compliance and claim undefined behavior lets them simply omit certain semantic constructs. I would like to think that having the language be very clearly defined would all but require complete adoption of a given standard for a compiler to be compliant.
I am aware that such a radical realignment of C's structure is nearly impossible, but if C cannot or will not do it, an incredibly similar language may be able to piggyback its way into common use. This might also satisfy some of the arguments/positions in TFA concerning 'Trust the Programmer': the superseding language could 'unsafe/non-conforming' out to C directly, in C syntax, whenever a developer needs or wants a non-conforming semantic construction.