Hacker News

> optimizer to insert totally random code

What are you even saying? What is your definition of "random code"? FYI, UB is exactly (one of) the places where an optimizer can insert optimized code.

arnsholt 3 months ago

To take an example from the post: in some cases a value was computed that could overflow, but it was not used because of a later overflow check. I think the optimizer would be fully within its rights to delete the code inside the overflow check, because the computation implicitly asserts that it won't overflow (since overflow is undefined). I think this is a more or less useful way of thinking about UB: any operation you put in your program implicitly asserts that the values are such that UB won't happen. For example, dereferencing a pointer implicitly asserts that it is not NULL, because dereferencing NULL is UB, so anything downstream of that dereference which checks for NULL can be deleted.
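A minimal sketch of that "implicit assertion" idea, in C++. The function names are made up for illustration and are not taken from the post; whether a given compiler actually deletes the late check depends on the compiler and optimization level.

```cpp
#include <climits>

// Dereference first, check afterwards. Reading *p is undefined for a null
// pointer, so an optimizer may assume p != nullptr from the first line on
// and treat the later check as dead code.
int load_then_check(const int* p) {
    int value = *p;       // implicitly asserts p != nullptr
    if (p == nullptr) {   // may be removed by the optimizer
        return -1;
    }
    return value;
}

// Check first, dereference afterwards. Nothing before the test asserts
// anything about p, so the check has to stay.
int check_then_load(const int* p) {
    if (p == nullptr) {
        return -1;
    }
    return *p;
}

// Overflow check that never performs the overflowing addition itself,
// so there is no UB for the optimizer to reason from.
bool add_overflows(int a, int b) {
    if (b > 0) {
        return a > INT_MAX - b;   // INT_MAX - b cannot overflow when b > 0
    }
    return a < INT_MIN - b;       // INT_MIN - b cannot overflow when b <= 0
}
```

Only `check_then_load` is safe to call with a null pointer; calling `load_then_check(nullptr)` is exactly the UB being discussed.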

flohofwoe 3 months ago

Unfortunately UB is an umbrella term for all sorts of things, and some of those can be very harmful/unexpected, while others are (currently) harmless - but that may change in new compiler versions.

The typical optimization showcase (better code generation for signed integer loop counters) only works as long as the signed integer overflow, which is undefined behaviour, doesn't actually happen (i.e. the compiler is free to assume that the loop counter won't overflow). But when the signed integer overflow does happen, all bets are off as to what happens to the control flow, while that same signed integer overflow in another place may simply wrap around.
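A rough sketch of the loop-counter case, with hypothetical function names. The unsigned function is included only as the well-defined contrast: unsigned arithmetic wraps by definition, whereas what a signed overflow does in practice is not something this code can show without invoking UB.

```cpp
#include <climits>

// Signed counter with an inclusive bound. Because signed overflow is UB,
// the compiler may assume ++i never wraps, and therefore that the loop
// runs exactly last + 1 times and terminates. If last == INT_MAX, that
// assumption is violated and the behaviour is undefined.
long long sum_through(const int* data, int last) {
    long long total = 0;
    for (int i = 0; i <= last; ++i) {
        total += data[i];
    }
    return total;
}

// Unsigned arithmetic is defined to wrap modulo 2^N, so this is not UB
// and the compiler cannot assume the wrap never happens.
unsigned wrap_increment(unsigned u) {
    return u + 1u;
}
```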

Another similar example is to deliberately 'inject' UB by putting a `std::unreachable()` call into the default case of a switch statement. This lets the compiler omit the range check before accessing the switch-case jump table. But if the switch variable has a value that isn't handled by any case branch, the jump table access may be out-of-bounds and there will be a jump to a random location.
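A small sketch of that pattern. `std::unreachable()` is C++23, so this version assumes a stand-in macro (`HYPOTHETICAL_UNREACHABLE`, a made-up name) built on the GCC/Clang and MSVC intrinsics; the opcode table is invented for illustration, and whether a jump table is actually emitted is up to the compiler.

```cpp
#include <cstdlib>

// Stand-in for C++23 std::unreachable(); the macro name is hypothetical.
#if defined(__GNUC__) || defined(__clang__)
#define HYPOTHETICAL_UNREACHABLE() __builtin_unreachable()
#elif defined(_MSC_VER)
#define HYPOTHETICAL_UNREACHABLE() __assume(0)
#else
#define HYPOTHETICAL_UNREACHABLE() std::abort()
#endif

// Unchecked variant: the default case promises the compiler that opcode is
// always in 0..5, so it may drop the range check in front of the dispatch.
// Passing any other value is undefined behaviour.
int opcode_length_unchecked(int opcode) {
    switch (opcode) {
        case 0: return 1;
        case 1: return 1;
        case 2: return 2;
        case 3: return 2;
        case 4: return 3;
        case 5: return 4;
        default: HYPOTHETICAL_UNREACHABLE();
    }
}

// Checked variant: out-of-range values get a defined result, at the cost
// of keeping the range check.
int opcode_length_checked(int opcode) {
    switch (opcode) {
        case 0: return 1;
        case 1: return 1;
        case 2: return 2;
        case 3: return 2;
        case 4: return 3;
        case 5: return 4;
        default: return -1;
    }
}
```

The unchecked variant is only appropriate when the caller can guarantee the range, for example because the value was validated earlier.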

In other situations the compiler might even be able to detect at compile time that the UB is triggered and simply generate broken code (usually by optimizing away some critical part), or, if you're lucky, the compiler inserts a trap instruction (such as `ud2` on x86) which crashes the process.

moefh 3 months ago

Not OP, but here's an example of "random code" inserted by the compiler[1]: note the assembly instruction "ud2" ("invalid opcode exception" in x86 land) instead of "ret" in not_ok().

You might think this code would be fine if address 0 were mapped to RAM, but both gcc and clang know it's undefined behavior to use the null pointer like that, so they add "random code" that forces a processor exception.

[1] https://godbolt.org/z/sK55YsGz1

anamexis 3 months ago

That doesn't sound very random to me!