> It’d be reasonable to think that this will have a runtime cost, however it doesn’t. The reason is that the Rust standard library has a nice optimisation: when we consume a Vec and collect the result into a new Vec, in many circumstances the heap allocation of the original Vec can be reused. This applies in this case. But even with the heap allocation being reused, we’re still looping over all the elements to transform them, right? Because the in-memory representation of an AtomicSymbolId is identical to that of a SymbolId, our loop becomes a no-op and is optimised away.
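The pattern under discussion looks roughly like this (AtomicSymbolId and SymbolId here are simplified, hypothetical stand-ins for the post's types, not wild's actual definitions):

```rust
use std::sync::atomic::AtomicU32;

// Hypothetical stand-ins for the types named in the post.
#[repr(transparent)]
struct AtomicSymbolId(AtomicU32);
#[repr(transparent)]
struct SymbolId(u32);

// Consume the Vec and collect into a new one. Because the two element types
// have identical size and alignment, std's in-place collect specialisation
// can reuse the heap allocation, and the per-element conversion can be
// optimised down to nothing.
fn resolve_all(ids: Vec<AtomicSymbolId>) -> Vec<SymbolId> {
    ids.into_iter()
        .map(|id| SymbolId(id.0.into_inner()))
        .collect()
}
```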
Those optimisations that this code relies on are literally undefined behaviour. The compiler doesn't guarantee it's gonna apply those optimisations. So your code might suddenly become super slow and you'll have to go digging in to see why. Is this undefined behaviour better than just having an unsafe block? I'm not so sure. The unsafe code will be easier to read and you won't need any comments or a blog to explain why we're doing voodoo stuff because the logic of the code will explain its intentions.
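For comparison, one shape the unsafe alternative could take (again using hypothetical stand-in types, not wild's actual code) is to reinterpret the buffer rather than map element by element:

```rust
use std::mem::ManuallyDrop;
use std::sync::atomic::AtomicU32;

// Hypothetical stand-ins, as above.
#[repr(transparent)]
struct AtomicSymbolId(AtomicU32);
#[repr(transparent)]
struct SymbolId(u32);

fn into_symbol_ids(v: Vec<AtomicSymbolId>) -> Vec<SymbolId> {
    // Guard the layout assumption the pointer cast depends on.
    assert_eq!(std::mem::size_of::<AtomicSymbolId>(), std::mem::size_of::<SymbolId>());
    assert_eq!(std::mem::align_of::<AtomicSymbolId>(), std::mem::align_of::<SymbolId>());

    let mut v = ManuallyDrop::new(v);
    let (ptr, len, cap) = (v.as_mut_ptr(), v.len(), v.capacity());
    // SAFETY: the allocation came from a Vec, the element types share size and
    // alignment, and the original Vec is never dropped (ManuallyDrop), so
    // ownership of the buffer moves into the new Vec exactly once.
    unsafe { Vec::from_raw_parts(ptr.cast::<SymbolId>(), len, cap) }
}
```

This trades reliance on an optimisation for an explicit safety argument; whether that is clearer is exactly the judgement call being debated here.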
> Those optimisations that this code relies on are literally undefined behaviour.
You cannot get undefined behavior in Rust without an unsafe block.
> The compiler doesn't guarantee it's gonna apply those optimisations.
This is a different concept than UB.
However, for the "heap allocation can be re-used", Rust does talk about this: https://doc.rust-lang.org/stable/std/vec/struct.Vec.html#imp...
It cannot guarantee it for arbitrary iterators, but the map().collect() re-use is well known, and the machinery is there to do this, so while other implementations may not, rustc always will.
Basically, it is implementation-defined behavior. (If it were C/C++ it would be 'unspecified behavior' because rustc does not document exactly when it does this, but this is a very fine nitpick and not language Rust currently uses, though I'd argue it should.)
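A quick way to observe (but not rely on) the reuse that rustc/std currently performs:

```rust
fn main() {
    let v: Vec<u64> = (0..1024).collect();
    let ptr_before = v.as_ptr();

    // Same element size and alignment on both sides of the map, so the
    // in-place collect specialisation can reuse the original allocation.
    let w: Vec<u64> = v.into_iter().map(|x| x + 1).collect();

    // Prints `true` on current rustc/std; a different implementation could
    // legitimately print `false`, and the program would still be correct.
    println!("allocation reused: {}", ptr_before == w.as_ptr());
}
```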
> So your code might suddenly become super slow and you'll have to go digging in to see why.
That's why wild has performance tests, to ensure that if a change breaks rustc's ability to optimize, it'll be noticed, and therefore fixed.
> That's why wild has performance tests, to ensure that if a change breaks rustc's ability to optimize, it'll be noticed, and therefore fixed.
But benchmarks won't tell us which optimisation suddenly stopped working. This looks so similar to the argument against UB to me. Something breaks, but you don't know what, where, and why.
It is true that it won't tell you, for sure. It's just that UB means something very specific when discussing language semantics.
I see. These optimisations might not be UB as understood in compiler lingo, but they are a kind of "undefined behaviour", in the sense that anything could happen. And honestly, the problems they might cause don't look that different from those caused by UB (in the compiler-lingo sense). Not to mention, using unsafe for writing optimised code will generate same-ish code in both debug and release mode, so the DX will be better too.
As an example, parts of the C++ standard library (though none of the core language, I believe) are covered by complexity requirements, yet implementations can still vary widely. For instance, std::sort needs to be linearithmic, but someone could still ship a very slow version without it being UB; even a quadratic implementation wouldn't be UB, it just wouldn't be standards-conforming.
UB is really about the observable behavior of the abstract machine, which is limited to reads/writes to volatile data and I/O library calls [1].
[1] http://open-std.org/jtc1/sc22/open/n2356/intro.html
Edit: to clarify the example
I understand why Alexander Stepanov thought the complexity requirements were a good idea, but I am not convinced that in practice this delivers value. Worse, I don't see much sign C++ programmers care.
You mentioned the C++ unstable sort, std::sort, in particular. Famously, although C++11 finally guarantees O(n log n) worst-case complexity, the libc++ standard library didn't conform: it shipped a worst case of O(n²) instead.
The bug report, saying essentially "Hey, your sort is defective", was opened by Orson Peters in 2014. It took until 2021 to fix it.
The optimization not getting applied doesn't mean that "anything could happen". Your code would just run slower. The result of this computation would still be correct and would match what you would expect to happen. This is the opposite of undefined behaviour, where the result is literally undefined, and, in particular, can be garbage.
You're misusing the term "undefined behavior". You can certainly say that these kinds of performance optimizations aren't guaranteed.