It took me a while to understand that it uses 128-bit numbers internally; the `>> 64` in the pseudocode was super confusing until I saw the C++ code.
Neat code though!
Not really. It looks like that in the C code, but in the generated machine code it'll typically be a single multiply-high instruction (`MULH`-style, e.g. `MULHU` on RISC-V or `UMULH` on AArch64) giving only the upper 64 bits of the result, so no shift is needed.
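For anyone else tripped up by the `>> 64`, here is a minimal sketch of what that step usually looks like in C++. The name `mul_high` is mine, not from the original code, and it assumes a compiler with the `unsigned __int128` extension (GCC and Clang have it; MSVC does not):

```cpp
#include <cstdint>

// Hypothetical helper: returns the upper 64 bits of the full 128-bit
// product a * b. Relies on the non-standard unsigned __int128 type.
constexpr std::uint64_t mul_high(std::uint64_t a, std::uint64_t b) {
    // Widen one operand so the multiplication is done in 128 bits,
    // then keep only the top half of the result.
    return static_cast<std::uint64_t>(
        (static_cast<unsigned __int128>(a) * b) >> 64);
}
```

The 128-bit type only exists at the source level. On 64-bit targets, optimizing compilers generally recognize this pattern and emit a plain 64x64 multiply that produces the high half directly, so there is no real 128-bit arithmetic or shift in the output.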