> In 64-bit code there is very little reason at all to be using bits 15:8 of a longer register.
I disagree: BSWAP only exists as BSWAP r32 (and, by extension, BSWAP r64 in 64-bit mode): https://www.felixcloutier.com/x86/bswap
No BSWAP r16 exists. Why? In 32-bit mode it was not needed, because you could simply use
XCHG r/m8, r8
with, say, cl and ch (to swap the endianness of cx).
In 64-bit mode, XCHG can thus swap the endianness of a 16-bit value in one instruction only for the "old" registers ax, cx, dx, bx, since only those have an addressable high byte (ah, ch, dh, bh). If you want to swap the 16-bit part of one of the "new" registers, you at least have to do a 32-bit logical right shift by 16 (SHR r32, 16) after a BSWAP r32 (EDIT: jstarks pointed out that you could also use ROL r/m16, 8 to do this in one instruction on x86-64). By the way: this solution has a drawback compared to BSWAP alone: BSWAP preserves the flags register, while SHR does not.
What about ROL r/m16, 8?
This would indeed work (and is likely the better solution), but unlike BSWAP and XCHG, it also changes flags.
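
To make the equivalence discussed above concrete, here is a minimal C sketch that models the two sequences on plain integers. The function names (`bswap32`, `swap16_bswap_shr`, `swap16_rol8`, `check_swap16_equivalence`) are made up for illustration, and the model only covers the data result: it says nothing about flags, which is exactly where the real instructions differ.

```c
#include <assert.h>
#include <stdint.h>

/* Models BSWAP r32: reverse the four bytes of a 32-bit value. */
uint32_t bswap32(uint32_t x) {
    return (x >> 24)
         | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u)
         | (x << 24);
}

/* Models BSWAP r32 followed by SHR r32, 16 on a zero-extended 16-bit value. */
uint16_t swap16_bswap_shr(uint16_t v) {
    return (uint16_t)(bswap32(v) >> 16);
}

/* Models ROL r16, 8 (same data result as XCHG cl, ch applied to cx). */
uint16_t swap16_rol8(uint16_t v) {
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Checks that both sequences agree for every possible 16-bit input. */
void check_swap16_equivalence(void) {
    for (uint32_t v = 0; v <= 0xFFFFu; v++) {
        assert(swap16_bswap_shr((uint16_t)v) == swap16_rol8((uint16_t)v));
    }
}
```

Both helpers map 0x1234 to 0x3412, so the choice between the sequences comes down to instruction count and whether the flags need to survive.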