> thus putting a relatively bigger emphasis on code density.
> choosing "RISC purity" over code density is arguably the wrong choice
You appear to be under the incorrect impression that CISC code is more dense than RISC code.
This seems to be a common belief, apparently based on the idea that a highly variable-length ISA can be Huffman encoded, with more common operations being given shorter opcodes. This turns out not to be the case with any common CISC ISA. Rather, the simpler, less flexible operations are given shorter opcodes, and that is a very different thing. A lot of the one-byte instructions in x86 are wasted on operations that are seldom or never used and that could, even in 1976, have safely been hidden in some secondary code page.
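To make the difference concrete, here is a toy sketch in Python. The instruction mix and the opcode names are invented for illustration (they are not measurements of x86 or any real ISA), and `assign_by_frequency` and `average_length` are hypothetical helpers. It only shows why "short opcodes for the simple operations" is not the same as "short opcodes for the common operations".

```python
# Hypothetical dynamic instruction mix in percent. These numbers are
# made up for illustration and do not describe any real ISA or workload.
FREQ = {
    "load": 30,
    "add": 25,
    "branch": 20,
    "store": 15,
    "bcd_adjust": 5,
    "table_lookup": 5,
}


def assign_by_frequency(freq, short_slots, short=1, long=2):
    """Give the `short_slots` most frequent ops the short encoding (in bytes)."""
    ranked = sorted(freq, key=freq.get, reverse=True)
    return {op: (short if i < short_slots else long) for i, op in enumerate(ranked)}


def average_length(freq, length):
    """Frequency-weighted average instruction length in bytes."""
    total = sum(freq.values())
    return sum(freq[op] * length[op] for op in freq) / total


# Huffman-like: the two short slots go to the most common operations.
huffman_like = assign_by_frequency(FREQ, short_slots=2)

# Simplicity-first: the two short slots go to operand-free but rare operations.
simplicity_first = {op: 2 for op in FREQ}
simplicity_first["bcd_adjust"] = 1
simplicity_first["table_lookup"] = 1

if __name__ == "__main__":
    print("huffman-like     :", average_length(FREQ, huffman_like))
    print("simplicity-first :", average_length(FREQ, simplicity_first))
```

With the same number of short encodings available, the frequency-ordered assignment gives a noticeably lower average length than the simplicity-ordered one under this made-up mix.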
The densest common 32 bit ISAs are Arm Thumb2 and RISC-V with the C extension. Both of them have two instruction lengths, 2 bytes and 4 bytes, as did many historical RISC or RISC-like machines including CDC6600 (15 bits and 30 bits), Cray 1, the first version of IBM 801, and Berkeley RISC-II.
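Having two lengths costs very little in the decoder, because in both ISAs the length is determined by a few bits of the first halfword. A minimal Python sketch of that check, as I understand the encodings; the function names are my own, and the RISC-V version deliberately ignores the reserved longer-than-32-bit formats:

```python
def rvc_length(first_halfword: int) -> int:
    """Length in bytes of a RISC-V instruction, given its first 16 bits.

    Only the standard 16-bit and 32-bit encodings are handled here.
    """
    if first_halfword & 0b11 != 0b11:
        return 2  # low two bits other than 11: compressed (C extension)
    if (first_halfword >> 2) & 0b111 != 0b111:
        return 4  # low bits 11, bits [4:2] other than 111: 32-bit
    raise ValueError("encoding longer than 32 bits, not handled in this sketch")


def thumb2_length(first_halfword: int) -> int:
    """Length in bytes of a Thumb-2 instruction, given its first halfword."""
    top5 = (first_halfword >> 11) & 0b11111
    # 0b11101, 0b11110 and 0b11111 introduce a 32-bit instruction.
    return 4 if top5 in (0b11101, 0b11110, 0b11111) else 2


if __name__ == "__main__":
    print(rvc_length(0x0001))     # c.nop
    print(rvc_length(0x0013))     # low half of addi x0, x0, 0
    print(thumb2_length(0xBF00))  # 16-bit NOP
    print(thumb2_length(0xF3AF))  # first half of NOP.W
```

Compare that with x86, where the length depends on prefixes, the opcode, ModRM, SIB, and the displacement and immediate sizes.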
The idea that RISC means only a single instruction length is historically true only for ISAs introduced in the brief period between 1985 (Arm, SPARC, MIPS) and 1992 (Alpha) out of the 60 year span of RISC-like design (CDC6600 1964, the fastest supercomputer of its time). And, as an outlier, Arm64 (2011), which I think will come to be recognised as a mistake -- they thought Amd64 was the competition they had to match for code density (and they did) but failed to anticipate RISC-V.
In 64 bit, RISC-V is by far the densest ISA.
> Contemporary high performance RISC architectures (ARMv9, say) are very un-RISC in the sense of having a zillion different instructions, somewhat complex addressing modes, and so forth.
Yes, ARMv8/9-A is complex. However, there is no evidence that it is higher performance than RISC-V in comparable µarches and process nodes. On the contrary, other than their lack of SIMD, SiFive's U74 and P550 are faster than Arm's A53/A55 and A72, respectively. This appears to continue for more recent cores, but we don't yet have purchasable hardware to prove it. That should change in 2026, with at least Tenstorrent shipping RISC-V equivalent to Apple's M1.
ARM64 has a trick up its sleeve: many instructions that would be longer on other architectures are instead split into easily recognisable pairs on ARM64. This allows simple implementations to pretend it's fixed length while more complex ones can pretend it's variable length. SVE takes this one step further with MOVPRFX, which can be placed before almost all SVE instructions to supply masking and a third operand.
This trick is not getting ARM very far, as evidenced by its abysmal code density.
To be fair, it's a lot better than Power(PC), MIPS, SPARC, Alpha, PA-RISC, Itanium, Elbrus ...