That arithmetic shift right implementation is also what I came up with for a video game fantasy architecture (16-bit registers) that only has a logical shift right:
; asr rd, rs1, rs2 ; rd = signed(rs1) >> rs2
and rt, rs1, 0x8000 ; isolate sign bit
lsr rt, rt, rs2 ; shift sign bit to final position
neg rt, rt ; sign-extended part of final result
lsr rd, rs1, rs2 ; base part of final result
or rd, rd, rt ; combine both parts
Broken down this way, it might be easier to follow for anyone who didn't understand the article's one-liner.
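For anyone who wants to check the sequence without the fantasy hardware, here is a rough Python model of the five instructions. The helper names (`lsr`, `asr_via_lsr`, `asr_reference`) are made up for this sketch, registers are modeled as plain integers masked to 16 bits, and it assumes shift amounts of 0 through 15, since how the architecture treats larger shifts isn't stated above.

```python
MASK = 0xFFFF  # assumption: 16-bit registers modeled as masked Python ints


def lsr(x, n):
    """Logical shift right on a 16-bit value (zeros shifted in from the top)."""
    return (x & MASK) >> n


def asr_via_lsr(rs1, rs2):
    """Mirror the five-instruction sequence one step at a time."""
    rt = rs1 & 0x8000      # and: isolate sign bit
    rt = lsr(rt, rs2)      # lsr: shift sign bit to its final position
    rt = (-rt) & MASK      # neg: two's complement sets that bit and every bit above it
    rd = lsr(rs1, rs2)     # lsr: base part of the result
    return rd | rt         # or: combine both parts


def asr_reference(rs1, rs2):
    """Reference: reinterpret as signed, use Python's arithmetic >>, wrap to 16 bits."""
    signed = rs1 - 0x10000 if rs1 & 0x8000 else rs1
    return (signed >> rs2) & MASK


# -16 >> 2 == -4, i.e. 0xFFF0 -> 0xFFFC
print(hex(asr_via_lsr(0xFFF0, 2)))  # 0xfffc
```

The trick is in the `neg` step: negating a value with a single bit set yields that bit plus all the bits above it, which is exactly the sign extension needed. When the sign bit is clear, `rt` is 0 and stays 0 after negation, so the result is just the logical shift. In this Python model a shift of 16 or more would give 0 for a negative input rather than 0xFFFF, so that range is left out of the comparison.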